How can Transformers be used for speech recognition on mobile devices?

2025-08-23

Mobile Adaptation Approach

Key techniques for implementing speech recognition on mobile:

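Model slimming: choose a compact checkpoint such as whisper-small. Note that whisper-small is a smaller Whisper variant rather than a distilled model; distilled alternatives are published under the Distil-Whisper name.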

from transformers import pipeline
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")

ONNX conversion: export the model to a mobile-friendly format

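# transformers.convert_graph_to_onnx is deprecated; this uses Optimum instead (pip install "optimum[onnxruntime]")
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
model = ORTModelForSpeechSeq2Seq.from_pretrained(model_name, export=True)  # export to ONNX while loading
model.save_pretrained(output_path)  # writes the ONNX files to output_path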

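Streaming: process audio in fixed-length chunks by setting the chunk_length_s parameter (this parameter belongs to the Transformers ASR pipeline rather than to Kyutai-STT specifically)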
```
asr = pipeline(..., chunk_length_s=30)
```
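To make the chunking idea concrete, here is a minimal sketch of how a long recording can be split into overlapping windows. It illustrates the general technique and is not the pipeline's actual implementation. The function name chunk_spans is hypothetical, and the 5-second stride is an assumption chosen to match the pipeline's documented default of chunk_length_s / 6 on each side.

```python
def chunk_spans(num_samples, sample_rate=16_000, chunk_length_s=30.0, stride_length_s=5.0):
    """Return (start, end) sample indices of overlapping windows covering the audio.

    Illustrative only: each window overlaps its neighbour so that words cut at a
    window boundary still appear whole in one of the two windows.
    """
    chunk = int(chunk_length_s * sample_rate)    # window size in samples
    stride = int(stride_length_s * sample_rate)  # overlap kept on each side
    step = chunk - 2 * stride                    # how far each new window advances
    if step <= 0:
        raise ValueError("chunk_length_s must exceed twice stride_length_s")

    spans = []
    start = 0
    while True:
        end = min(start + chunk, num_samples)
        spans.append((start, end))
        if end == num_samples:  # last window reaches the end of the audio
            break
        start += step
    return spans


# A 70-second recording at 16 kHz, expressed in seconds for readability.
for start, end in chunk_spans(70 * 16_000):
    print(f"{start / 16_000:.0f}s -> {end / 16_000:.0f}s")
```

With these assumed settings, each window advances by 20 seconds, so neighbouring windows share 10 seconds of audio. Shorter chunks lower the latency per window at the cost of less context for the model.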

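In practice: a quantized whisper-small model can reportedly deliver real-time transcription on iOS devices with about 200 ms of latency, at a model size of only 150 MB.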
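As a rough sanity check on model size, the following sketch estimates the weight footprint at different precisions. It assumes the roughly 244 million parameters listed for whisper-small in OpenAI's Whisper model table, and it ignores file-format overhead and any layers left unquantized. The helper estimated_size_mb is hypothetical.

```python
def estimated_size_mb(num_params, bits_per_weight):
    """Approximate weight storage in megabytes (1 MB = 1e6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


# Assumption: parameter count taken from OpenAI's published Whisper model table.
WHISPER_SMALL_PARAMS = 244_000_000

for bits in (32, 16, 8, 4):
    size = estimated_size_mb(WHISPER_SMALL_PARAMS, bits)
    print(f"{bits:>2}-bit weights: {size:.0f} MB")
```

By this estimate, a file near 150 MB corresponds to about 5 bits per weight on average, which would be consistent with a scheme that mixes 4-bit and 8-bit layers. The answer does not state which quantization method was used.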