Mobile Adaptation Program
Key technical paths for implementing speech recognition on mobile devices:
- Model streamlining: choose a compact checkpoint such as whisper-small
from transformers import pipeline
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
- ONNX conversion: export to a mobile-friendly format
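from pathlib import Path
from transformers import convert_graph_to_onnx
# Legacy exporter (deprecated in later Transformers releases); model_name and output_path are placeholders
convert_graph_to_onnx.convert(framework="pt", model=model_name, output=Path(output_path), opset=11)
- Streaming: configure the chunk_length parameter for Kyutai-STT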
asr = pipeline(..., chunk_length_s=30)
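To make the chunking idea concrete, here is a minimal sketch of how long audio can be cut into overlapping windows of chunk_length_s seconds. It is a simplified illustration, not the Transformers implementation: the function name chunk_bounds, the 16 kHz sample rate, and the 5-second stride are assumptions chosen for the example.

```python
def chunk_bounds(num_samples, sample_rate=16_000, chunk_length_s=30.0, stride_length_s=5.0):
    """Return (start, end) sample indices of overlapping windows covering the audio.

    Hypothetical helper for illustration; all default values are assumptions.
    """
    chunk_len = int(chunk_length_s * sample_rate)
    stride = int(stride_length_s * sample_rate)
    # Consecutive windows overlap by 2 * stride samples
    # (one stride of context on each side of a window).
    step = chunk_len - 2 * stride
    if step <= 0:
        raise ValueError("chunk_length_s must be greater than 2 * stride_length_s")

    bounds = []
    start = 0
    while True:
        end = min(start + chunk_len, num_samples)
        bounds.append((start, end))
        if end >= num_samples:
            break
        start += step
    return bounds


if __name__ == "__main__":
    # 70 seconds of 16 kHz audio
    for start, end in chunk_bounds(70 * 16_000):
        print(f"{start / 16_000:.0f}s - {end / 16_000:.0f}s")
```

For 70 seconds of 16 kHz audio this yields three windows: 0 to 30 s, 20 to 50 s, and 40 to 70 s. The overlap gives the model context at window edges, at the cost of transcribing some audio twice.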
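Practical result: a quantized whisper-small model can achieve real-time transcription at about 200 ms latency on iOS devices, with a model size of only 150 MB.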
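As a rough sanity check on model size, the back-of-envelope sketch below estimates weight storage at different precisions. It assumes roughly 244 million parameters for whisper-small and ignores file-format overhead and any layers left unquantized, so real file sizes will differ.

```python
PARAMS = 244_000_000  # assumed parameter count for whisper-small


def approx_size_mb(num_params, bits_per_weight):
    """Approximate weight storage in megabytes (10**6 bytes), ignoring metadata."""
    return num_params * bits_per_weight / 8 / 1e6


if __name__ == "__main__":
    for bits in (32, 16, 8, 4):
        print(f"{bits:>2}-bit: ~{approx_size_mb(PARAMS, bits):.0f} MB")
```

Under these assumptions, a 150 MB file falls between the 4-bit and 8-bit estimates, which suggests quantization below 8 bits or a mixed-precision scheme.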