To support Chinese voice interaction, the following components need to be replaced and reconfigured:
- Speech recognition: Replace the Whisper model with a version that supports Chinese, i.e. a multilingual checkpoint such as large-v2 rather than an English-only `.en` variant. Reinstall it and specify the model path.
- Speech synthesis: Replace Kokoro TTS with a TTS option that supports Chinese, such as Edge-TTS (an open-source client for Microsoft's online voices) or VITS. The TTS call interface in the code has to be modified accordingly.
- Language model adaptation: If you need responses in Chinese, either connect to a cloud API that supports Chinese (e.g. GPT-3.5 Turbo) or load a Chinese fine-tuned version of the gpt-oss model locally.
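The first two replacements can be sketched roughly as follows. This is a minimal illustration, not the game's actual code: it assumes the `openai-whisper` and `edge-tts` packages are installed, and the names `ChineseVoiceConfig`, `transcribe`, and `synthesize` as well as the chosen voice `zh-CN-XiaoxiaoNeural` are picked here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ChineseVoiceConfig:
    """Hypothetical settings container; not part of the original project."""
    whisper_model: str = "large-v2"          # multilingual checkpoint, not a ".en" one
    language: str = "zh"                     # language hint passed to Whisper
    tts_voice: str = "zh-CN-XiaoxiaoNeural"  # assumed Edge-TTS Mandarin voice


def transcribe(audio_path: str, cfg: ChineseVoiceConfig) -> str:
    """Transcribe a Chinese audio file with a multilingual Whisper model."""
    import whisper  # imported lazily so the module loads without the package

    model = whisper.load_model(cfg.whisper_model)
    result = model.transcribe(audio_path, language=cfg.language)
    return result["text"]


async def synthesize(text: str, out_path: str, cfg: ChineseVoiceConfig) -> None:
    """Write Chinese speech for `text` to `out_path` using Edge-TTS (needs network)."""
    import edge_tts  # imported lazily for the same reason

    communicate = edge_tts.Communicate(text, cfg.tts_voice)
    await communicate.save(out_path)
```

A call such as `asyncio.run(synthesize("你好，指挥官", "reply.mp3", ChineseVoiceConfig()))` would then stand in for the existing Kokoro call. Note that Edge-TTS contacts an online service, so a fully local setup would need VITS or a similar offline model instead.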
Note: Test the data-transfer compatibility between the components and adjust parameters such as the audio sample rate so that the pipeline works end to end.
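As an illustration of the sample-rate adjustment mentioned in the note, the sketch below converts mono audio from one rate to another by linear interpolation. Whisper works on 16 kHz audio; the 48 kHz capture rate in the usage example and the function name `resample_linear` are assumptions made here. Linear interpolation applies no low-pass filter, so for real use a dedicated resampler from an audio library is likely the better choice.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono sequence of samples using linear interpolation.

    Illustrative only: no anti-aliasing filter is applied.
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) < 2:
        return [float(s) for s in samples]

    n_out = int(len(samples) * dst_rate / src_rate)  # output length
    step = src_rate / dst_rate                       # source samples per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = int(pos)
        if lo >= last:
            # Past the final interval: hold the last sample.
            out.append(float(samples[last]))
            continue
        frac = pos - lo
        # Weighted average of the two neighbouring source samples.
        out.append(samples[lo] * (1 - frac) + samples[lo + 1] * frac)
    return out


# Usage: one second of (assumed) 48 kHz microphone audio down to 16 kHz.
one_second = [0.0] * 48000
print(len(resample_linear(one_second, 48000, 16000)))
```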
This answer comes from the article "gpt-oss-space-game: a local voice-interactive space game built using open-source AI models".































