Supported Voice Service Types
- Open-source models: e.g. Zyphra/Zonos-v0.1-hybrid (requires local GPU resources)
- Commercial APIs: OpenAI-compatible services (e.g. kokoros.transformrs.org)
- Third-party platforms: DeepInfra, etc. (requires API key)
Configuration method
- Key setting: export DEEPINFRA_KEY="your_api_key"
- Service designation:
  - Base command: --provider=openai-compatible(kokoros.transformrs.org)
  - Model selection: --model=tts-1
  - Voice parameter: --voice=bm_lewis
- Audio output: the format can be specified, e.g. --audio-format=wav
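The settings above can be combined as in the following minimal sketch. It only assembles and prints the flags rather than invoking any tool, since the exact command name and input arguments are not given here. The variable names (PROVIDER, MODEL, VOICE, AUDIO_FORMAT, TTS_FLAGS) are illustrative, the key value is a placeholder, and `wav` as the audio format is an assumption. Note that the provider value contains parentheses, so it probably needs quoting when typed in a shell.

```shell
#!/bin/sh
# Placeholder key; replace with a real DeepInfra key when using DeepInfra.
export DEEPINFRA_KEY="your_api_key"

# Values taken from the configuration list. The provider string is
# single-quoted because unquoted parentheses are a shell syntax error.
PROVIDER='openai-compatible(kokoros.transformrs.org)'
MODEL='tts-1'
VOICE='bm_lewis'
AUDIO_FORMAT='wav'   # assumed value; other formats may be supported

# Assemble the flags into one string for review.
TTS_FLAGS="--provider=$PROVIDER --model=$MODEL --voice=$VOICE --audio-format=$AUDIO_FORMAT"

# Print the flags instead of running anything.
printf '%s\n' "$TTS_FLAGS"
```

Printing the flags first makes it easy to check the quoting before appending them to the actual command line.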
Voice style and cost differ between services, so test a few samples before batch generation.
This answer comes from the article "TRV: Rapidly Generate Presentation Videos from Slides/PPTs and Explanatory Notes".