KittenTTS provides basic voice style customization:
- Preset voice selection: Use the `voice` parameter (e.g., `male_clear`) to switch between preset voices of different genders and tones; see the official documentation for the available options.
- Punctuation control: Although the pitch and speed of speech cannot be adjusted directly, punctuation in the text (e.g., commas, exclamation points) indirectly affects speech rhythm and pauses.
Note that compared with more full-featured TTS models (e.g., XTTS-v2), KittenTTS offers only basic voice control; its main advantages are its light weight and runtime efficiency.
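As a rough illustration of the two options, here is a minimal Python sketch. It assumes the model object exposes a `generate(text, voice=...)` method; that call shape, the helper names `punctuate` and `synthesize`, and the `EchoModel` stand-in are all hypothetical and only for illustration. The voice name `male_clear` is the example used in this answer and may not match the names in the official documentation.

```python
def punctuate(phrases, pause=", ", ending="."):
    """Join phrases with punctuation so the synthesizer gets cues for pauses."""
    cleaned = [p.strip() for p in phrases if p.strip()]
    return pause.join(cleaned) + ending


def synthesize(model, text, voice):
    """Request speech in a preset voice.

    Assumption: `model` has a generate(text, voice=...) method. Check the
    official KittenTTS documentation for the real call and voice names.
    """
    return model.generate(text, voice=voice)


class EchoModel:
    """Hypothetical stand-in so the sketch runs without KittenTTS installed."""

    def generate(self, text, voice):
        return f"[{voice}] {text}"


phrases = ["Hello there", "welcome back", "let's begin"]

# Commas suggest short pauses; exclamation points suggest a livelier delivery.
calm = punctuate(phrases)
excited = punctuate(phrases, pause="! ", ending="!")

model = EchoModel()  # replace with a real KittenTTS model object
print(synthesize(model, calm, voice="male_clear"))
print(synthesize(model, excited, voice="male_clear"))
```

With a real model in place of `EchoModel`, the same text rendered with different punctuation should differ in rhythm, while the `voice` argument selects the preset speaker.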
This answer comes from the article "KittenTTS: Lightweight Text-to-Speech Modeling".