Auto-Audio-Book's main innovation is its character recognition and voice assignment system. It first uses AI models to analyze the novel's text and separate character dialogue from narration. It then assigns voices by rule: major characters (those with more than 50 lines) each get an independent voice, minor characters reuse the narrator's voice, and characters with no voice specified are randomly matched with a synthesized voice.
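The assignment rules above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function name `assign_voices`, the `(speaker, text)` segment format, and the voice identifiers are all hypothetical, and it assumes that "unspecified" means a major character for whom the user has not chosen a voice.

```python
import random
from collections import Counter

# From the text: a character with more than 50 lines counts as major.
MAJOR_LINE_THRESHOLD = 50


def assign_voices(segments, narrator_voice, user_voices, voice_pool, seed=0):
    """Map each speaker to a voice id (hypothetical helper, not the project's API).

    segments: list of (speaker, text) pairs; speaker is None for narration.
    user_voices: voices the user chose explicitly, keyed by character name.
    voice_pool: ids of the synthesized voices available for random matching.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same cast
    line_counts = Counter(spk for spk, _ in segments if spk is not None)

    # Voices not yet taken by the narrator or by an explicit user choice.
    available = [
        v for v in voice_pool
        if v != narrator_voice and v not in user_voices.values()
    ]

    assignment = {}
    for speaker, count in line_counts.items():
        if count <= MAJOR_LINE_THRESHOLD:
            # Minor character: reuse the narrator's voice.
            assignment[speaker] = narrator_voice
        elif speaker in user_voices:
            # Major character with a user-specified voice.
            assignment[speaker] = user_voices[speaker]
        else:
            # Major character with no specified voice: random match,
            # preferring a voice nobody else is using yet.
            candidates = available or list(voice_pool) or [narrator_voice]
            voice = rng.choice(candidates)
            if voice in available:
                available.remove(voice)
            assignment[speaker] = voice
    return assignment
```

Narration segments (speaker `None`) are skipped in the count and would simply be read with `narrator_voice`. The random match first draws from voices nobody else is using, so major characters stay distinct for as long as the pool allows.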
This feature relies on two core technologies: 1) the multi-voice support of speech synthesis models such as CosyVoice2-0.5B; and 2) a customizable voice mapping configuration system. Users can manually specify the vocal characteristics of the protagonist and the narrator, including parameters such as gender, speech rate, and pitch. The project documentation provides detailed voice configuration examples, such as the classic pairing of a Chinese male protagonist voice with a female narrator.
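To make the idea of a voice mapping concrete, here is one way such a configuration might be modeled. The `VoiceProfile` class, its field names, the value ranges, and the `profile_for` helper are assumptions made for illustration only. They are not the project's real configuration schema and not part of the CosyVoice2 API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceProfile:
    """Hypothetical voice settings; the fields mirror the parameters named in the text."""
    gender: str               # "male" or "female"
    speech_rate: float = 1.0  # assumed scale: 1.0 means normal speed
    pitch: float = 0.0        # assumed scale: 0.0 means unshifted pitch

    def __post_init__(self):
        # Reject obviously invalid settings early, before any synthesis runs.
        if self.gender not in ("male", "female"):
            raise ValueError(f"unknown gender: {self.gender!r}")
        if self.speech_rate <= 0:
            raise ValueError("speech_rate must be positive")


# Example mapping: a male protagonist voice paired with a female narrator.
voice_map = {
    "narrator": VoiceProfile(gender="female"),
    "protagonist": VoiceProfile(gender="male", speech_rate=1.05),
}


def profile_for(role, mapping):
    """Return the profile for a role, falling back to the narrator's profile."""
    return mapping.get(role, mapping["narrator"])
```

Falling back to the narrator's profile for unknown roles matches the rule that minor characters reuse the narrator's voice.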
Test data shows that the system can process 400 chapters per hour in a multi-threaded environment. Compared with a traditional single-voice TTS system, this multi-character approach makes audiobooks significantly more dramatic and expressive, which suits dialogue-heavy novels especially well. The generated MP3 files can be used directly for radio drama production or distribution on online platforms.
This answer comes from the article "Tool to automatically crawl novels and generate multi-character audiobooks".