AI Text-to-Speech

VibeVoice-1.5B: A Speech Generation Model Supporting Long Audio Multi-Role Conversations from Microsoft
VibeVoice-1.5B is a cutting-edge open-source Text-to-Speech (TTS) model released by Microsoft Research. It is specifically designed for generating expressive, long-form, multi-character dialog audio, such as podcasts or audiobooks. The core innovation of VibeVoice is its use of a 7...
08-27 4.3 K2kudos
Kitten-TTS-Server: a self-deployable lightweight text-to-speech service
Kitten-TTS-Server is an open source project that provides a feature-enhanced server for the lightweight KittenTTS model. Users can use this project to build their own text-to-speech (TTS) service. The core advantage of this project is that it builds on the original model, adding an intuitive web page ...
08-09 3.6K 0 kudos
KittenTTS: A Lightweight Text-to-Speech Model
KittenTTS is an open source text-to-speech (TTS) model focused on being lightweight and efficient. It takes up less than 25MB of storage, has about 15 million parameters, and runs on low-end devices without GPU support. Developed by the KittenML team, KittenTTS offers multiple...
08-06 2.7K 0 kudos
OpusLM_7B_Anneal: an efficient unified model for speech recognition and synthesis
OpusLM_7B_Anneal is an open source speech processing model developed by the ESPnet team and hosted on the Hugging Face platform. It covers a variety of tasks such as speech recognition, text-to-speech, speech translation and speech enhancement, and is suited to researchers and developers experimenting with and building speech processing applications. The model is based on...
08-01 1.5K 0 kudos
MOSS-TTSD: An Open Source Bilingual Dialog Speech Generation Tool
MOSS-TTSD is an open source dialog speech generation model that supports both Chinese and English. It can transform two-person dialog text into natural and expressive speech, suitable for AI podcast production, language research and other scenarios. The model is based on low-bitrate coding technology, and supports zero-shot two-speaker voice cloning and single-pass speech generation of up to 960 seconds. MO...
07-31 2.2K 0 kudos
FineShare: an authoring tool for generating AI speech and music
FineShare is a platform focused on AI audio and video technology, offering a variety of tools to help users create high-quality voice, music and video content. The site's core products include FineVoice, Singify, and FineCam for speech generation and conversion, AI music creation, and virtual camera...
07-29 1.9 K0kudos
Xunfei Zhizuo: Converting Text to Speech and Digital Human Video
Xunfei Zhizuo is a platform developed by iFlytek (Xunfei) to provide artificial intelligence content creation services. Its core function is to convert user-entered text into speech, a process often referred to as “AI dubbing” or “speech synthesis”. Users can choose from a variety of preset virtual voices (i.e., “anchors”) with different styles, such as newscast...
07-27 2.0K 0 kudos
ListenHub: a tool to quickly turn web pages and documents into AI podcasts
ListenHub is a platform that uses artificial intelligence to quickly turn web pages, documents or user input into podcasts. It supports Chinese and English speech synthesis, and users can generate natural, smooth podcast audio by simply uploading a file, typing a topic or pasting a link. The platform is easy to operate and suitable for mobile use, making it convenient for users to listen during commutes, exercise or free time...
07-27 2.6K 0 kudos
Higgs Audio: an open source tool for generating high-quality speech and multi-character conversations
Higgs Audio is an open source text-to-speech (TTS) project developed by Boson AI, focused on generating high-quality, emotionally rich speech and multi-character dialog. The model is trained on over 10 million hours of audio data, and supports zero-shot voice cloning, natural dialog generation, and multilingual speech output. Higgs A...
07-25 3.8K 0 kudos
Parrot TTS: a reading tool that turns web text into natural speech
Parrot TTS is a Chrome extension designed to convert web text into natural speech. It uses advanced AI technology to provide a near-human voice experience, solving the problem of traditional text-to-speech tools sounding mechanical. Users can convert articles, news or research materials to audio in one click, suitable for multitasking...
07-24 1.6 K0kudos
AIdeaFlow Podcast: A Tool for Quickly Turning Text into Professional Podcast Audio
AIdeaFlow Podcast is an AI-based podcast generation platform that allows users to quickly transform text content into high-quality podcast audio. It supports multiple languages and over 120 unique voices for students, professionals and content creators. Users simply enter text or upload a script, and the platform automatically generates a natural pair of...
07-20 1.4 K0kudos
CosyVoice: Alibaba's open source multilingual voice cloning and generation tool
CosyVoice is an open source multilingual speech generation model that focuses on high-quality text-to-speech (TTS) technology. It supports speech synthesis in multiple languages, providing features such as zero-shot speech generation, cross-language voice cloning, and fine-grained emotion control. Compared with the previous version, CosyVoice 2.0 achieves a significant reduction of 30% to...
07-09 3.2K 0 kudos
Qwen-TTS: Speech Synthesis Tool with Chinese Dialect and Bilingual Support
Qwen-TTS is a text-to-speech (TTS) tool developed by the Alibaba Cloud Qwen team and served through the Qwen API. It is trained on a large-scale speech dataset and produces natural, expressive voice output that automatically adjusts intonation, speech rate, and emotion. Qwen-TTS supports Mandarin, English, and ...
07-05 3.8K 0 kudos
Kyutai: Real-time speech-to-text conversion tool
Kyutai Labs' delayed-streams-modeling project is an open source speech-to-text conversion framework built around Delayed Streams Modeling (DSM). It supports real-time speech-to-text (STT) and text-to-speech (TTS), making it suitable for building efficient voice interaction applications. The project provides p...
07-05 3.6K 1 kudos
AIVocal: a free AI tool for generating podcasts and processing audio
AIVocal is a free AI audio processing platform that provides text-to-speech (TTS), speech-to-text (STT), vocal separation and podcast generation. It can be used without registration and supports 24 languages and more than 900 natural voices, making it suitable for producing podcasts, audiobooks, video dubbing, etc. The platform's interface is intuitive, and the operation is...
06-27 2.5K 0 kudos
SuperMaker AI: free authoring tool for generating videos, music and images
SuperMaker AI is a free online authoring platform that helps users quickly generate high-quality video, music, image and voice content. Users can try out the core features without logging in, and the operation is simple enough for individual creators and small teams. The platform uses artificial intelligence technology to transform text, images or creative ideas into professional-grade content, with output results...
06-11 2.7 K0kudos
Muyan-TTS: Personalized Podcast Speech Training and Synthesis
Muyan-TTS is an open source text-to-speech (TTS) model designed for podcasting scenarios. It is pre-trained on over 100,000 hours of podcast audio data and supports zero-shot speech synthesis to generate high-quality natural speech. The model is built on Llama-3.2-3B and, combined with the SoVITS decoder, provides efficient speech...
05-06 2.9K 0 kudos
Kimi-Audio: An Open Source Foundation Model for Audio Processing and Dialog
Kimi-Audio is an open source audio foundation model developed by Moonshot AI that focuses on audio understanding, generation and dialog. It supports a wide range of audio processing tasks such as speech recognition, audio Q&A, and speech emotion recognition. The model has been pre-trained on over 13 million hours of audio data, combined with an innovative hybrid architecture in...
05-05 4.3K 0 kudos
Audibit: turning popular tech articles into ready-to-listen audio podcasts
Audibit is an open source project whose core function is to automatically turn popular technology articles from Hacker News, TechCrunch and other sources into audio podcasts, so that users can listen to the news on the web or on mobile while commuting, working out, or otherwise busy. The project uses Next.js and React for the front end, combined with ...
05-05 2.1K 0 kudos