Kitten-TTS-Server: a self-deployable lightweight text-to-speech service
Kitten-TTS-Server is an open source project that provides a feature-enhanced server for the lightweight KittenTTS model. Users can use this project to build their own text-to-speech (TTS) service. The core advantage of this project is that it is based on the original model, adding a ...
FineShare: an authoring tool for generating AI speech and music
FineShare is a platform focused on AI audio and video technology, offering a variety of tools to help users create high-quality voice, music and video content. The site's core products include FineVoice, Singify, and FineCam for speech generation and conversion, AI music creation, and virtual camera...
Xunfei Zhizuo: Converting Text to Speech and Digital Human Video
Xunfei Zhizuo is a platform developed by iFLYTEK (Xunfei) that provides AI content creation services. Its core function is to convert user-entered text into speech, a process often referred to as "AI dubbing" or "speech synthesis". Users can choose from a variety of preset virtual voices (i.e. "anchors")...
Parrot TTS: a reading tool that turns web text into natural speech
Parrot TTS is a Chrome extension designed to convert web text into natural speech. It uses advanced AI technology to provide a near-human voice experience, solving the problem of traditional text-to-speech tools sounding mechanical. Users can convert articles, news or research materials in one click...
Wavel AI: A Tool for Rapidly Generating Multilingual Video Dubbing and Subtitling
Wavel AI is an AI-based platform focused on helping users quickly create and localize video content. It makes it easy for users to create multilingual video and audio content through features such as voice cloning, text-to-speech and automatic subtitle generation. The platform supports over 70 languages and offers more than 1000...
AIVocal: a free AI tool for generating podcasts and processing audio
AIVocal is a free AI audio processing platform that provides text-to-speech (TTS), speech-to-text (STT), vocal separation and podcast generation. It can be used without registration and supports 24 languages and more than 900 natural voices, making it suitable for producing podcasts, audiobooks, video dubbing and so on...
Dia: a text-to-speech model for generating hyper-realistic multi-speaker conversations
Dia is an open source text-to-speech (TTS) model developed by Nari Labs that focuses on generating hyper-realistic conversational audio. It transforms text scripts into realistic multi-character dialog in a single process, supports emotion and intonation control, and even generates non-verbal expressions such as laughter. At the heart of Dia ...
MiniMax Audio (Conch Speech): AI tool for generating natural speech
MiniMax Audio is an AI speech generation tool from MiniMax whose core feature is quickly converting text into highly realistic natural speech. It is based on the Speech-02 model, with a speech synthesis similarity of up to 99%, studio-grade sound quality, and support for more than 30 languages and a wide range of mouth...
Text2Voice: A Text-to-Speech Graphical Interface Based on the SiliconFlow API
Text2Voice is an open source tool that provides text-to-speech functionality based on the SiliconFlow API. Its main feature is a clean graphical user interface (GUI). It was created by developer Sheldon Lee on GitHub to let users easily turn text into speech through that interface. The project uses Py...
Open-VoiceCanvas: an open source project integrating multiple advanced speech synthesis services
Open-VoiceCanvas is an open source speech synthesis platform developed by the ItusiAI team. It supports more than 50 languages, and can convert text to natural speech, as well as clone personalized voices by uploading audio. The project integrates OpenAI TTS, AWS Polly and MiniM...
Mureka: AI-generated original music tool launched by Kunlun Wanwei
Mureka is an AI music generation platform built by the Chinese company Kunlun Wanwei. It went live in August 2024 and quickly gained attention overseas for its sound quality and ease of use. On March 26, 2025, Mureka launched what it describes as the world's first music reasoning large model, Mureka O1, and a base model, Mureka V6. This...
csm-mlx: CSM speech generation model for Apple devices
csm-mlx builds on the MLX framework developed by Apple to optimize the CSM (Conversational Speech Model) speech model specifically for Apple Silicon. This project allows users to run efficient speech generation on Apple devices in a simple way and...
Autiobooks: convert epub ebooks to m4b audiobooks
Autiobooks is an open source tool designed to help users quickly convert eBooks in .epub format to audiobooks in .m4b format. It uses high quality speech synthesis technology provided by Kokoro to produce natural and smooth audio. The tool was developed by David Nesbitt and follows the MIT ...
PlayHT: an AI tool for generating hyper-realistic speech
PlayHT is an efficient online platform focusing on AI speech generation, helping users quickly convert text into natural, realistic speech. It provides more than 600 AI voices, supports more than 60 languages and diverse accents, and is suitable for a variety of scenarios such as podcast production, educational content, marketing and promotion. Users only need to input...
Spark-TTS: A Text-to-Speech Tool for Generating Natural Speech
Spark-TTS is an open source Text-to-Speech (TTS) tool developed by the SparkAudio team, hosted on GitHub, designed to help users efficiently convert text into natural and smooth speech. It is based on advanced deep learning technology and supports multiple languages and voice styles...
Azure TTS Importer: Integrating speech synthesis services into reading software
TTS Importer is an open source project designed to make it easy to import the Azure TTS (Text-to-Speech) service into various reading apps. The tool supports several popular reading apps, including Read (legado), Love Reader, Source Reader, and more. With TTS Importe...
Kokoro WebGPU: A Text-to-Speech Service for Offline Operation in Browsers
Kokoro WebGPU is a WebGPU version of the Kokoro text-to-speech (TTS) model, provided by WebML Community on the Hugging Face platform. The project utilizes WebGPU technology to enable users to run efficient text-to-speech conversions locally in their browsers. WebG...
Kokoro-ONNX: Efficient Text-to-Speech Tool with Multi-Language and Multi-Voice Support
Kokoro-ONNX is an open source text-to-speech (TTS) tool based on ONNX Runtime. Developed by thewh1teagle, the project aims to provide an efficient and fast speech synthesis solution. Kokoro-ONNX supports multiple languages, including English, with support planned for French, Japanese, Korean and Chinese...
OpenAI Edge TTS: Free text-to-speech API utilizing Edge TTS, compatible with OpenAI formats
OpenAI Edge TTS is an open source project that provides an OpenAI-compatible local text-to-speech (TTS) API. The project uses Microsoft Edge's online text-to-speech service to allow users to generate high-quality speech output. OpenAI Edge TTS supports multiple...