Smart Dictation: an AI audio processing tool that combines transcription, translation and summarization features
Smart Dictation is a powerful macOS app that utilizes advanced AI technology to help users easily convert audio recordings into text. The app integrates OpenAI's latest GPT-4o and Whisper models to provide accurate transcription, translation and summarization services. Whether you are memorizing .....
Voquill: Browser Plugin for Converting Speech to Text
Voquill is an AI-powered Chrome extension. It lets users dictate by voice instead of typing on any website. When you're writing an email, replying to a chat message, or editing a document, you can just speak and Voquill will convert your voice into text in real time. In addition to basic voice listening...
Grabcube: a free video download tool with AI transcription and translation
Grabcube is a free audio and video processing tool that specializes in video and audio downloads, AI speech-to-text, and subtitle translation and editing. It supports more than 1,000 mainstream platforms, including YouTube, Bilibili, and Vimeo, and allows users to download video and audio files in multiple formats without limitations. Grabcu...
Kitten-TTS-Server: a self-deployable lightweight text-to-speech service
Kitten-TTS-Server is an open source project that provides a feature-enhanced server for the lightweight KittenTTS model. Users can use this project to build their own text-to-speech (TTS) service. The core advantage of this project is that it builds on the original model, adding a ...
AI-Chatbox: Speech-to-Text Intelligent Dialogue Project based on ESP32S3
AI-Chatbox is a voice interaction project based on the ESP32S3 development board. Users talk to a large language model (LLM) by voice: the device converts the speech to text and sends it to the model, and the answer can then be converted back to speech for playback. The project is developed in Rust, integrated with Vosk speech recognition worker...
Whisper on Cloudflare AI: a free tool to convert audio to text and generate subtitles
Whisper_Cloudflare is an open source project created by developer thun888 and hosted on GitHub. It is based on OpenAI's Whisper model and combines the serverless architecture of Cloudflare Workers to provide highly efficient speech-to-text...
Spokenly: a speech-to-text tool for macOS
Spokenly is a speech-to-text tool built for macOS that helps users quickly enter text by voice and work more efficiently. It utilizes advanced AI technologies (such as Whisper and GPT-4o) to convert speech to text in real time, supports over 100 languages, and is suitable for a wide range of scenarios...
Vibe Musicing: AI music generator (free, online)
Vibe Musicing is a free online AI music generator that lets anyone quickly create original songs without any musical background. Users can choose the music style, fill in the lyrics, or let AI automatically generate the lyrics to easily customize the melody, rhythm, and atmosphere according to their needs. Vibe Musicing...
AI Song Creator: AI tool to quickly turn text into high-quality original music
AI Song Creator is an online AI music generation platform that allows users to generate professional-quality original music and lyrics in 30-90 seconds by entering a text description or lyrics. The site supports more than 40 music styles, including electronic dance music, Lo-Fi, classical and K-Pop, and is suitable for content creators, tourists ....
OpenWispr: Privacy-First Speech-to-Text Desktop Application
OpenWispr is an open source desktop speech-to-text application based on OpenAI Whisper technology that quickly converts user speech to text. It offers local and cloud processing options and emphasizes privacy protection: data can stay entirely on the local machine. Users can quickly start dictation via global hotkeys, and the text automatically sticks...
TEN: An open source tool for building real-time multimodal voice AI agents
TEN Framework is an open source software platform focused on helping developers build real-time, multimodal, low-latency voice AI agents. It supports multiple programming languages, including C, C++, Go, Python, JavaScript, and TypeScript. Developers can use the TEN Framework to quickly create speech, vision ...
Zaia Health: the AI voice assistant that monitors and improves health habits
Zaia Health is an Artificial Intelligence health app that centers around a voice assistant called "Zaia". The app is designed to help users focus on and improve their health habits. It acts as a personal health companion through voice interaction, guiding users through sleep, exercise, nutrition and mental...
FineShare: an authoring tool for generating AI speech and music
FineShare is a platform focused on AI audio and video technology, offering a variety of tools to help users create high-quality voice, music and video content. The site's core products include FineVoice, Singify, and FineCam for speech generation and conversion, AI music creation, and virtual camera...
SpleeterGui: Easy Music Track Separation Tool
SpleeterGui is a desktop application for Windows users, based on Spleeter, an open source music separation library developed by Deezer. With a simple graphical interface, it allows the user to separate music files into multiple tracks, such as vocals, drums, bass, etc., without having to use the command line. Users can ...
Xunfei Zhizuo: Converting Text to Speech and Digital Human Video
Xunfei Zhizuo is a platform developed by Xunfei to provide artificial intelligence content creation services. Its core function is to convert user-entered text into speech, a process often referred to as "AI dubbing" or "speech synthesis". Users can choose from a variety of preset virtual voices (i.e. "anchors")...
Any2Text: Free AI tool for converting audio and video to text
Any2Text is a free online tool focused on converting audio and video files to text quickly. It utilizes advanced AI speech recognition technology, supports over 100 languages, and is suitable for a variety of scenarios such as meeting recording, podcast transcription and subtitle generation. Users don't need to register to use it, and it is easy to operate on...
Parrot TTS: a reading tool that turns web text into natural speech
Parrot TTS is a Chrome extension designed to convert web text into natural speech. It uses advanced AI technology to provide a near-human voice experience, solving the problem of traditional text-to-speech tools sounding mechanical. Users can convert articles, news or research materials in one click...
Wavel AI: A Tool for Rapidly Generating Multilingual Video Dubbing and Subtitling
Wavel AI is an AI-based platform focused on helping users quickly create and localize video content. It makes it easy for users to create multilingual video and audio content through features such as voice cloning, text-to-speech and automatic subtitle generation. The platform supports over 70 languages and offers more than 1000...
wukong-robot: a smart speaker project to create personalized Chinese voice conversations
wukong-robot is an open source Chinese voice conversation robot and smart speaker project, designed to help developers quickly build personalized smart speakers. It supports Chinese speech recognition, speech synthesis and multi-turn dialogue, and integrates with ChatGPT, Baidu, iFlytek and other technologies. The project is designed to be modular,...