
Smart Dictation is a macOS app that uses artificial intelligence to convert audio recordings into text. It integrates OpenAI's GPT-4o and Whisper models to provide accurate transcription, translation and summarization. Whether you are recording a meeting...

Voquill is an AI tool that runs in Chrome. It lets users dictate by voice instead of typing on any website. Whether you are writing an email, replying to a chat message, or editing a document, you can just speak and Voquill converts your voice into text in real time. In addition to basic voice dictation, this tool offers a...

Grabcube is a free audio and video processing tool that specializes in video and audio downloads, AI speech-to-text, and subtitle translation and editing. It supports over 1,000 major platforms, including YouTube, Bilibili and Vimeo, and allows users to download video and audio files in multiple formats without restrictions. Grabcub...

Recap is an open source tool for macOS that helps users quickly record, transcribe and summarize meeting audio. It processes all data locally without uploading it to the cloud, protecting user privacy. Developer Rawand Ahmad built Recap to address the difficulty of following a discussion and taking notes at the same time in meetings. Re...

Whisper_Cloudflare is an open source project created by developer thun888 and hosted on GitHub. It is based on OpenAI's Whisper model and uses the serverless architecture of Cloudflare Workers to provide highly efficient speech-to-text...

Spokenly is a speech-to-text tool for macOS, designed to help users enter text quickly by voice and work more efficiently. It uses AI models (such as Whisper and GPT-4o) to convert speech to text in real time, supports over 100 languages, and is suitable for a variety of scenarios, such as...

OpusLM_7B_Anneal is an open source speech processing model developed by the ESPnet team and hosted on the Hugging Face platform. It covers a variety of tasks such as speech recognition, text-to-speech, speech translation and speech enhancement, and is suitable for researchers and developers who want to experiment with and build speech processing applications. The model is based on...

OpenWispr is an open source desktop speech-to-text application based on OpenAI Whisper technology that quickly converts user speech to text. It offers local and cloud processing options and emphasizes privacy protection: data can stay entirely on the local machine. Users can start dictation with a global hotkey, and the text is automatically pasted at the cursor position, suitable for...

vosk-browser is a speech recognition tool that runs in the browser, built on WebAssembly technology, using the Vosk speech recognition library. It supports processing microphone input or audio files directly in the browser, providing offline speech-to-text functionality without relying on cloud servers. The tool supports English, German...

Any2Text is a free online tool focused on converting audio and video files to text quickly. It uses AI speech recognition, supports over 100 languages, and is suitable for a variety of scenarios such as meeting recording, podcast transcription and subtitle generation. No registration is required and it is easy to operate: users upload files to get high-precision text ending...

Whisper App is a free and open source tool that allows users to record notes by voice and uses AI to convert the voice to text, generating content such as lists, blogs or tasks. Developed by Nutlope and hosted on GitHub, the project is based on the Whisper model served through Together.ai...

Voxtral is the first open audio model from French AI startup Mistral AI, released on July 15, 2025. Voxtral aims to give commercial applications production-ready speech understanding capabilities out of the box, at a highly competitive price. The Voxtral model is available in two versions for production...

SimpleListenJournal is an audio/video-to-text tool from Baidu that focuses on quickly converting voice or video content to text and provides AI-powered analysis. Users can upload audio or video, or input text, to get high-precision transcription results and automatic summaries. The platform supports multiple languages and is suitable for a variety of scenarios such as meeting minutes, course notes and podcast organization. Boundary ...

Tencent Meeting AI Little Assistant Pro is an intelligent meeting assistance tool launched by Tencent, aiming to improve the efficiency and convenience of online meetings. It uses artificial intelligence to analyze meeting content in real time, providing personalized reminders, summarizing key information and generating to-do lists to help users focus on the discussion without missing the point. AI Little Assistant Pro supports multi-scenario use, covering...

Flash Notes is a smart note-taking tool launched by Nail, designed to help users quickly record, organize and share information. It supports a variety of recording methods such as voice, text and images, making it suitable for individuals and teams who need to manage notes efficiently in work, study or life. Flash Notes converts voice to text and automatically organizes the content to reduce manual input. Users can pin...

Kyutai Labs' delayed-streams-modeling project is an open source speech-to-text conversion framework built around Delayed Streams Modeling (DSM). It supports real-time speech-to-text (STT) and text-to-speech (TTS) functions, making it suitable for building efficient voice interaction applications. The project provides p...

Very Fast Dictation is an open source speech-to-text tool designed for Mac users. It uses fast speech recognition to convert what the user says into text in real time, for any scenario that requires text input. The project is hosted on GitHub, developed by Avi Aryan, and uses...

Simple Subtitling is an open source subtitle generation tool that focuses on automatically generating subtitles and labeling speakers for video or audio files. Developed by Jaesung Huh and hosted on GitHub, the project aims to provide a simple and efficient subtitle generation solution. Using audio processing technology, the tool ...

Abogen is an open source tool designed to quickly convert ePub, PDF or plain text files to high quality audio. It uses the Kokoro-82M model to generate natural, smooth speech and supports synchronized subtitle generation, making it suitable for audiobooks, video dubbing or learning aids. Users can choose between multiple languages and male and female...

