
Smart Dictation is a powerful macOS app that utilizes advanced artificial intelligence technology to help users easily convert audio recordings into text. The app integrates OpenAI's latest GPT-4o and Whisper models to provide accurate transcription, translation and summarization. Whether you are recording a meeting...

Voquill is an AI tool installed in Chrome. It allows users to use voice input instead of keyboard typing on any website. When you're writing an email, replying to a chat message, or editing a document, you can just speak and Voquill will convert your voice into text in real time. In addition to basic voice dictation, this tool offers a...

Grabcube is a free audio and video processing tool that specializes in video and audio downloads, AI speech-to-text, and subtitle translation and editing. It supports over 1,000 major platforms, including YouTube, Bilibili and Vimeo, and allows users to download video and audio files in multiple formats without restrictions. Grabcub...

Recap is an open source tool designed for macOS to help users quickly record, transcribe and summarize meeting audio. It handles all the data locally without uploading it to the cloud, protecting user privacy. Developer Rawand Ahmad built Recap to address the difficulty of focusing on both discussion and recording in meetings. Re...

Whisper_Cloudflare is an open source project created by developer thun888 and hosted on GitHub. It is based on OpenAI's Whisper model and combines the serverless architecture of Cloudflare Workers to provide highly efficient speech-to-text...

Spokenly is a speech-to-text tool for macOS that helps users quickly enter text by voice and improve work efficiency. It uses AI models such as Whisper and GPT-4o to convert speech to text in real time, supports over 100 languages, and is suitable for a variety of scenarios, such as...

OpusLM_7B_Anneal is an open source speech processing model developed by the ESPnet team and hosted on the Hugging Face platform. It focuses on a variety of tasks such as speech recognition, text-to-speech, speech translation and speech enhancement, and is suitable for researchers and developers to experiment and apply in the field of speech processing. The model is based on...

OpenWispr is an open source desktop speech-to-text application based on OpenAI Whisper technology that quickly converts user speech to text. It offers local and cloud processing options and emphasizes privacy protection, so data can stay entirely local. Users can quickly start dictation with global hotkeys, and text is automatically pasted at the cursor position, suitable for...

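vosk-browser is a speech recognition tool that runs in the browser, built on WebAssembly and using the Vosk speech recognition library. It processes microphone input or audio files directly in the browser and provides offline speech-to-text without relying on cloud servers. The tool supports English, German...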

Any2Text is a free online tool focused on converting audio and video files to text quickly. It uses AI speech recognition technology, supports over 100 languages, and is suitable for a variety of scenarios such as meeting recording, podcast transcription and subtitle generation. It requires no registration and is easy to operate: users can upload files to get high-precision text ending...

Whisper App is a free and open source tool that allows users to record notes by voice and use AI technology to convert the voice to text, generating content such as lists, blogs or tasks. Developed by Nutlope and hosted on GitHub, the project is based on the Whisper model hosted on Together.ai...

Voxtral is the first open audio model from French AI startup Mistral AI, released on July 15, 2025. Voxtral aims to provide commercial applications with out-of-the-box speech understanding capabilities for production environments, at a highly competitive price. The Voxtral model is available in two versions for production...

SimpleListenJournal is an audio/video-to-text tool from Baidu that focuses on quickly converting voice or video content to text and provides AI-powered analysis. Users can upload audio or video, or input text, to get high-precision transcription results and automatic summaries. The platform supports multiple languages and is suitable for a variety of scenarios such as meeting minutes, course notes and podcast organizing. Boundary ...

Tencent Meeting AI Little Assistant Pro is an intelligent meeting assistance tool launched by Tencent that aims to improve the efficiency and convenience of online meetings. It uses artificial intelligence to analyze meeting content in real time, providing personalized reminders, summarizing key information and generating to-do lists to help users focus on the discussion without missing the point. AI Little Assistant Pro supports multi-scenario use, covering...

Flash Notes is a smart note-taking tool launched by DingTalk, designed to help users quickly record, organize and share information. It supports a variety of recording methods such as voice, text and images, and is suitable for individuals and teams who need to manage notes efficiently in work, study or life. Flash Notes converts voice to text and automatically organizes the content to reduce manual input. Users can pin...

Kyutai Labs' delayed-streams-modeling project is an open source speech-to-text conversion framework built around Delayed Stream Modeling (DSM) technology. It supports real-time speech-to-text (STT) and text-to-speech (TTS) functions, making it suitable for building efficient voice interaction applications. The project provides p...

Very Fast Dictation is an open source speech-to-text tool designed for Mac users. It uses fast speech recognition technology to convert what the user says into text in real time, for any scenario that requires text input. The project is hosted on GitHub, developed by Avi Aryan, and uses...

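Simple Subtitling is an open source audio subtitling tool that focuses on automatically generating subtitles for video or audio files and labeling speaker identity. Developed by Jaesung Huh and hosted on GitHub, the project aims to provide a simple and efficient subtitle generation solution. Using audio processing techniques, the tool...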

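Abogen is an open source tool designed to quickly convert ePub, PDF or plain-text files into high-quality audio. It uses the Kokoro-82M model to generate natural, fluent speech and also supports synchronized subtitle generation, making it suitable for producing audiobooks, video voice-overs or study aids. Users can choose from multiple languages and male or female...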

