Whisper App is a free, open source tool that lets users record notes by voice and uses AI to convert the speech to text, generating content such as lists, blogs, or tasks. The project is developed by Nutlope and hosted on GitHub. It relies on the Whisper and Llama models served by Together.ai for transcription and text processing. Whisper App is simple to use, with an intuitive interface for users who want to quickly record and organize content. The code is completely open source and can be freely deployed by users, and the data is stored locally with a focus on privacy protection.
Function List
- Voice Recording and Transcription: Record voice through the microphone and quickly convert it to text.
- AI Text Organizer: Convert transcribed text into lists, blogs, or task lists.
- Multi-language Support: Supports voice transcription in multiple languages, such as English and Chinese.
- Local Storage: Recordings and text are stored on the user's device to protect privacy.
- Open Source Deployment: Provide complete code to support local or cloud deployment.
- Third-party Service Integration: Uses Together.ai for the AI models and Convex for the database.
- Customized Output: Supports adjusting text formatting, such as list styles or blog structures.
User Guide
Installation process
To use Whisper App, users need to deploy the project locally or in the cloud. Below are the detailed steps:
- Cloning Project Code
Run the following command in the terminal to get the Whisper App code:
git clone https://github.com/Nutlope/whisper.git
Go to the project directory:
cd whisper
- Installation of dependencies
Make sure Node.js is installed (latest LTS version recommended). Run the following command to install the dependencies:
npm install
This will install the necessary packages such as Next.js, Vercel AI SDK, etc.
- Configuring Environment Variables
Whisper App uses Clerk for authentication and Convex for database support. The configuration steps are as follows:
  - Register for a Clerk account (https://clerk.com) and get CLERK_SECRET_KEY and NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY.
  - In the project root directory, create a .env.local file and add:
    CLERK_SECRET_KEY=your_clerk_secret_key
    NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your_clerk_publishable_key
  - Log in to Convex (https://convex.dev) and create a project. Get the CLERK_ISSUER_URL (e.g. https://some-animal-123.clerk.accounts.dev).
  - In the Convex Dashboard, add CLERK_ISSUER_URL and click "Save".
- Running Projects
Once the configuration is complete, start the development server:
npm run dev
The project runs on http://localhost:3000. Open that address in your browser.
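Before starting the server, it can help to confirm that the Clerk keys from the configuration step are actually set, since a missing or blank key is a common cause of first-run failures. The following is a minimal sketch and is not taken from the Whisper App repository; the helper name `findMissingEnvVars` and the `Env` type are hypothetical, and only the two variable names come from this guide.

```typescript
// Hypothetical helper, not part of the Whisper App codebase.
// The two names below are the variables this guide places in .env.local.
const REQUIRED_ENV_VARS = [
  "CLERK_SECRET_KEY",
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
] as const;

// A plain key/value view of the environment (process.env has this shape in Node.js).
type Env = Record<string, string | undefined>;

// Returns the names of required variables that are absent or contain only whitespace.
function findMissingEnvVars(env: Env): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js script, passing `process.env` to `findMissingEnvVars` returns an empty array when both keys are set; any names it returns point to the entries in .env.local that still need a value.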
Usage
The Whisper App has a simple interface, which is suitable for getting started quickly. Below is a guide to the main features:
1. Recording and transcription
- Visit the Whisper App page and log in to your account using Clerk.
- Click the "Record" button to authorize browser microphone access.
- Start recording and click "Stop" when finished. It is recommended that you record no more than 5 minutes at a time to ensure accurate transcription.
- The system converts speech to text using Together.ai's Whisper model and the results are displayed on the page.
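The 5-minute recommendation above can be enforced with a small client-side guard. This is an illustrative sketch only: `isWithinRecommendedLength` and `formatElapsed` are hypothetical names, and nothing here is confirmed to exist in the Whisper App code.

```typescript
// Hypothetical guard; the 5-minute figure is the recommendation given in this guide.
const RECOMMENDED_MAX_SECONDS = 5 * 60;

// Returns true when a recording is non-empty and within the recommended length.
function isWithinRecommendedLength(durationSeconds: number): boolean {
  return durationSeconds > 0 && durationSeconds <= RECOMMENDED_MAX_SECONDS;
}

// Formats elapsed seconds as m:ss, e.g. for a timer shown while recording.
function formatElapsed(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const minutes = Math.floor(whole / 60);
  const seconds = whole % 60;
  const paddedSeconds = seconds < 10 ? "0" + seconds : String(seconds);
  return minutes + ":" + paddedSeconds;
}
```

A recording UI could call `isWithinRecommendedLength` when the user clicks "Stop" and suggest splitting the note into shorter recordings when it returns false.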
2. Text organization
- Once the transcription is complete, choose the output format (e.g., list, blog, task list).
- Selecting "List" generates a list of entries; selecting "Blog" organizes the text into a post with a title.
- Users can edit the text, adjust the content or add details.
- Click "Save" to store the results in your local IndexedDB database.
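One way to picture the format selection described above is as a lookup from the chosen format to an instruction that is sent to the language model together with the transcript. The sketch below is an assumption about how this could be structured, not the app's actual implementation: the `OutputFormat` type, the `buildPrompt` function, and the instruction wording are all hypothetical.

```typescript
// Hypothetical sketch; the formats mirror the options named in this guide,
// and the instruction wording is an assumption, not the app's real prompts.
type OutputFormat = "list" | "blog" | "tasks";

const INSTRUCTIONS: Record<OutputFormat, string> = {
  list: "Rewrite the transcript as a concise bulleted list.",
  blog: "Rewrite the transcript as a blog post with a title.",
  tasks: "Extract the action items from the transcript as a task list.",
};

// Combines the format-specific instruction with the trimmed transcript text.
function buildPrompt(format: OutputFormat, transcript: string): string {
  const text = transcript.trim();
  if (text === "") {
    throw new Error("Transcript is empty; nothing to organize.");
  }
  return `${INSTRUCTIONS[format]}\n\nTranscript:\n${text}`;
}
```

Keeping the instructions in one table means a new output format only needs a new entry in `INSTRUCTIONS` and in the `OutputFormat` type.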
3. Customization and optimization
- On the Settings screen, adjust the output formatting, such as the bullet points for lists or the paragraph style for blogs.
- The Llama models can further refine the text, for example by correcting grammar or translating it into another language.
- The target language (e.g. Chinese, English) can be selected in the settings for transcription or translation.
4. Data management and privacy
- The Whisper App stores recordings and text locally in IndexedDB by default and does not upload to the cloud.
- To clear the data: Clear IndexedDB in your browser developer tools or delete the local path %APPDATA%\..\Local\com.bradenwong.whispering (Windows).
- The transcription process requires a connection to Together.ai, so a stable network connection is recommended.
Caveats
- A stable internet connection is required to access the Together.ai and Convex services.
- If the microphone does not work, check system permissions (Windows: Settings > Privacy > Microphone; Mac: System Preferences > Security & Privacy > Microphone).
- Because the project relies on external APIs, check your free quota or subscription status on Together.ai.
- First-time deployments may require debugging environment variables, so we recommend referring to the GitHub documentation.
Application Scenarios
- Organization of meeting records
Users record discussions in meetings and the Whisper App quickly generates minutes or task lists for team collaboration.
- Record of study notes
Students record class or lecture audio, which the Whisper App turns into structured notes for easy review and organization.
- Blog content creation
Content creators capture their ideas by voice, and the Whisper App organizes them into article drafts to improve writing efficiency.
- Personal task planning
Users record daily to-dos, which the Whisper App turns into task lists to help manage time.
FAQ
- What languages does the Whisper App support?
Based on Together.ai's Whisper model, it supports English, Chinese, Spanish and other languages. A detailed list of supported languages can be found on the Together.ai website.
- Do I have to pay to use the Whisper App?
The Whisper App is free and open source. External services (e.g. Together.ai, Convex) may incur fees depending on usage.
- How is data privacy protected?
Recordings and transcribed text are stored locally in IndexedDB. Audio is sent to Together.ai for transcription only and is not stored on any other server.
- What technical foundation is required for deployment?
Familiarity with basic Node.js and command line operations is sufficient, and the GitHub documentation provides detailed instructions for beginners.