OpenWispr is an open source desktop speech-to-text application built on OpenAI Whisper that quickly converts speech to text. It offers both local and cloud processing and emphasizes privacy: in local mode, data never leaves the device. Users start dictation with a global hotkey, and the transcribed text is automatically pasted at the cursor position, which makes it suitable for writing, programming, meeting notes, and similar tasks. OpenWispr runs on macOS, Windows, and Linux and offers a range of Whisper models that trade off speed against accuracy. It has a modern interface with a draggable panel, and its community-driven development model lets users customize it freely.
Function List
- Real-time speech to text, automatically pastes the transcribed text to the cursor position.
- Supports local processing: voice data is not uploaded to the cloud, which protects privacy and security.
- Provides cloud processing options for faster transcription via the OpenAI API.
- Global hotkey (default: backtick, `) to quickly start or stop dictation.
- The dictation panel can be dragged to any position on the screen.
- Supports multiple Whisper models (tiny, base, small, medium, large) to suit different needs.
- Provides agent naming: give the AI assistant a personal name, which lets the application distinguish commands from regular dictation.
- Built-in control panel to manage settings, view transcription history, and configure API keys.
- Use a SQLite database to store transcription history locally for easy viewing and management.
- Cross-platform support, compatible with macOS, Windows and Linux.
- Open source code, under the MIT license, allowing free modification and distribution.
Usage Guide
Installation process
The open source edition of OpenWispr must be installed manually. It is intended for technical users and those who need customization. The detailed steps follow:
Open Source Edition Installation
- Clone the code: Visit https://github.com/HeroTools/open-wispr, then run the following commands:
git clone https://github.com/HeroTools/open-wispr.git
cd open-wispr
- Install dependencies: Ensure that Node.js 18+ and npm are installed locally, then run:
npm install
- Configure the environment (optional; an OpenAI API key is required only for cloud processing):
- Copy the environment template file:
cp env.example .env
- Edit the .env file and add the OpenAI API key:
OPENAI_API_KEY=your_openai_api_key_here
- Or configure the key via the control panel (after launching the application).
- Configure local processing (optional):
- Python 3.7+ is required; the program checks for it and prompts you to install Python if it is missing.
- Download Whisper models (tiny, base, small, medium, large) via the control panel.
- Run the program:
- Development mode (with hot reload):
npm run dev
- Production mode:
npm start
- Verify the installation: After launch, click the system tray icon to open the control panel and check the status, or press the default hotkey (`) to test dictation.
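As a rough illustration of the optional configuration step, the sketch below parses .env-style text and reports whether an OpenAI API key has been set. The function names `parse_env` and `has_api_key` are hypothetical and are not part of OpenWispr; only the variable name OPENAI_API_KEY and the template placeholder come from the steps above.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(env: dict) -> bool:
    """True only if a key is set and is not the template placeholder."""
    key = env.get("OPENAI_API_KEY", "")
    return bool(key) and key != "your_openai_api_key_here"
```

A check like this would treat an untouched copy of the template as "no key", which is the case where only local processing can work.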
Building standalone applications (optional)
If you need to generate a standalone executable:
- Run the following command:
npm run pack
- Output paths:
- macOS: dist/mac-arm64/OpenWispr.app
- Windows: dist/win-unpacked/OpenWispr.exe
- Linux: dist/linux-unpacked/open-wispr
- Note: The first time you run an unsigned app on macOS, you may need to right-click it and select "Open" to bypass the security warning.
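To locate the packaged application from a script, the output paths listed above can be keyed by platform. This is a minimal sketch: the helper `expected_build_output` is hypothetical, and the macOS entry assumes an arm64 build, as the directory name suggests.

```python
import sys

# Paths as listed above; the macOS entry assumes an arm64 build.
BUILD_OUTPUTS = {
    "darwin": "dist/mac-arm64/OpenWispr.app",
    "win32": "dist/win-unpacked/OpenWispr.exe",
    "linux": "dist/linux-unpacked/open-wispr",
}


def expected_build_output(platform: str = sys.platform) -> str:
    """Return the expected output path for a sys.platform-style string."""
    for prefix, path in BUILD_OUTPUTS.items():
        if platform.startswith(prefix):
            return path
    raise ValueError(f"no known build output for platform {platform!r}")
```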
Permission settings
- Microphone permission: Grant OpenWispr microphone access the first time it runs.
- Accessibility permissions (macOS): For the auto-paste feature, enable OpenWispr in System Settings > Privacy & Security > Accessibility.
- If the permission issue persists, open the Control Panel and click "Fix Permission Issues" to fix it.
Main Functions
Real-time speech to text
- Launch OpenWispr and the screen displays a small draggable dictation panel.
- Press the global hotkey (default: `). The panel displays the recording animation; start talking.
- Press the hotkey again to stop recording. The panel displays the processing animation, and the transcribed text is automatically pasted at the cursor position.
- Drag the panel to any position on the screen for easy multi-window operation.
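The hotkey behavior described above amounts to a small state machine: idle, recording, processing, then back to idle once the text is pasted. The sketch below models that flow only; the class and method names are hypothetical, and ignoring hotkey presses during processing is an assumption rather than documented behavior.

```python
from enum import Enum


class PanelState(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    PROCESSING = "processing"


class DictationPanel:
    """Toy model of the dictation panel's hotkey-driven states."""

    def __init__(self):
        self.state = PanelState.IDLE
        self.pasted = []  # stands in for text pasted at the cursor

    def press_hotkey(self):
        if self.state is PanelState.IDLE:
            self.state = PanelState.RECORDING
        elif self.state is PanelState.RECORDING:
            self.state = PanelState.PROCESSING
        # Assumption: presses while processing are ignored.

    def finish_transcription(self, text: str):
        if self.state is not PanelState.PROCESSING:
            raise RuntimeError("no recording is being processed")
        self.pasted.append(text)
        self.state = PanelState.IDLE
```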
Choosing a Processing Mode
- Open the control panel (right-click the system tray icon > Control Panel).
- Select the processing mode:
- Local processing: Download a Whisper model (tiny is the fastest, large gives the highest quality); data never leaves the device.
- Cloud processing: Enter the OpenAI API key for faster processing; a network connection is required.
- The mode takes effect immediately after you save the settings.
Agent Naming
- Name the AI assistant (e.g. "Jarvis") during initial setup or in the control panel.
- Use agent commands (e.g. "Hey Jarvis, format as list") to trigger AI assistance.
- Regular dictation does not need the agent name; the text is recorded directly.
- The AI automatically distinguishes commands from regular dictation and removes the agent name from the output.
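To make the command/dictation split concrete, here is a deliberately simple sketch that treats any utterance starting with the agent name (optionally preceded by "hey") as a command and strips the name from the output. This prefix heuristic is an assumption for illustration; the text above says OpenWispr relies on AI for the detection, and the function name `split_agent_command` is hypothetical.

```python
import re


def split_agent_command(text: str, agent_name: str):
    """Return (is_command, cleaned_text) using a prefix heuristic."""
    pattern = re.compile(
        r"^\s*(?:hey\s+)?" + re.escape(agent_name) + r"\b[\s,.:!]*",
        re.IGNORECASE,
    )
    match = pattern.match(text)
    if match is None:
        # No leading agent name: treat as regular dictation.
        return False, text.strip()
    # Leading agent name found: strip it and return the command body.
    return True, text[match.end():].strip()
```

With the agent named "Jarvis", "Hey Jarvis, format as list" is classified as a command with body "format as list", while "Buy milk tomorrow" passes through as dictation.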
Managing Transcription History
- Open the control panel and click on "History" to view all transcription records.
- Supports copying, deleting or searching for historical transcriptions.
- All records are stored in a local SQLite database located in the user data directory.
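The history operations above (store, search, delete) map naturally onto a single SQLite table. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table name, columns, and helper functions are all hypothetical, since the actual OpenWispr schema is not described here.

```python
import sqlite3


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a history database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS transcriptions (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               text TEXT NOT NULL,
               created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def add_entry(conn: sqlite3.Connection, text: str) -> int:
    """Insert one transcription and return its row id."""
    cur = conn.execute("INSERT INTO transcriptions (text) VALUES (?)", (text,))
    conn.commit()
    return cur.lastrowid


def search(conn: sqlite3.Connection, term: str) -> list:
    """Return transcription texts containing the term, oldest first."""
    rows = conn.execute(
        "SELECT text FROM transcriptions WHERE text LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


def delete_entry(conn: sqlite3.Connection, entry_id: int) -> None:
    """Delete one transcription by row id."""
    conn.execute("DELETE FROM transcriptions WHERE id = ?", (entry_id,))
    conn.commit()
```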
Customizing Hotkeys
- In the "Settings" section of the control panel, click on the "Hotkey" option.
- Press the new key combination (e.g. Ctrl+Alt+V) and save it.
- If there is a hotkey conflict, you can change the hotkey to another key combination at any time.
Using Key Features
Local Whisper Processing
- Select "Local Processing" from the control panel.
- The program automatically detects the Python environment and prompts you to install Python 3.11 if it is missing.
- Select a model (tiny/base/small/medium/large); it is downloaded automatically (39MB-1.5GB, depending on the model).
- Ensure that you have enough disk space for the download. Once downloaded, the model can be used offline.
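Before downloading a model, free disk space can be checked with the standard library. This is a minimal sketch: the helper `has_space_for_model` is hypothetical, and the constant simply encodes the 1.5GB upper bound quoted above rather than a measured file size.

```python
import shutil

# Upper bound of the model size range quoted above (1.5GB).
LARGEST_MODEL_BYTES = int(1.5 * 1024**3)


def has_space_for_model(model_bytes: int, directory: str = ".") -> bool:
    """True if the filesystem holding `directory` has enough free space."""
    free = shutil.disk_usage(directory).free
    return free >= model_bytes
```

Calling `has_space_for_model(LARGEST_MODEL_BYTES)` checks the worst case for the current directory; pass the actual model directory if it lives on a different volume.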
Cloud Processing
- Enter a valid OpenAI API key in the control panel.
- Select Cloud Processing Mode and the program processes the speech through the OpenAI Whisper API.
- Check the API key status (the control panel shows "OpenAI API Key present: Yes/No").
Drag-and-drop Interface
- Click on the top of the Dictation panel and drag it anywhere on the screen.
- If the panel moves off-screen, restarting the app resets the position.
Cross-platform support
- OpenWispr is compatible with macOS 10.15+, Windows 10+ and Linux.
- In any text editor (e.g. VS Code, Notion) or browser, press the hotkey to enter text.
- Ensure that accessibility permissions are enabled to support cross-application auto-pasting.
Caveats
- Local processing requires a high performance device (8GB RAM, fast CPU recommended).
- Cloud processing requires a stable network and a valid OpenAI API key.
- See the DEBUG.md file for how to obtain debug logs when resolving operational issues.
- If the microphone or paste function does not work, check the system permission settings.
Application Scenarios
- Effective Writing
Writers and content creators can quickly produce a first draft by voice. OpenWispr's global hotkey and auto-paste make input smooth, which suits blogging, report writing, or fiction.
- Programming Notes
Developers can use voice to quickly record code comments or technical documentation. Cross-platform support ensures seamless operation in editors such as VS Code and PyCharm.
- Meeting Notes
Students and working professionals can record meetings by voice. Local processing mode protects sensitive information, and the transcription history makes notes easy to organize and review.
- Multilingual Transcription
OpenWispr supports 58 languages (including Chinese, English, and Japanese), which suits translators and international communication. It detects the language automatically, or a preferred language can be set through the .env file.
QA
- Is OpenWispr completely free?
Yes. OpenWispr is free and open source under the MIT license. Cloud processing incurs OpenAI API fees.
- What is the difference between local and cloud processing?
With local processing, data never leaves the device, which suits privacy-sensitive scenarios but requires more capable hardware. Cloud processing is faster but requires a network connection and an API key.
- How do I resolve hotkey conflicts?
Change the hotkey under "Settings" in the control panel; any key combination is supported.
- What languages are supported?
58 languages are supported, including Chinese, English, and Spanish. Set a preferred language in the .env file, or use automatic detection.
- How is data security ensured?
In local processing mode, audio is not uploaded to the cloud. Cloud processing is subject to the OpenAI privacy policy, and API keys are stored securely through the system key manager.
- What if the transcribed text is not automatically pasted?
Check that macOS accessibility permissions are enabled, or paste manually (Cmd+V). The issue can also be fixed via "Fix Permission Issues" in the control panel.