Spokenly is a speech-to-text tool for macOS that helps users enter text quickly by voice and work more productively. It uses AI models such as Whisper and GPT-4o for real-time transcription, supports more than 100 languages, and suits a wide range of tasks such as writing, programming, and note-taking. Spokenly emphasizes privacy and offers a local processing mode in which voice data is never uploaded to the cloud. Users trigger voice input with a shortcut key, and the transcribed text is inserted directly at the cursor position. Whether for daily office work or professional writing, Spokenly lets users skip tedious typing and focus on the content.

 

Feature List

  • Real-time speech-to-text: Triggered by a shortcut key; speech is converted to text and inserted at the current cursor position.
  • Multi-language support: Supports more than 100 languages, including English, Spanish, and Chinese, with automatic language detection.
  • Local processing mode: Uses local Whisper models, so voice data never leaves the device.
  • Advanced cloud models: Supports cloud-based models such as GPT-4o for greater accuracy and speed.
  • Voice control for Mac: In Agent mode, performs operations such as opening applications and searching the web.
  • AI text optimization: Automatically corrects grammar, formats text, and can translate or rewrite content.
  • Transcription history: Saves all transcriptions, with search, playback, and export.
  • Video file transcription: Processes video files directly, extracting the audio and converting it to text.
  • Custom shortcuts: Users can assign a single key or a key combination to start voice input.

Usage Guide

Installation

  1. Download Spokenly: Visit the Mac App Store or the official website, spokenly.app, and click the download button. The app is only 2.9 MB, so it downloads quickly.
  2. Install the application: Once the download is complete, open the installation package and follow the prompts. After installation, the app appears in the macOS menu bar.
  3. Grant permissions: On first launch, Spokenly prompts you to grant access to the microphone and to Accessibility features. Enable both under "System Settings > Privacy & Security" so that voice input and cross-application text insertion work properly.
  4. Set the shortcut: Open Spokenly and go to the Settings screen. The default shortcut is the Right Command key (⌘). You can change it to a single key such as F15 or to a key combination; choose one that does not conflict with other applications.

Usage

1. Real-time speech-to-text

  • Start transcription: Place the cursor in any text input field (e.g. browser, email, code editor) and press the configured shortcut key (Right Command by default). A transcription window appears on screen.
  • Start talking: Speak into the microphone. Spokenly converts your speech to text in real time and displays it in the window. When you finish, press the shortcut key again and the text is inserted at the cursor position.
  • Select a model: In the Voice Model settings, choose either a local Whisper model (privacy first) or a cloud model (e.g. GPT-4o, requires an Internet connection). The local model suits offline environments, while the cloud model is more accurate.
  • Handle punctuation: Cloud models (e.g. Whisper Large v3) add punctuation automatically. Local Whisper models do not recognize spoken punctuation directly, but AI text optimization can fill the gap. For example, with the AI prompt set to "Turn 'exclamation mark' into '!'", saying "Hi exclamation mark" produces "Hi!".
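
The punctuation workaround above can be illustrated with a small rule-based sketch. This is not how Spokenly or its AI prompts work internally; it is a plain-Python stand-in that performs the same kind of substitution. The phrase table `SPOKEN_PUNCTUATION` and the function name are assumptions made for this example.

```python
import re

# Hypothetical table of spoken phrases and the marks they stand for.
SPOKEN_PUNCTUATION = {
    "exclamation mark": "!",
    "question mark": "?",
    "comma": ",",
    "period": ".",
}


def apply_spoken_punctuation(text: str) -> str:
    """Replace spoken punctuation phrases with the marks they name."""
    # Try longer phrases first so multi-word phrases win over shorter ones.
    phrases = sorted(SPOKEN_PUNCTUATION, key=len, reverse=True)
    pattern = re.compile(
        r"\s*\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        flags=re.IGNORECASE,
    )
    # The leading \s* swallows the space before the phrase, so the mark
    # attaches directly to the preceding word.
    return pattern.sub(lambda m: SPOKEN_PUNCTUATION[m.group(1).lower()], text)


print(apply_spoken_punctuation("Hi exclamation mark"))  # prints "Hi!"
```

A fixed table like this also rewrites ordinary uses of words such as "period", which is one reason a prompt-driven model that can weigh context may be the better fit in practice.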

2. Multilingual support and automatic detection

  • Spokenly supports more than 100 languages, including English, Chinese, and Spanish. There is no need to select a language manually; the app detects the spoken language automatically.
  • Procedure: Select "Automatic Language Detection" in the settings. When you start to speak, the system identifies the language from the speech and transcribes it. For example, sentences that mix English and Chinese can be recognized correctly.
  • Note: Language recognition quality varies by model. Cloud models (such as ElevenLabs Scribe) perform better in multilingual scenarios, while local models may be less accurate on rare languages.

3. Voice-controlled Mac (Agent mode)

  • Enable Agent mode: Switch to Agent mode in Settings. This mode interprets your speech as commands that control the Mac.
  • Common commands:
    • "Open Safari": Launches the Safari browser.
    • "Search Google Weather": Searches for weather information in your default browser.
    • "Run terminal command: display system information": Executes a terminal command.
  • Custom commands: Add trigger phrases and actions in the Quick Commands tab. For example, set "Open Lifehacker" as a trigger phrase linked to the Lifehacker website URL, and the page opens every time you say "Open Lifehacker".
  • Note: Complex commands require clear speech; avoid vague wording. Shortcuts with parameters are planned for future versions.
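
The idea behind custom commands, a trigger phrase mapped to an action, can be sketched as a simple lookup. This is an illustrative model only, not Spokenly's Quick Commands implementation: the table `QUICK_COMMANDS`, the helper names, and the normalization rules are all assumptions.

```python
import webbrowser
from typing import Optional

# Hypothetical quick-command table: normalized trigger phrase -> URL to open.
QUICK_COMMANDS = {
    "open lifehacker": "https://lifehacker.com",
}


def normalize(phrase: str) -> str:
    """Lowercase, trim surrounding punctuation, and collapse whitespace."""
    return " ".join(phrase.lower().strip(" .!?").split())


def resolve_command(spoken: str) -> Optional[str]:
    """Return the URL bound to a spoken trigger phrase, or None if unknown."""
    return QUICK_COMMANDS.get(normalize(spoken))


def run_command(spoken: str) -> bool:
    """Open the bound URL in the default browser; report whether one matched."""
    url = resolve_command(spoken)
    if url is None:
        return False
    webbrowser.open(url)
    return True
```

Normalizing before the lookup means small transcription differences, such as a trailing period or changed capitalization, still match the stored phrase.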

4. AI text optimization

  • Set AI prompts: Enter custom instructions in the AI Prompts settings, such as "Translate text to Spanish" or "Correct grammar and format as a formal email".
  • Workflow: After recording your voice, select the AI Prompt shortcut and the system processes the transcribed text according to the instruction. For example, if you say "Meeting tomorrow at 9:00" and apply the "Format as formal email" prompt, the output might be "Dear Colleague, the meeting is scheduled for tomorrow at 9:00 a.m.".
  • Applicable scenarios: Quickly generating professional documents, translating multilingual content, or polishing drafts.

5. Transcription history and export

  • View history: Click "History" in the app's main interface to view all transcripts. Keyword search is supported.
  • Playback and export: Select a record and click "Playback" to listen to the original audio, or click "Export" to save it as a text file in .txt or .doc format.
  • Manage storage: With the local model, audio and text are stored on the Mac at ~/Library/Spokenly/Transcriptions. With the cloud model, audio is processed temporarily and not saved.
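
As a rough illustration of keyword search over stored transcripts, the sketch below scans a folder of text files. Only the directory path comes from the text above; the assumption that each transcript is a plain `.txt` file is mine and may not match Spokenly's actual on-disk format, and the function name is hypothetical.

```python
from pathlib import Path
from typing import List

# Path quoted in the text; a one-.txt-file-per-transcript layout is an assumption.
DEFAULT_DIR = Path.home() / "Library" / "Spokenly" / "Transcriptions"


def search_transcripts(keyword: str, directory: Path = DEFAULT_DIR) -> List[Path]:
    """Return the .txt files under `directory` whose text contains `keyword`."""
    if not directory.is_dir():
        return []
    needle = keyword.lower()
    matches = []
    for path in sorted(directory.rglob("*.txt")):
        # Case-insensitive substring match; undecodable bytes are skipped.
        content = path.read_text(encoding="utf-8", errors="ignore")
        if needle in content.lower():
            matches.append(path)
    return matches
```

Calling `search_transcripts("meeting")` would list matching files if the folder exists and holds text files, and returns an empty list otherwise.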

6. Video file transcription

  • Import video: In version 2.7.3 and later, open the "File" menu and select a video file (MP4, MOV, and other formats are supported).
  • Transcription process: The application extracts the audio automatically and converts it to text, then outputs the text to a specified text box or saves it as a file. This suits subtitle generation and organizing meeting records.
  • Performance tips: Large video files take longer to process, so a high-performance Mac is recommended.
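
The two stages described above, audio extraction followed by speech-to-text, can be reproduced outside the app as a minimal sketch. It assumes the `ffmpeg` command-line tool and the third-party `openai-whisper` Python package are installed; Spokenly's own pipeline is not documented here and may differ. The function names and the 16 kHz mono WAV settings are choices made for this example.

```python
import subprocess
from pathlib import Path
from typing import List


def ffmpeg_extract_command(video: Path, audio: Path) -> List[str]:
    """Build an ffmpeg command that saves the audio track as 16 kHz mono WAV."""
    return [
        "ffmpeg", "-y",      # overwrite the output file if it exists
        "-i", str(video),    # input video (e.g. MP4 or MOV)
        "-vn",               # drop the video stream
        "-ac", "1",          # downmix to mono
        "-ar", "16000",      # resample to 16 kHz
        str(audio),
    ]


def transcribe_video(video: Path, model_name: str = "base") -> str:
    """Extract the audio with ffmpeg, then transcribe it with openai-whisper."""
    import whisper  # third-party: pip install openai-whisper

    audio = video.with_suffix(".wav")
    subprocess.run(ffmpeg_extract_command(video, audio), check=True)
    model = whisper.load_model(model_name)
    return model.transcribe(str(audio))["text"].strip()
```

For example, `transcribe_video(Path("meeting.mov"))` writes `meeting.wav` next to the video and returns the transcript. Larger model names trade speed for accuracy, which matches the performance tip above.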

Notes

  • Network requirements: Local Whisper models need no network connection; cloud models require a stable one.
  • System requirements: macOS 12.0 or later. 8 GB or more of RAM is recommended for running the local model.
  • Privacy: In local mode, voice data is not uploaded. Cloud mode uses third-party services (e.g. OpenAI, Deepgram); audio is deleted immediately after processing and is not stored. Users can consult the privacy policies of those services.

Application Scenarios

  1. Quick note-taking
    • Scenario: In a meeting or classroom, users need to capture ideas or key points quickly. With Spokenly, they press a shortcut key and speak, and the text appears in the notes app at once, saving typing time. AI text optimization can then organize fragmented speech into structured notes.
  2. Programming and documentation
    • Scenario: Programmers and writers can enter code comments or lengthy articles by voice. Spokenly's multi-language support handles mixed-language content (e.g. English and Chinese).
  3. Multilingual communication
    • Scenario: Members of multinational teams use Spokenly to transcribe multilingual meetings in real time, or translate them into a target language with AI prompts, which makes it easier to prepare emails or chat messages.
  4. Accessibility assistance
    • Scenario: Spokenly's highly accurate transcription and customizable commands improve efficiency for users who rely on voice input rather than typing.

FAQ

  1. Is Spokenly completely free?
    • Spokenly's basic features are free, including local Whisper models and Apple's built-in transcription. Premium cloud models (e.g. GPT-4o) are currently free but may require a paid subscription in the future.
  2. How is voice data privacy ensured?
    • In local mode, data never leaves the Mac. Cloud mode uses third-party services that delete audio as soon as it is processed. Users can enable local mode to block network requests.
  3. What languages are supported?
    • More than 100 languages are supported, including English, Chinese, and Spanish. Automatic language detection handles mixed-language speech, with results varying from model to model.
  4. How do I transcribe a video file?
    • Select a video in the "File" menu; the application extracts the audio and converts it to text. MP4 and MOV formats are supported, which suits subtitle generation and organizing records.
  5. Can it be used offline?
    • Yes. Local Whisper models support offline transcription but are slightly less accurate than cloud-based models. Make sure your Mac has enough free storage space.