
Ichigo is an open-source, real-time speech AI project that aims to extend text-based language models with native "listening" capabilities. The project uses early-fusion techniques inspired by Meta's Chameleon paper. Ichigo aims to be an on-device voice assistant with open data and open weights, similar to Siri. The project is also open to partners who want to help crowdsource speech datasets.

Ichigo (llama3-s): a local, real-time voice AI assistant, an open-source take on Siri

 

Function List

  • Real-time speech recognition: Processes and understands the user's voice input in real time.
  • Multi-turn dialogue: Supports multiple rounds of dialogue and maintains context across a conversation.
  • Noise management: The model is trained to refuse non-speech audio input, which improves the user experience.
  • Open source and extensible: The project code and model weights are fully open source, and users are free to download and extend them.
  • Local deployment: Supports deployment on local devices to protect user privacy.

 

Usage Guide

Installation process

  1. Environment setup:
    • Ensure that Python 3.8 or later is installed.
    • Install the required dependencies (run this from the repository root, after cloning in step 2): pip install -r requirements.txt
  2. Download the project:
    • Use the following commands to clone the Ichigo repository and install it in editable mode:
      git clone https://github.com/homebrewltd/ichigo.git
      cd ichigo
      pip install -e .
      
  3. Configure the dataset:
    • Download the required dataset from HuggingFace and set the dataset path in the configuration file.
  4. Launch the demo:
    • Start the local Gradio demo with the following command:
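      python demo.py --use-4bit
    • The two flags select different quantization levels, so pass only one of them; use --use-8bit in place of --use-4bit to load the model in 8-bit.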
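
Before running the steps above, it can help to confirm that the environment meets the stated requirements. The following is a minimal sketch, not part of the Ichigo project: the function name preflight is hypothetical, and it checks only the two things this guide mentions, namely Python 3.8 or later and a requirements.txt file in the cloned repository.

```python
import sys
from pathlib import Path

MIN_PYTHON = (3, 8)  # minimum version stated in the installation steps


def preflight(repo_dir, version=None):
    """Return a list of problems found; an empty list means nothing was flagged.

    `version` defaults to the running interpreter and can be overridden
    (e.g. in tests) with a (major, minor) tuple.
    """
    version = tuple(version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        # %-formatting is used so the check itself runs on old interpreters.
        problems.append(
            "Python %d.%d found, but %d.%d or later is required"
            % (version + MIN_PYTHON)
        )
    repo = Path(repo_dir)
    if not (repo / "requirements.txt").is_file():
        problems.append("requirements.txt not found in %s" % repo)
    return problems


# Example: check the current directory and print anything that was flagged.
for issue in preflight("."):
    print("WARNING:", issue)
```

Run it from the directory that contains the cloned repository, or pass the repository path to preflight. No output means neither check failed.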
      

Usage Process

  1. Start the service:
    • After running the launch command, open the URL printed locally to access Ichigo's Web UI.
  2. Voice input:
    • In the Web UI, click the microphone icon to start recording; the system processes the audio and displays the speech recognition results in real time.
  3. Multi-turn dialogue:
    • The system supports multiple rounds of dialogue: the user can keep speaking, and the system maintains the context in order to understand and respond.
  4. Noise management:
    • The system is trained to recognize non-speech audio input and refuse to process it, which keeps the recognition results accurate.
  5. Custom extensions:
    • Users can modify the code and model as needed to add new features or improve existing ones.
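
To illustrate what "maintaining context" across rounds of dialogue means in practice, here is a minimal sketch of a bounded conversation history. It is an illustration only, not Ichigo's implementation: the Conversation class, the max_turns parameter, and the plain-text prompt format are all assumptions, and this article does not describe how Ichigo itself stores or truncates context.

```python
from collections import deque


class Conversation:
    """Keeps the most recent turns so each reply can see earlier context."""

    def __init__(self, max_turns=8):
        # Each turn is one (role, text) pair; the oldest turns drop off first.
        self._turns = deque(maxlen=max_turns)

    def add(self, role, text):
        if role not in ("user", "assistant"):
            raise ValueError("unknown role: %r" % role)
        self._turns.append((role, text))

    def as_prompt(self):
        """Flatten the retained turns into a single prompt string."""
        return "\n".join("%s: %s" % (role, text) for role, text in self._turns)


# Example: the follow-up question only makes sense with the earlier turns.
chat = Conversation(max_turns=4)
chat.add("user", "What is the capital of France?")
chat.add("assistant", "Paris.")
chat.add("user", "How many people live there?")
print(chat.as_prompt())
```

Capping the history with a fixed-length deque is one simple way to keep the prompt bounded; a real system would more likely budget by token count than by number of turns.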

Detailed Operation Procedure

  1. Download and installation:
    • Visit Ichigo's GitHub page and follow the installation process to download and install the necessary dependencies and models.
  2. Configuration and startup:
    • In the configuration file provided by the project, set the dataset path and model parameters, then start the local service.
  3. Using the Web UI:
    • Try Ichigo's real-time speech recognition and multi-turn dialogue features by speaking to it through the Web UI.
  4. Extension and customization:
    • Read the project documentation and code comments to understand the system's architecture and how it works before writing custom extensions.