Overseas access: www.kdjingpai.com
Bookmark Us

wukong-robot is an open source Chinese voice conversation robot and smart speaker project, designed to help developers quickly build personalized smart speakers. It supports Chinese speech recognition, speech synthesis and multi-round dialog features , integrated with ChatGPT, Baidu, KDDI and other technologies. The project is designed to be modular, with plug-ins and features that can be freely extended, and is suitable for running on Raspberry Pi, Mac and Linux systems. As of March 2023, wukong-robot has been installed over 13,000 times and woken up over 700,000 times. It may also be the first open source smart speaker project to support brain-computer interaction for tech enthusiasts and developers to customize a personalized voice assistant.

 

Function List

  • Support Chinese speech recognition, integrated with Baidu, KDXF, Ali, Tencent, OpenAI Whisper and many other technologies.
  • Provides speech synthesis (TTS), supports VITS sound cloning, Microsoft Edge, and other technologies.
  • Support multi-round dialog, access to ChatGPT, Turing bot and other online dialog systems.
  • Modular design, functional plug-ins independently maintained, developers can easily develop and integrate new plug-ins.
  • It supports offline waking, using the Porcupine and Snowboy engines, as well as innovative waking methods such as brain-computer interaction and line-board shaking.
  • Provides a back-end administration side that supports remote conversations, configuration changes and log viewing.
  • Supports hardware platforms such as Raspberry Pi, easy to install and easy to maintain code.
  • It can be interfaced with smart home systems to control home appliances by voice.

Using Help

Installation process

wukong-robot provides a variety of installation methods, suitable for Raspberry Pi, Mac and Linux systems. The following is an example of the Docker installation process on a Raspberry Pi:

  1. Preparing the environment: Ensure that Docker is installed on the device. run the following command to install Docker:
    sudo apt update
    sudo apt install docker.io
    sudo systemctl start docker
    sudo systemctl enable docker
    

  1. Clone the installation script: Get the Raspberry Pi install script for wukong-robot from GitHub:
    git clone https://github.com/wzpan/wukong-robot-pi-installer.git
    cd wukong-robot-pi-installer
    sudo chmod +x pi_installer
    
  2. Run the installation script: Execute the install command and the script will automatically pull the wukong-robot Docker image and configure the environment:
    sudo ./pi_installer
    
  3. Configuration environment: The first time you run it, you will be prompted to create a configuration file in the user directory. Enter y establish ~/.wukong/config.yml. The configuration file contains speech recognition, synthesis and plug-in settings. It is recommended to refer to the default configuration file default.yml Make changes, but not directly default.yml, so as not to be overwritten by subsequent updates:
    cp static/default.yml ~/.wukong/config.yml
    nano ~/.wukong/config.yml
    
  4. Start wukong-robot: Run the following command in the project root directory to start it:
    python3 wukong.py
    
  5. Installation of plug-ins: wukong-robot supports third-party plugins, you need to clone the plugin repository and install the dependencies separately:
    cd ~/.wukong
    git clone https://github.com/wzpan/wukong-contrib.git contrib
    pip3 install -r contrib/requirements.txt
    

Main Functions

  • awaken by voice: The default wake word is snowboy, which can be customized through the configuration file. After waking up, the system enters recording mode and waits for the user's voice command. For example, saying "snowboy, turn on the light" can trigger the smart home plug-in.
  • Voice Interaction ProcessThe user's voice is converted into text by ASR (speech recognition), parsed by NLU (natural language understanding) and then processed by the matching plug-in, and finally the result is output by TTS (speech synthesis). For example, if you say "What's the weather like today", the weather plug-in will return a voice reply.
  • Backend Management: Upon startup, access the http://<设备IP>:5001 Enter the management interface to remotely send commands, view logs, or modify configurations. For example, adjusting the wake word or switching the voice recognition engine.
  • Plug-in use: Plug-ins such as Echo.py Repeatable user-entered text, suitable for testing. Installation of third-party plug-ins (e.g. Baidu FM or weather plug-ins) is required in the config.yml Configure the API key in. For example, configure the weather plugin:
    weather:
    enable: true
    key: '心知天气 API Key'
    

Featured Function Operation

  • ChatGPT Integration: By configuring the openai_api_key Access to ChatGPT with multi-round dialog support. Requires the use of the config.yml Set in:
    openai:
    api_key: 'your_openai_api_key'
    model: 'gpt-3.5-turbo'
    

    To use it, say "snowboy, talk to me about AI" to enter multi-round conversation mode.

  • brain-computer interaction: wukong-robot supports waking up by Muse brain-computer device, you need to configure the brain-computer device and enable related plug-ins. After running, it will trigger wakeup by brainwave signal, suitable for experimental applications.
  • Smart Home Control: Controls the appliance via the HASS (Home Assistant) plug-in. Configuration config.yml You can control the device with voice commands such as "Turn on the living room light" after using the HASS parameter in the HASS section.

caveat

  • Ensure that the microphone and speakers are working properly and test the recording and playback functions:
    arecord -d 5 test.wav
    aplay test.wav
    
  • If using a Raspberry Pi, the sound card needs to be properly configured (e.g. ReSpeaker 2 Mics). Refer to .asoundrc Configuration:
    pcm.!default {
    type asym
    playback.pcm { type plug slave.pcm "hw:1,0" }
    capture.pcm { type plug slave.pcm "hw:1,0" }
    }
    ctl.!default { type hw card 1 }
    
  • Check the logs or visit the GitHub Issues page (e.g. #355, #353) for community help if you encounter problems.

application scenario

  1. DIY Smart Speaker
    Developers can use wukong-robot to build personalized smart speakers on the Raspberry Pi with customized wake words and voice interaction for home use or to showcase technology projects.
  2. Smart Home Control
    By accessing the HASS plug-in, wukong-robot can control lights, air conditioners and other devices by voice, making it suitable for creating a smart home hub.
  3. Education and Research
    Students and researchers can learn speech recognition, NLP, and plug-in development with its modular design, suitable for AI teaching or lab projects.
  4. brain-computer interaction experiment
    It supports brain-computer wake-up function, which is suitable for neuroscience or HCI researchers to explore new interaction methods.

QA

  1. How do I resolve the "snowboy API closed" issue?
    Snowboy API is discontinued in December 2020, need to switch to Porcupine or another offline wake engine. Modifications config.yml The recommended way to configure the wakeup engine is to use Porcupine.
  2. Why won't the microphone wake up?
    Check that the sound card is configured correctly by running arecord -l Confirm the microphone device. Make sure that the .asoundrc Configure it correctly, or refer to GitHub Issue #57 to adjust the ReSpeaker settings.
  3. How do I add a new plugin?
    clone (loanword) wukong-contrib repository, install the dependencies and copy the plugin to the ~/.wukong/contrib directory, and in the config.yml Enable it in Baidu FM Plugin. For example, to add the Baidu FM plugin, you need to configure the baidufm Parameters.
  4. How do I secure my API keys?
    Don't put openai_api_key and other sensitive information is printed or written directly to the log. Setting config.yml File permissions are read only by the current user (chmod 600 config.yml). Refer to Issue #317 to fix potential security issues.
0Bookmarked
0kudos

Recommended

Can't find AI tools? Try here!

Just type in the keyword Accessibility Bing SearchYou can quickly find all the AI tools on this site.

inbox

Contact Us

Top

en_USEnglish