Linly-Talker is an intelligent digital-human dialogue system developed and open-sourced by the Kedreamix team. It combines Large Language Models (LLMs) with visual models in a multimodal pipeline to create a highly realistic human-computer interaction experience.
Its core technology stack consists mainly of:
- Speech Processing Module: Whisper and FunASR for speech recognition, Microsoft TTS for speech synthesis
- Language Understanding Module: Conversation engine based on the Linly large language model
- Vision Generation Module: SadTalker for digital-human generation, with support for facial animation synthesis
- Voice Cloning Module: GPT-SoVITS for personalized voice cloning
- Real-Time Interaction Module: MuseTalk for low-latency dialogue responses
Together, these components let the system handle complex tasks such as dialogue based on an uploaded image, video subtitle generation, and multi-turn contextual dialogue, giving a more natural interaction experience than traditional dialogue systems.
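As a rough illustration of how the modules listed above could be chained, here is a minimal sketch: speech recognition feeds the language model, whose reply is synthesized to audio and then used to drive the talking-head video. Every name in it (TalkerPipeline, DialogueTurn, the stub_* functions) is hypothetical and is not part of Linly-Talker's actual API; the stubs only stand in for Whisper/FunASR, the Linly LLM, the TTS engine, and SadTalker/MuseTalk so that the data flow can be run without any models installed.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (user_text, reply_text) pairs from earlier turns


@dataclass
class DialogueTurn:
    user_text: str      # what the recognizer heard
    reply_text: str     # what the language model answered
    reply_audio: bytes  # synthesized speech for the answer
    video_path: str     # where the rendered talking-head clip would be stored


class TalkerPipeline:
    """Hypothetical orchestration: ASR -> LLM -> TTS -> talking-head video."""

    def __init__(
        self,
        asr: Callable[[bytes], str],
        llm: Callable[[str, History], str],
        tts: Callable[[str], bytes],
        avatar: Callable[[str, bytes], str],
    ) -> None:
        self.asr, self.llm, self.tts, self.avatar = asr, llm, tts, avatar
        self.history: History = []  # kept so later turns can use earlier context

    def respond(self, audio: bytes, portrait_path: str) -> DialogueTurn:
        user_text = self.asr(audio)
        # Pass a copy of the history so a stage cannot mutate the pipeline state.
        reply_text = self.llm(user_text, list(self.history))
        reply_audio = self.tts(reply_text)
        video_path = self.avatar(portrait_path, reply_audio)
        self.history.append((user_text, reply_text))
        return DialogueTurn(user_text, reply_text, reply_audio, video_path)


# Stub stages (assumptions for illustration only; real stages would call the models).
def stub_asr(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio bytes are already the transcript


def stub_llm(text: str, history: History) -> str:
    return f"[turn {len(history) + 1}] You said: {text}"


def stub_tts(text: str) -> bytes:
    return text.encode("utf-8")  # pretend the encoded text is a waveform


def stub_avatar(portrait_path: str, audio: bytes) -> str:
    return f"{portrait_path}.{len(audio)}-bytes.mp4"  # no rendering, just a fake path


if __name__ == "__main__":
    pipeline = TalkerPipeline(stub_asr, stub_llm, stub_tts, stub_avatar)
    print(pipeline.respond(b"hello", "face.png"))
```

Passing each stage in as a plain callable keeps the sketch independent of any particular model, which mirrors the way the system lets one recognizer, synthesizer, or animation backend be swapped for another.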
This answer comes from the article "Linly-Talker: An Intelligent Dialogue System for Digital People, Combining Big Language Modeling and Visual Modeling for a New Interactive Experience".