WeClone is an open-source project that uses WeChat chat transcripts and voice messages as training data, combining a large language model with speech synthesis to create highly personalized digital doppelgangers. The project fine-tunes the language model on a user's chatting habits and can generate a voice clone with a reported similarity of up to 95% from a voice sample of only 5 seconds.
On the implementation side, WeClone uses ChatGLM3-6B as its default base large language model and supports LoRA fine-tuning to adapt that model to the user's chat data. Speech cloning uses a dedicated model with 0.5B parameters. The project also provides a complete data-processing pipeline, including a chat-log preprocessing tool that filters sensitive information by default.
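To make the sensitive-information filtering step concrete, here is a minimal sketch of how such a preprocessing pass could work. It is not WeClone's actual code: the pattern list and the function names `is_sensitive` and `filter_messages` are assumptions chosen for illustration, and the real tool may detect different kinds of data.

```python
import re

# Hypothetical patterns for illustration; the project's own filter may differ.
# Digit lookarounds are used instead of \b so that matches still work when a
# number directly follows Chinese characters.
SENSITIVE_PATTERNS = [
    re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),       # mainland China mobile numbers
    re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)"),      # 18-character resident ID numbers
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),   # email addresses
]


def is_sensitive(text):
    """Return True if the message matches any sensitive pattern."""
    return any(pattern.search(text) for pattern in SENSITIVE_PATTERNS)


def filter_messages(messages):
    """Drop every message that contains sensitive information."""
    return [message for message in messages if not is_sensitive(message)]


if __name__ == "__main__":
    chat_log = [
        "see you at lunch",
        "call me at 13812345678 tonight",
        "mail me: someone@example.com",
        "haha ok",
    ]
    for message in filter_messages(chat_log):
        print(message)
```

This sketch drops a whole message rather than masking the matched text, which keeps partially redacted sentences out of the fine-tuning data at the cost of losing some otherwise usable samples.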
The digital doppelganger can eventually be bound to a WeChat bot, enabling automated text and voice replies and providing users with a personalized AI interaction experience on the WeChat platform.
This answer comes from the article "WeClone: training digital doppelgangers with WeChat chats and voices".