
Voice cloning feature enables SongGen to mimic specific vocal characteristics

2025-09-05

SongGen integrates voiceprint encoding technology that can extract a speaker's timbre characteristics from as little as 3 seconds of reference audio. The technical implementation of this feature consists of two key components:

  • Voiceprint extraction: extracting speaker embedding vectors using an ECAPA-TDNN model
  • Feature fusion: aligning acoustic features with musical content representations in latent space
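
The feature fusion step above can be illustrated with a minimal sketch. It assumes, purely for illustration, that fusion is done by linearly projecting the speaker embedding into the content latent dimension and adding it to every content frame; the source does not state how SongGen actually performs the alignment. The names project and fuse, the toy dimensions, and the random weights are all hypothetical.

```python
import random

def project(vec, weights):
    """Linear projection; `weights` holds one row per output dimension."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def fuse(content_frames, speaker_embedding, weights):
    """Add the projected speaker embedding to every content frame.

    Assumption: additive conditioning broadcast over time. This is one
    common way to inject a speaker vector, not SongGen's confirmed method.
    """
    spk = project(speaker_embedding, weights)
    return [[c + s for c, s in zip(frame, spk)] for frame in content_frames]

# Toy sizes for readability; real embeddings and latents are much larger.
rng = random.Random(0)
EMB_DIM, LATENT_DIM, NUM_FRAMES = 4, 3, 5

# Stand-in for an embedding produced by a speaker encoder such as ECAPA-TDNN.
speaker_embedding = [rng.uniform(-1, 1) for _ in range(EMB_DIM)]
# Stand-in for a learned projection matrix (LATENT_DIM x EMB_DIM).
weights = [[rng.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(LATENT_DIM)]
# Stand-in for the musical content representation, one vector per frame.
content_frames = [[rng.uniform(-1, 1) for _ in range(LATENT_DIM)]
                  for _ in range(NUM_FRAMES)]

fused = fuse(content_frames, speaker_embedding, weights)
```

Because the same projected vector is added to each frame, the speaker identity stays constant over time while the content representation continues to vary frame by frame.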

In practice, the user can choose whether to separate the vocal track from the reference audio. When the separate parameter is set to True, the system first runs source separation so that accompaniment does not contaminate the cloned vocal features.
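
The control flow implied by the separate parameter can be sketched as follows. This is not SongGen's API: prepare_reference, toy_separator, and toy_embedder are hypothetical names, and the separator and embedder are trivial placeholders standing in for a real source separation model and a real speaker encoder.

```python
from typing import Callable, List, Sequence

Audio = Sequence[float]

def prepare_reference(
    audio: Audio,
    separate: bool,
    separate_vocals: Callable[[Audio], Audio],
    extract_embedding: Callable[[Audio], List[float]],
) -> List[float]:
    """Optionally isolate the vocal track, then extract the speaker embedding."""
    # When `separate` is True, run source separation first so the embedding
    # is computed from vocals only; otherwise use the reference audio as-is.
    vocal = separate_vocals(audio) if separate else audio
    return extract_embedding(vocal)

# Placeholder models (assumptions for illustration only).
def toy_separator(audio: Audio) -> List[float]:
    # Pretends half of each sample's amplitude is accompaniment.
    return [x * 0.5 for x in audio]

def toy_embedder(audio: Audio) -> List[float]:
    # Collapses the audio to a single mean value as a fake "embedding".
    return [sum(audio) / len(audio)]

reference = [1.0, 3.0]
with_separation = prepare_reference(reference, True, toy_separator, toy_embedder)
without_separation = prepare_reference(reference, False, toy_separator, toy_embedder)
```

Passing the separator and embedder in as callables keeps the sketch independent of any particular library; a clean a cappella reference would typically be passed with separate set to False to skip the extra step.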

This technology lets users have the generated song sung in a voice of their choice, making the result considerably more personal.
