Note the following when using SongGeneration:
- input prompt: Do not provide prompt_audio_path and descriptions at the same time; otherwise the two inputs may conflict and degrade generation quality.
- lyrics format: Lyrics must be divided into structural segments (e.g. [verse], [chorus]); non-lyric segments (such as [intro-short]) should not contain lyrics.
- reference audio: It is recommended to use the chorus of a song (10 seconds or less) for optimal musicality.
- hardware requirement: 10GB of GPU memory for the base model and 16GB with reference audio.
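
The input rules above can be checked before a request is sent to the model. The following is a minimal sketch, not part of SongGeneration itself: the function name check_request, the NON_LYRIC_TAGS set (which lists only the one tag named above), and the treatment of ";" as a segment separator are all assumptions made for illustration.

```python
import re

# Assumption: only the non-lyric tag named in the notes above is listed here;
# extend this set to match the tags the model actually documents.
NON_LYRIC_TAGS = {"intro-short"}

# Matches a structure tag such as [verse] and captures the name inside the brackets.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def check_request(lyrics, descriptions=None, prompt_audio_path=None):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    # Rule 1: descriptions and prompt_audio_path should not be given together.
    if descriptions and prompt_audio_path:
        problems.append("provide either descriptions or prompt_audio_path, not both")

    # With one capturing group, re.split returns
    # [text_before_first_tag, tag1, text1, tag2, text2, ...].
    parts = TAG_PATTERN.split(lyrics)

    # Rule 2: lyrics must be segmented with structure tags.
    if len(parts) == 1:
        problems.append("no structure tags such as [verse] or [chorus] found")
    elif parts[0].strip(" ;\n"):
        problems.append("lyrics found before the first structure tag")

    # Rule 3: non-lyric segments must be empty.
    # Whitespace and ';' are ignored (assumed separator characters).
    for tag, text in zip(parts[1::2], parts[2::2]):
        if tag in NON_LYRIC_TAGS and text.strip(" ;\n"):
            problems.append(f"[{tag}] is a non-lyric segment but contains lyrics")

    return problems
```

For example, check_request("[verse] hi", descriptions="pop", prompt_audio_path="ref.wav") reports the conflict between the two prompt inputs, while a request with only one of them and correctly tagged lyrics returns an empty list.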
This answer comes from the article "SongGeneration: open-source AI model for generating high-quality music and lyrics".