The following technical limitations should be noted when using SongGeneration:
- Hardware requirements: The base model requires at least 10GB of GPU memory, or 16GB when using reference audio
- Input limits: Avoid providing reference audio and a text description at the same time, as combining them may degrade generation quality
- Lyrics format: Lyrics must be segmented according to the standard structure of [intro-short], [verse], [chorus], etc. Non-lyric segments should not contain lyrics
- Reference audio: A 10-second audio clip of the song's chorus is recommended for best results
- Commercial license: The current model is licensed under CC BY-NC 4.0 (non-commercial); seek legal advice before any commercial use
Following these limits helps ensure the quality and usability of the generated music.
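The limits above can be checked before a generation run. The following is a minimal sketch, not part of SongGeneration itself: `check_request`, its parameters, the one-segment-per-line lyrics layout, the tag sets, the treatment of [intro-short] as a non-lyric segment, and the 1-second tolerance around the 10-second clip length are all assumptions made for illustration.

```python
import re

# Assumption: tags taken from the list above; the real model may accept more.
LYRIC_TAGS = {"verse", "chorus"}
NON_LYRIC_TAGS = {"intro-short"}  # assumed here to be a non-lyric segment

SEGMENT_RE = re.compile(r"^\[([a-z-]+)\]\s*(.*)$")


def check_request(lyrics, reference_audio_s=None, description=None, gpu_gb=None):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # Input limit: reference audio and a text description should not be combined.
    if reference_audio_s is not None and description:
        problems.append("reference audio and text description both provided")

    # Hardware: 10 GB for the base model, 16 GB when reference audio is used.
    if gpu_gb is not None:
        needed = 16 if reference_audio_s is not None else 10
        if gpu_gb < needed:
            problems.append(f"GPU memory {gpu_gb} GB is below the required {needed} GB")

    # Reference audio: about 10 seconds is recommended (tolerance is an assumption).
    if reference_audio_s is not None and abs(reference_audio_s - 10) > 1:
        problems.append("reference clip is not about 10 seconds long")

    # Lyrics: hypothetical layout with one "[tag] text" segment per line.
    for number, line in enumerate(lyrics.strip().splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines between segments
        match = SEGMENT_RE.match(line.strip())
        if match is None:
            problems.append(f"line {number}: missing [tag] prefix")
            continue
        tag, text = match.groups()
        if tag not in LYRIC_TAGS | NON_LYRIC_TAGS:
            problems.append(f"line {number}: unknown tag [{tag}]")
        elif tag in NON_LYRIC_TAGS and text:
            problems.append(f"line {number}: [{tag}] should not contain lyrics")
    return problems
```

Calling `check_request(lyrics, reference_audio_s=10, gpu_gb=12)` would report the memory shortfall, since reference audio raises the requirement to 16GB. The check only reports problems and leaves the decision to proceed to the caller, because most of these limits are recommendations that affect quality and do not block generation outright.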
This answer comes from the article "SongGeneration: open-source AI model for generating high-quality music and lyrics".