Demucs v4 is a milestone release that advances music source separation by adopting a Hybrid Transformer architecture. The architecture combines the Transformer's self-attention mechanism with conventional convolutional neural networks, which improves feature extraction and brings separation accuracy to industry-leading levels.
Key features include:
- Six-stem separation: vocals, drums, bass, guitar, piano, and "other" (everything else in the accompaniment)
- Model choice: htdemucs_ft is the fine-tuned model with the highest accuracy; htdemucs_6s is the experimental model that provides six-stem separation
- Processing time: separation takes about 1.5 times the duration of the source audio, roughly 30% more efficient than v3
- GPU memory: more than 3 GB of GPU memory is recommended; the --segment parameter can be used to reduce memory consumption
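The model and memory options above map onto the demucs command line. The following is a minimal sketch that only assembles the command, assuming the demucs CLI is installed and on the PATH. The helper names `build_demucs_command` and `run_demucs`, the file name `song.mp3`, and the segment value of 7 seconds are illustrative assumptions, not part of Demucs itself; the flags `-n`, `--segment`, and `--two-stems` are the ones the CLI documents.

```python
import shlex
import subprocess
from typing import List, Optional


def build_demucs_command(
    track: str,
    model: str = "htdemucs_ft",
    segment: Optional[int] = None,
    two_stems: Optional[str] = None,
) -> List[str]:
    """Assemble a demucs command line without running it (hypothetical helper)."""
    cmd = ["demucs", "-n", model]
    if segment is not None:
        # Shorter segments lower peak GPU memory; the value is in seconds.
        cmd += ["--segment", str(segment)]
    if two_stems is not None:
        # Split into the named stem plus everything else.
        cmd += ["--two-stems", two_stems]
    cmd.append(track)
    return cmd


def run_demucs(cmd: List[str]) -> None:
    """Run the assembled command; requires demucs to be installed."""
    subprocess.run(cmd, check=True)


# Six-stem model with a reduced segment length (both values are assumptions).
six_stem_cmd = build_demucs_command("song.mp3", model="htdemucs_6s", segment=7)
print(shlex.join(six_stem_cmd))
```

Calling `run_demucs(six_stem_cmd)` would start the actual separation. If the GPU still runs out of memory, a smaller segment value is the first thing to try.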
This version is particularly well suited to professional music production, where individual instrument tracks can be extracted precisely for remixing.
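For remix work, the next step is usually to pick up one stem from the output folder. The sketch below computes where each stem is expected to land, assuming the default output layout of separated/MODEL_NAME/TRACK_NAME/ and WAV output. The helper `expected_stem_paths` and the file name `song.mp3` are hypothetical, and the paths will differ if a custom output directory or file format is chosen.

```python
from pathlib import Path
from typing import Dict

# Stem names produced by the six-stem model.
SIX_STEMS = ("drums", "bass", "other", "vocals", "guitar", "piano")


def expected_stem_paths(
    track: str,
    model: str = "htdemucs_6s",
    out_dir: str = "separated",
) -> Dict[str, Path]:
    """Map each stem name to its expected output file (assumed default layout)."""
    track_name = Path(track).stem  # "song.mp3" -> "song"
    return {
        stem: Path(out_dir) / model / track_name / f"{stem}.wav"
        for stem in SIX_STEMS
    }


paths = expected_stem_paths("song.mp3")
print(paths["guitar"].as_posix())
```

The function only builds paths and does not check that the files exist, so it can be used before or after a separation run.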
This answer comes from the article "Demucs: free open source tool for separating music tracks".