Multi-GPU acceleration configuration method
DiffSynth-Engine supports multi-GPU acceleration through tensor parallelism. The main steps are:
- Make sure the system has multiple GPUs installed (A100 recommended)
- Pass the parallelism parameter during pipeline initialization to set the number of GPUs to use
- Set use_cfg_parallel=True to also run the classifier-free guidance (CFG) computation in parallel
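To illustrate why the CFG step lends itself to parallel execution, here is a minimal pure-Python sketch. It is not DiffSynth-Engine code: `toy_model` is a hypothetical stand-in for one denoising forward pass, and `cfg_combine` applies the standard classifier-free guidance mix. The point is only that the conditional and unconditional passes do not depend on each other, so in principle they can run on separate GPUs.

```python
from typing import List, Optional

Vector = List[float]


def cfg_combine(cond: Vector, uncond: Vector, guidance_scale: float) -> Vector:
    """Standard CFG mix: uncond + scale * (cond - uncond), element-wise."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


def toy_model(latent: Vector, prompt: Optional[str]) -> Vector:
    """Hypothetical stand-in for one denoising forward pass (not a real model)."""
    shift = 1.0 if prompt else 0.0
    return [x + shift for x in latent]


latent = [0.0, 0.5]

# The two passes share no intermediate state, so each could be
# assigned to a different device and computed at the same time.
cond = toy_model(latent, "a cat")
uncond = toy_model(latent, None)

guided = cfg_combine(cond, uncond, guidance_scale=5.0)
print(guided)
```

How DiffSynth-Engine actually distributes these passes across devices is not described in this answer, so treat the sketch as background only.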
Measured example
For Wan 2.1 video generation:
- Single GPU (A100): 358 seconds to generate 2 seconds of video
- 4 GPUs in parallel: 114 seconds (3.14x speedup)
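The speedup figure follows directly from the two timings. As a quick check, the short sketch below recomputes it and also derives the parallel efficiency (speedup divided by GPU count). The function names are illustrative only.

```python
def speedup(baseline_seconds: float, parallel_seconds: float) -> float:
    """Ratio of single-GPU time to multi-GPU time."""
    return baseline_seconds / parallel_seconds


def parallel_efficiency(speedup_factor: float, num_gpus: int) -> float:
    """Fraction of the ideal (linear) speedup that was achieved."""
    return speedup_factor / num_gpus


s = speedup(358, 114)          # timings quoted above
e = parallel_efficiency(s, 4)  # 4 GPUs in the parallel run

print(f"speedup: {s:.2f}x, efficiency: {e:.1%}")
# prints "speedup: 3.14x, efficiency: 78.5%"
```

An efficiency of roughly 78.5% means about a fifth of the ideal 4x gain is lost in the multi-GPU run.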
Typical configuration code:
pipe = WanVideoPipeline.from_pretrained(config, parallelism=4, use_cfg_parallel=True)
Caveats
1. The parallelism parameter must match the number of GPUs available.
2. The speedup is sub-linear in the number of GPUs: in the example above, 4 GPUs gave 3.14x, not 4x.
3. Professional GPUs with at least 24 GB of video memory are recommended for best results.
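The first and third caveats can be checked before the pipeline is created. The sketch below is one possible pre-flight check, not part of DiffSynth-Engine: the helper names are hypothetical, and it assumes that "match" means the requested parallelism must not exceed the GPUs visible to the process. The GPU count and memory sizes are passed in as plain values so the example runs without any GPU library.

```python
from typing import List


def check_parallelism(parallelism: int, visible_gpus: int) -> None:
    """Raise ValueError if the requested parallelism cannot be satisfied.

    Assumption: parallelism must be at least 1 and no larger than the
    number of GPUs visible to the process.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    if parallelism > visible_gpus:
        raise ValueError(
            f"parallelism={parallelism} but only {visible_gpus} GPU(s) are visible"
        )


def gpus_below_recommended(memory_gb: List[float], minimum_gb: float = 24.0) -> List[int]:
    """Return the indices of GPUs with less memory than the recommended minimum."""
    return [i for i, gb in enumerate(memory_gb) if gb < minimum_gb]


# Example values; with PyTorch installed these could come from
# torch.cuda.device_count() and
# torch.cuda.get_device_properties(i).total_memory.
check_parallelism(parallelism=4, visible_gpus=4)   # passes silently
small = gpus_below_recommended([80.0, 80.0, 16.0, 80.0])
print(small)
```

Failing early with a clear message is usually easier to debug than an error raised partway through model loading.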
This answer comes from the article "DiffSynth-Engine: Open Source Engine for Low-Existing Deployments of FLUX, Wan 2.1".































