Hardware Requirements in Detail
According to the official documentation, LatentSync hardware requirements fall into two scenarios, inference and training:
Basic Inference Configuration
- GPU: NVIDIA graphics card (CUDA support required) with ≥ 6.8GB of VRAM (e.g. RTX 3060)
- Operating system: Linux or Windows (on Windows the scripts need manual adjustment)
- Software environment: Python 3.10 + Git + PyTorch (with CUDA support)
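The inference minimums above can be checked before installing anything else. The following is a minimal sketch, not part of LatentSync: the helper names `detect_vram_gb` and `meets_inference_minimum` are hypothetical, the check treats "Python 3.10" as an exact version match, and it assumes PyTorch reports device memory in bytes (so the figure is in GiB and will read slightly lower than the card's marketed capacity).

```python
import sys

# Figures quoted in the list above; everything else here is illustrative.
MIN_INFERENCE_VRAM_GB = 6.8
REQUIRED_PYTHON = (3, 10)


def detect_vram_gb():
    """Return total memory of CUDA device 0 in GiB, or None if it cannot be read."""
    try:
        import torch  # optional: only needed for auto-detection
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


def meets_inference_minimum(vram_gb, python_version=sys.version_info[:2]):
    """True if the Python version and VRAM figure both match the list above."""
    if vram_gb is None:
        return False  # no CUDA-capable GPU detected
    version_ok = tuple(python_version) == REQUIRED_PYTHON
    return version_ok and vram_gb >= MIN_INFERENCE_VRAM_GB


if __name__ == "__main__":
    vram = detect_vram_gb()
    print("Detected VRAM (GiB):", vram)
    print("Meets inference minimum:", meets_inference_minimum(vram))
```

Passing the VRAM figure in explicitly keeps the check usable on a machine without PyTorch installed, for example when evaluating a card before buying it.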
Advanced Training Configuration
- Recommended GPU: RTX 3090 (24GB) or better
- VRAM requirements:
  - stage1.yaml configuration: 23GB
  - stage2_efficient.yaml configuration: 20GB (best value for money)
  - Full stage2.yaml configuration: 30GB (for professional users)
- Storage space: 10GB+ to store models and training data
*Note: Actual requirements will vary with video resolution (default 256×256) and processing time.
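To make the training figures concrete, the sketch below lists which training configurations fit a given amount of VRAM. The function name `configs_that_fit` is hypothetical and the table simply restates the numbers quoted above; as the note says, real usage depends on resolution and other settings, so treat the result as a rough guide.

```python
# VRAM figures (GB) as quoted in the training list above.
TRAINING_VRAM_GB = {
    "stage1.yaml": 23,
    "stage2_efficient.yaml": 20,
    "stage2.yaml": 30,
}


def configs_that_fit(vram_gb):
    """Return the config names whose quoted VRAM requirement is within vram_gb."""
    return sorted(
        name for name, needed in TRAINING_VRAM_GB.items() if vram_gb >= needed
    )


if __name__ == "__main__":
    # A 24GB card such as the RTX 3090 covers stage1 and stage2_efficient only.
    print(configs_that_fit(24))
```

On the quoted figures, a 24GB card fits `stage1.yaml` and `stage2_efficient.yaml` but not the full `stage2.yaml`, which is consistent with the RTX 3090 being the recommended starting point.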
This answer comes from the article "LatentSync: an open source tool for generating lip-synchronized video directly from audio".