Installing and deploying Step1X-Edit involves three main steps: preparing the environment, downloading the models, and running a test:
- Base environment: Linux (Ubuntu 20.04+ recommended), Python 3.10+, and the CUDA 12.1 toolkit
- Video memory requirements: the standard edition requires 80GB of video memory (NVIDIA H800 class); the FP8-quantized edition reduces this to 16GB (suitable for an RTX 3090 Ti)
- Installation process:
  - Create a conda environment: conda create -n step1x python=3.10
  - Install PyTorch 2.3.1 and the dependent libraries
  - Optionally install Flash Attention to accelerate inference
- Model download: fetch the 24.9GB main model weights, the 335MB VAE model, and the Qwen-VL-7B multimodal model from Hugging Face
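The environment requirements listed above can be turned into a quick preflight check before installing anything. The following is a minimal sketch, not part of Step1X-Edit itself: the function name `preflight` and the `edition` labels are hypothetical, the thresholds are simply the figures quoted above, and the available video memory is passed in by hand rather than queried from the GPU.

```python
import platform
import sys

# Thresholds copied from the requirements listed above (assumed, not read from the project).
MIN_PYTHON = (3, 10)
VRAM_GB = {"standard": 80, "fp8": 16}


def preflight(system, python_version, gpu_mem_gb, edition="standard"):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    if system != "Linux":
        problems.append(f"Linux is required, found {system}")
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.10+ is required")
    needed = VRAM_GB[edition]
    if gpu_mem_gb < needed:
        problems.append(
            f"{edition} edition needs {needed}GB of video memory, found {gpu_mem_gb}GB"
        )
    return problems


if __name__ == "__main__":
    # gpu_mem_gb is supplied manually here; detecting it is left out of this sketch.
    for line in preflight(platform.system(), sys.version_info, gpu_mem_gb=24, edition="fp8"):
        print("WARNING:", line)
```

Returning a list of problems instead of raising on the first failure lets all unmet requirements be reported in a single pass.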
ComfyUI users can place the model weights in the specified directory and load them through a plugin. Note that video memory consumption differs by resolution: 512×512 requires 42GB of video memory (5 seconds to generate), while 1024×1024 requires 50GB (22 seconds to generate).
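The resolution figures above can be used to pick an output size for a given GPU. The sketch below is illustrative only: the helper `largest_resolution` is a hypothetical name, the table holds just the two data points quoted in the text, and the text does not say which edition those measurements apply to.

```python
# Reported figures from the text: side length -> (video memory in GB, seconds to generate).
REPORTED = {512: (42, 5), 1024: (50, 22)}


def largest_resolution(gpu_mem_gb):
    """Return the largest reported square resolution that fits in memory, or None."""
    fitting = [side for side, (mem_gb, _seconds) in REPORTED.items() if mem_gb <= gpu_mem_gb]
    return max(fitting) if fitting else None
```

For example, `largest_resolution(48)` selects 512, since only the 42GB configuration fits, while a card with less than 42GB yields `None` under these figures.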
This answer comes from the article "Step1X-Edit: An Open Source Tool for Editing Images with Natural Language Instructions".