To train a model with the Open R1 project, follow these steps:
- Environment configuration: Create a conda environment with Python 3.11 and activate it.
  conda create -n openr1 python=3.11
  conda activate openr1
- Dependency installation: Install vLLM, then the project's own dependencies (run from the repository root).
  pip install vllm==0.6.6.post1
  pip install -e ".[dev]"
- Account login: Log in to your Hugging Face and Weights & Biases accounts.
  huggingface-cli login
  wandb login
- Model training: Train with the provided scripts.
  - GRPO training:
    python src/open_r1/grpo.py --dataset <dataset_path>
  - SFT training:
    python src/open_r1/sft.py --dataset <dataset_path>
The project also supports multi-stage training: you can start from a base model and move in stages to a model tuned with reinforcement learning.
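A multi-stage run could be sketched as the dry-run shell script below. It only prints the two training commands in order and executes nothing. The stage order (SFT first, then GRPO) is an assumption, as are the helper name `build_train_cmd` and the `DATASET_PATH` variable, neither of which is part of Open R1. The source does not say how a checkpoint is handed from one stage to the next, so that step is left out.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the training commands in stage order, runs nothing.
set -euo pipefail

# Hypothetical helper (not part of Open R1): builds the command line for one stage.
build_train_cmd() {
  local stage="$1" dataset="$2"
  case "$stage" in
    sft|grpo)
      # Script paths and the --dataset flag are taken from the steps above.
      printf 'python src/open_r1/%s.py --dataset %s\n' "$stage" "$dataset"
      ;;
    *)
      echo "unknown stage: $stage" >&2
      return 1
      ;;
  esac
}

# Assumption: supervised fine-tuning runs first, GRPO second.
# DATASET_PATH is a placeholder variable; it defaults to the literal <dataset_path>.
DATASET_PATH="${DATASET_PATH:-<dataset_path>}"
for stage in sft grpo; do
  build_train_cmd "$stage" "$DATASET_PATH"
done
```

Printing the commands first makes it easy to check the order and arguments before running each stage by hand.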
This answer comes from the article "Open R1: Hugging Face Replicates the Training Process of DeepSeek-R1".































