How do you deal with insufficient GPU memory during Search-R1 training?

2025-08-27

Technical solutions for coping with limited GPU memory

Search-R1 provides the following solutions to GPU memory limits:

LoRA fine-tuning:
- Cuts GPU memory use by about 70% by training only the adapter-layer parameters
- Set the --use_lora true parameter in train_ppo.sh
Gradient checkpointing:
- Lowers GPU memory requirements by trading compute time for memory
- Set gradient_checkpointing=True
Mixed-precision training:
- Uses mixed FP16/FP32 precision
- Enable it in the configuration file with fp16: true
Batch size tuning:
- Adjust the per_device_train_batch_size parameter
- Start with a value of 4 and adjust it according to the available GPU memory
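
To see why training only adapter parameters saves memory, here is a back-of-the-envelope sketch in Python. It rests on stated assumptions rather than on Search-R1's actual accounting: mixed-precision Adam is taken to cost about 16 bytes per trainable parameter (FP16 weight and gradient plus FP32 master weight, momentum, and variance), frozen weights cost 2 bytes each, and activations are ignored. The 3B parameter count and the 0.5% adapter budget are illustrative values, so the result is not expected to reproduce the 70% figure above.

```python
# Rough estimate of training-state memory (activations are NOT included).
BYTES_FP16 = 2
# Assumption: per trainable parameter, mixed-precision Adam keeps an fp16 weight (2),
# an fp16 gradient (2), and fp32 master weight + momentum + variance (12).
BYTES_PER_TRAINABLE = 16


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    return rank * (d_in + d_out)


def training_state_gb(total_params: int, trainable_params: int) -> float:
    """Memory for weights, gradients, and optimizer state, in GB (1e9 bytes)."""
    frozen = total_params - trainable_params
    total_bytes = frozen * BYTES_FP16 + trainable_params * BYTES_PER_TRAINABLE
    return total_bytes / 1e9


total = 3_000_000_000  # illustrative 3B-parameter model
full_gb = training_state_gb(total, total)

# Hypothetical adapter budget: adapters add 0.5% extra parameters, all trainable.
adapter = total // 200
lora_gb = training_state_gb(total + adapter, adapter)

print(f"full fine-tuning: {full_gb:.1f} GB, LoRA: {lora_gb:.1f} GB")
```

The gap comes almost entirely from gradients and optimizer state, which exist only for trainable parameters. Activation memory, which gradient checkpointing and a smaller batch size target, has to be budgeted separately.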

Emergency measures:

- Use an A100 instance on Colab Pro+ (40 GB of GPU memory)
- Split the network layers across devices with model parallelism
- For the Llama3-3B model, the recommended minimum configuration is 24 GB of GPU memory
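
The idea of splitting network layers across devices can be sketched as a simple partitioning step. The helper below is hypothetical and is not part of Search-R1. It only decides which contiguous block of layers each GPU would hold, and the 28-layer, 3-GPU figures are illustrative assumptions. Actually placing the layers and moving activations between devices is left to the training framework.

```python
def split_layers(num_layers: int, num_devices: int) -> list[range]:
    """Assign contiguous, near-equal blocks of layer indices to each device."""
    if num_layers < 1 or num_devices < 1:
        raise ValueError("num_layers and num_devices must be positive")
    base, extra = divmod(num_layers, num_devices)
    blocks, start = [], 0
    for device in range(num_devices):
        # The first `extra` devices each take one layer of the remainder.
        size = base + (1 if device < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


# Illustrative numbers only: 28 layers spread over 3 GPUs.
for device, block in enumerate(split_layers(28, 3)):
    print(f"cuda:{device} -> layers {block.start}..{block.stop - 1}")
```

Contiguous blocks keep cross-device traffic to one activation hand-off per boundary, which is why this layout is a common starting point for layer-wise model parallelism.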

Note: Use the nvidia-smi command to monitor GPU memory usage in real time.
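
For logging memory from inside a training script, one option is to call nvidia-smi from Python. This is a minimal sketch: the function names are made up for illustration, and the sample values are invented so the parsing can be shown without a GPU. The query flags are standard nvidia-smi options that report used and total memory in MiB, one line per GPU.

```python
import subprocess


def parse_memory_csv(text: str) -> list[tuple[int, int]]:
    """Parse 'used, total' lines (MiB) into (used, total) tuples, one per GPU."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory() -> list[tuple[int, int]]:
    """Ask nvidia-smi for per-GPU memory; needs an NVIDIA driver and nvidia-smi on PATH."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_memory_csv(out)


# Sample text in the shape nvidia-smi prints for the query above (values are made up).
sample = "18432, 40960\n512, 40960\n"
for index, (used, total) in enumerate(parse_memory_csv(sample)):
    print(f"GPU {index}: {used}/{total} MiB ({100 * used / total:.0f}%)")
```

On a machine with a GPU, calling query_gpu_memory() every few training steps gives a usage trace. For interactive monitoring, running nvidia-smi -l 1 in a separate terminal refreshes the full report once per second.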