Step-by-step guide to multilingual fine-tuning
The following steps are required to achieve multilingual reasoning:
- Data preparation: load the HuggingFace multilingual dataset with `load_dataset('HuggingFaceH4/Multilingual-Thinking')`; it contains English, Spanish, and French samples.
- LoRA configuration: `LoraConfig(r=8, lora_alpha=32)` specifies the adapter parameters, focusing fine-tuning on the `q_proj` and `v_proj` projection layers.
- Model loading: wrap the original model with `PeftModel`, keeping 95% of the parameters frozen and fine-tuning only the adapter layers.
- Training control: via the TRL library, set `max_seq_length=2048` and `batch_size=4`, and enable gradient checkpointing to save GPU memory.
- Language designation: at inference time, add an instruction such as `'Reasoning language: Spanish'` to the system prompt.
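The language-designation step above amounts to prepending one system message. A minimal sketch, assuming the generic `role`/`content` chat-message convention (the helper name and example question are illustrative, not from the original answer):

```python
# Sketch: build a chat prompt that pins the model's reasoning language
# via a system message, as described in the "Language designation" step.

def build_messages(question: str, reasoning_language: str = "Spanish") -> list[dict]:
    """Prepend a system prompt telling the model which language to reason in."""
    return [
        # The exact phrasing follows the example from the guide above.
        {"role": "system", "content": f"Reasoning language: {reasoning_language}"},
        {"role": "user", "content": question},
    ]

messages = build_messages("What is 2 + 2?", reasoning_language="Spanish")
```

The resulting list can be passed to a tokenizer's chat template; the exact message fields depend on the model's template.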
A full example can be found in the repository's `finetune.ipynb`; the entire process takes about 6 hours on a single 24GB GPU.
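The steps above can be sketched end-to-end roughly as follows, using the `datasets`/`peft`/`trl` APIs. This is an outline under stated assumptions, not the repository's exact notebook: the base model name, dataset split, and output directory are illustrative.

```python
# Sketch of the fine-tuning pipeline described above (assumed names marked below).
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from trl import SFTConfig, SFTTrainer

# Data preparation: English/Spanish/French reasoning samples ("train" split assumed)
dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")

# LoRA configuration: adapt only the q_proj/v_proj projection layers
lora_config = LoraConfig(r=8, lora_alpha=32, target_modules=["q_proj", "v_proj"])

# Model loading: base model name is an assumption; wrapping with LoRA keeps
# the base weights frozen and trains only the adapter layers
model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b")
model = get_peft_model(model, lora_config)

# Training control via TRL; gradient checkpointing trades compute for memory
training_args = SFTConfig(
    output_dir="gpt-oss-multilingual",  # illustrative
    max_seq_length=2048,
    per_device_train_batch_size=4,
    gradient_checkpointing=True,
)
trainer = SFTTrainer(model=model, args=training_args, train_dataset=dataset)
trainer.train()
```

Running this requires a GPU with enough memory for the base model; the quantization and tokenizer details used in `finetune.ipynb` may differ.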
This answer comes from the article "Collection of scripts and tutorials for fine-tuning OpenAI GPT OSS models".