Step-by-step guide to multilingual fine-tuning
The following steps are required to achieve multilingual reasoning:
- Data preparation: load the HuggingFace multilingual dataset with `load_dataset('HuggingFaceH4/Multilingual-Thinking')`; it contains English, Spanish, and French samples.
- LoRA configuration: `LoraConfig(r=8, lora_alpha=32)` specifies the adapter parameters, focusing fine-tuning on the `q_proj` and `v_proj` projection layers.
- Model loading: wrap the original model with `PeftModel`, keeping 95% of the parameters frozen and fine-tuning only the adapter layers.
- Training control: via the TRL library, set `max_seq_length=2048` and `batch_size=4`, and enable gradient checkpointing to save GPU memory.
- Language designation: at inference time, add an instruction such as `'Reasoning language: Spanish'` to the system prompt.
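The language-designation step above amounts to prepending one system message. A minimal sketch, assuming the generic `role`/`content` chat-message convention (the helper name and example question are illustrative, not from the original answer):

```python
# Sketch: build a chat prompt that pins the model's reasoning language
# via a system message, as described in the "Language designation" step.

def build_messages(question: str, reasoning_language: str = "Spanish") -> list[dict]:
    """Prepend a system prompt telling the model which language to reason in."""
    return [
        # The exact phrasing follows the example from the guide above.
        {"role": "system", "content": f"Reasoning language: {reasoning_language}"},
        {"role": "user", "content": question},
    ]

messages = build_messages("What is 2 + 2?", reasoning_language="Spanish")
```

The resulting list can be passed to a tokenizer's chat template; the exact message fields depend on the model's template.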
A full example can be found in the repository's `finetune.ipynb`; the entire process takes about 6 hours on a single 24GB GPU.
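The steps above can be sketched end-to-end roughly as follows, using the `datasets`/`peft`/`trl` APIs. This is an outline under stated assumptions, not the repository's exact notebook: the base model name, dataset split, and output directory are illustrative.

```python
# Sketch of the fine-tuning pipeline described above (assumed names marked below).
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from trl import SFTConfig, SFTTrainer

# Data preparation: English/Spanish/French reasoning samples ("train" split assumed)
dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")

# LoRA configuration: adapt only the q_proj/v_proj projection layers
lora_config = LoraConfig(r=8, lora_alpha=32, target_modules=["q_proj", "v_proj"])

# Model loading: base model name is an assumption; wrapping with LoRA keeps
# the base weights frozen and trains only the adapter layers
model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b")
model = get_peft_model(model, lora_config)

# Training control via TRL; gradient checkpointing trades compute for memory
training_args = SFTConfig(
    output_dir="gpt-oss-multilingual",  # illustrative
    max_seq_length=2048,
    per_device_train_batch_size=4,
    gradient_checkpointing=True,
)
trainer = SFTTrainer(model=model, args=training_args, train_dataset=dataset)
trainer.train()
```

Running this requires a GPU with enough memory for the base model; the quantization and tokenizer details used in `finetune.ipynb` may differ.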
This answer comes from the article "Collection of scripts and tutorials for fine-tuning OpenAI GPT OSS models".