The repository provides a complete fine-tuning solution built on the Hugging Face TRL library and LoRA (Low-Rank Adaptation). Using the pre-configured LoraConfig (r=8, lora_alpha=32), users can train adapters on the q_proj and v_proj modules of the attention layers. The accompanying Multilingual-Thinking dataset supports cross-language reasoning tasks in English, Spanish, and French. The fine-tuning process retains more than 90% of the base model's original performance while significantly improving task-specific accuracy.
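As a rough illustration of that setup, the sketch below wires the described LoraConfig (r=8, lora_alpha=32, targeting q_proj and v_proj) into TRL's SFTTrainer. The model checkpoint, dataset repo id, and training arguments are assumptions for illustration, not values taken from the repository itself.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

# Assumed identifiers -- replace with the checkpoint and dataset you actually use.
model_id = "openai/gpt-oss-20b"                      # assumed GPT OSS checkpoint
dataset_id = "HuggingFaceH4/Multilingual-Thinking"   # assumed dataset repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# LoRA adapters on the attention projections, as described above.
peft_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

dataset = load_dataset(dataset_id, split="train")

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        output_dir="gpt-oss-lora",           # assumed output directory
        per_device_train_batch_size=1,       # assumed; tune for your hardware
    ),
)
trainer.train()
trainer.save_model()
```

Because only the low-rank adapter weights on q_proj and v_proj are updated, the frozen base model keeps its general capabilities, which is what allows the original performance to be largely preserved.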
This answer comes from the article "Collection of scripts and tutorials for fine-tuning OpenAI GPT OSS models".