Cross-Lingual Transfer Implementation Plan
To extend the model's multilingual capabilities, work can proceed in three phases:
- Data preparation:
  - Construct a parallel corpus (Chinese/English or Chinese/Japanese pairs are recommended)
  - Create a new multilingual.json in the data/ directory whose records include a language_tag field (see the first sketch after this list)
- Mixed training:
  - Keep the original model's vocabulary and add the --lang_loss_weight 0.3 parameter to the SFT script
  - Mixing multilingual samples within each batch is recommended (the project dataloader supports this; see the second sketch after this list)
- Capability testing:
  - During interactive testing, specify parameters such as --language en to switch languages
  - Evaluate quantitatively with metrics such as BLEU (see the third sketch after this list)
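As a concrete starting point for the data-preparation phase, here is a minimal sketch of what data/multilingual.json could look like. Only the language_tag field and the file location come from the plan above; the instruction/input/output field names and the sample records are illustrative assumptions, not the project's confirmed schema.

```python
import json
import os

# Hypothetical parallel-corpus records: only "language_tag" is named in the
# plan; the other field names follow a common SFT layout and are assumptions.
records = [
    {"instruction": "Translate to English.",
     "input": "今天天气很好。",
     "output": "The weather is nice today.",
     "language_tag": "zh-en"},
    {"instruction": "Translate to Japanese.",
     "input": "谢谢你的帮助。",
     "output": "手伝ってくれてありがとう。",
     "language_tag": "zh-ja"},
]

os.makedirs("data", exist_ok=True)
with open("data/multilingual.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)
```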
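The in-batch mixing recommended for the training phase can be pictured with a standalone sketch. The project dataloader reportedly supports this already; the round-robin interleaving below is only one illustrative way to do it, not the project's actual implementation.

```python
import random
from collections import defaultdict

def mixed_batches(samples, batch_size, seed=42):
    """Yield batches that interleave samples across language_tag values,
    so each batch mixes languages instead of grouping one language per batch."""
    rng = random.Random(seed)
    by_lang = defaultdict(list)
    for sample in samples:
        by_lang[sample["language_tag"]].append(sample)
    pools = list(by_lang.values())
    for pool in pools:
        rng.shuffle(pool)
    # Round-robin across the language pools, then chunk into batches.
    interleaved = []
    while pools:
        for pool in pools:
            interleaved.append(pool.pop())
        pools = [p for p in pools if p]
    for i in range(0, len(interleaved), batch_size):
        yield interleaved[i:i + batch_size]
```

With language pairs of unequal size, later batches degrade to a single language once the smaller pool is exhausted; balancing the corpus per language pair avoids this.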
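For the quantitative-evaluation step, here is a minimal sketch using the sacrebleu library (choosing sacrebleu is an assumption; the plan names BLEU as the metric but no specific toolkit). The hypotheses would come from the fine-tuned model, e.g. from interactive runs with --language en.

```python
import sacrebleu  # pip install sacrebleu

# Model outputs and one reference translation per output (toy examples).
hypotheses = ["The weather is nice today.", "Thank you for your help."]
references = ["The weather is nice today.", "Thanks for your help."]

# corpus_bleu takes the hypotheses and a list of reference streams.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU: {bleu.score:.2f}")
```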
Note: smaller models (1.7B) are best kept to a single language pair, while models of 4B and above can attempt joint multilingual training.
This answer comes from the article "Qwen3-FineTuning-Playground: a ready-to-use codebase for fine-tuning Qwen3 large models".