
Qwen3's training data scale and quality build a cognitive advantage

2025-08-24

Scale effects of data engineering innovations

Qwen3 was pre-trained on 36 trillion tokens, twice as much data as its predecessor Qwen2.5, covering high-quality content such as STEM, programming, and academic papers. The technical report describes a three-phase data construction process: base training on 30 trillion tokens at a 4K context length, knowledge-intensive data optimization on a further 5 trillion tokens, and extended training on 32K-128K long contexts. Beyond generic web pages, the data sources include parsed PDF documents (92.3% extraction accuracy) and synthetic data generated by the Qwen2.5 series of models.
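
To make the staged schedule concrete, here is a minimal Python sketch of the three-phase curriculum as a data structure. Only the token budgets and context lengths come from the figures quoted above; the stage names, class layout, and everything else are illustrative assumptions, not Qwen's actual configuration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PretrainStage:
    name: str
    tokens_t: Optional[float]  # training tokens in trillions; None = not stated
    max_context: int           # maximum sequence length in tokens
    focus: str

# Token budgets and context lengths are the figures quoted above;
# the stage names and this structure are illustrative only.
STAGES = [
    PretrainStage("base", 30.0, 4_096,
                  "general web, parsed PDFs, and synthetic data"),
    PretrainStage("knowledge-intensive", 5.0, 4_096,
                  "STEM, code, and academic content, up-weighted"),
    PretrainStage("long-context", None, 131_072,
                  "32K-128K sequences for context extension"),
]

for s in STAGES:
    budget = f"{s.tokens_t}T tokens" if s.tokens_t is not None else "budget not stated"
    print(f"{s.name}: {budget}, context up to {s.max_context:,}")
```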

Quality improvement measures include:

  • Optimizing multimodal text extraction with the Qwen2.5-VL model
  • Generating millions of mathematical reasoning examples with Qwen2.5-Math (a generate-and-verify sketch follows this list)
  • Broadening code data diversity with Qwen2.5-Coder
  • Applying a five-tier content security filtering mechanism
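
As one illustration of the synthetic-data step, the sketch below shows a common generate-and-verify pattern: a teacher model (e.g., Qwen2.5-Math, behind whatever text-generation backend you provide) drafts a worked solution, and the example is kept only if its stated final answer matches known ground truth. The function names and prompt format here are hypothetical; the report does not specify Qwen's actual pipeline.

```python
from typing import Callable, Optional

def make_math_example(problem: str,
                      expected_answer: str,
                      generate: Callable[[str], str]) -> Optional[dict]:
    """Ask the teacher model for a step-by-step solution; keep the
    example only if its final answer matches the known ground truth."""
    prompt = (
        "Solve the following problem step by step, then state the final "
        f"answer after 'Answer:'.\n\nProblem: {problem}"
    )
    solution = generate(prompt)
    # Crude verification: compare the text after the last 'Answer:' marker.
    if "Answer:" not in solution:
        return None
    final = solution.rsplit("Answer:", 1)[1].strip()
    if final != expected_answer.strip():
        return None  # discard generations that fail verification
    return {"problem": problem, "solution": solution, "answer": final}

# Usage with any backend (hypothetical):
#   example = make_math_example("What is 7 * 8?", "56",
#                               generate=my_model.generate)
```

Filtering on a verifiable final answer is what lets a pipeline like this scale to millions of examples without manual review: unverifiable generations are simply dropped.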

Benchmark results show that the Qwen3-32B base model outperforms Qwen2.5-72B on specialized benchmarks such as MATH and HumanEval, validating the decisive impact of data quality on model capability. This data advantage lets even small models (e.g., 4B parameters) handle tasks that traditionally required 70B-parameter-class models.
