Mathematical Reasoning Performance Evaluation
MiMo-7B-RL has shown excellent performance on several international math competition datasets:
Core benchmark results
- AIME 2024: 68.2% Pass@1 (the model's first sampled answer is correct)
- AIME 2025: 55.4% Pass@1
- MATH-500: 95.8% Pass@1
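Pass@1 counts a problem as solved only if the model's first sampled answer is correct. A minimal sketch of the metric, assuming a hypothetical grader function is_correct (e.g., exact match against the reference answer):

```python
def pass_at_1(problems, first_answers, is_correct):
    # Fraction of problems whose first sampled answer the grader accepts.
    # is_correct(problem, answer) -> bool is a hypothetical placeholder
    # for the benchmark's actual answer checker.
    solved = sum(1 for p, a in zip(problems, first_answers) if is_correct(p, a))
    return solved / len(problems)
```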
These results suggest that the model can:
- Understand complex, competition-level math problem statements
- Perform multi-step logical reasoning and equation solving
- Generate solution steps that follow standard mathematical conventions
Recommendations for use
Best practices:
- Set temperature=0.6 to balance answer quality and diversity (see the sampling sketch after this list)
- Make problem statements as clear and complete as possible; complex problems can be entered in segments
- Well suited to scenarios such as AMC/AIME competition training and college-level math teaching support
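A minimal sketch of sampling with the recommended temperature via Hugging Face transformers; the model ID XiaomiMiMo/MiMo-7B-RL and the example prompt are assumptions to verify against the official release:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XiaomiMiMo/MiMo-7B-RL"  # assumed Hugging Face repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "How many positive integers n satisfy n^2 < 2024?"  # made-up example
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=True enables stochastic decoding; temperature=0.6 is the
# recommended trade-off between answer quality and diversity.
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.6,
    max_new_tokens=1024,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```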
Tests have shown its performance to be comparable to larger commercial models such as OpenAI o1-mini.
This answer comes from the article "MiMo: A Small Open Source Model for Efficient Mathematical Reasoning and Code Generation".