MiMo-7B-RL Deployment Guide
Environment requirements: Python 3.8+ and more than 14GB of free storage; a virtual environment is recommended.
Detailed Steps:
- Create a virtual environment:
python3 -m venv mimo_env
source mimo_env/bin/activate
- Install an inference engine (optional):
- vLLM (recommended):
pip install "vllm @ git+https://github.com/XiaomiMiMo/vllm.git@feat_mimo_mtp_stable_073"
- SGLang:
python3 -m pip install "sglang[all] @ git+https://github.com/sgl-project/sglang.git@main#egg=sglang&subdirectory=python"
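After installation, a quick import check helps confirm the engine is usable before downloading the model. This is a minimal sketch and only assumes that the vLLM fork above installed into the active virtual environment:
# Sanity check: verify that vLLM imports from the current environment
# and print its version (the same idea works for SGLang via "import sglang").
import vllm
print("vLLM version:", vllm.__version__)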
- Download the model:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XiaomiMiMo/MiMo-7B-RL"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
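Before starting a server, the downloaded weights can be exercised directly through transformers as a smoke test. This is a minimal sketch; the prompt and generation settings below are illustrative and not part of the original guide:
# Quick generation smoke test with the model and tokenizer loaded above.
# Without device_map the model loads on CPU, so generation is slow but functional.
prompt = "Write a Python function that returns the n-th Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))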
- Start the service:
python3 -m vllm.entrypoints.api_server --model XiaomiMiMo/MiMo-7B-RL --host 0.0.0.0
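Once the server is running, it can be queried over HTTP. The sketch below assumes vLLM's demo api_server defaults (port 8000 and a /generate endpoint that accepts a prompt plus sampling parameters); adjust it if your build exposes a different interface:
import requests

# Minimal client for the demo api_server started above; the endpoint, port,
# and sampling fields are the demo server's defaults and may differ per build.
response = requests.post(
    "http://localhost:8000/generate",
    json={"prompt": "Solve: what is 17 * 24?", "max_tokens": 128, "temperature": 0.6},
)
print(response.json())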
Note: An NVIDIA A100 40GB GPU is recommended, and the host needs at least 32GB of RAM. The first run will automatically download approximately 14GB of model files.
This answer comes from the article "MiMo: A Small Open Source Model for Efficient Mathematical Reasoning and Code Generation".