Practical solutions for out-of-memory problems
Insufficient memory is a common challenge when running a large language model such as Qwen3-235B-A22B-Thinking-2507 locally. The following approaches are effective:
- Use the FP8 quantized version: the model is also released in FP8 (~220.20 GB), which cuts memory requirements by nearly 50% compared with the BF16 version (437.91 GB)
- Reduce the context length: the default 256K-token context consumes a large amount of memory for the KV cache; lowering it to 32,768 tokens significantly reduces the memory footprint
- Use an efficient inference framework: vLLM (≥0.8.5) or SGLang (≥0.4.6.post1) is recommended; both optimize memory management and inference efficiency (a vLLM example follows the recommended command below)
- Multi-GPU parallelism: distribute the model across multiple GPUs with the tensor-parallel-size parameter (--tp in SGLang, --tensor-parallel-size in vLLM)
- CPU offloading: frameworks such as llama.cpp can offload part of the model weights and computation to system RAM (a llama.cpp sketch is shown at the end of this section)
In practice, it is recommended to first try the following command to reduce memory requirements:
python -m sglang.launch_server --model-path Qwen/Qwen3-235B-A22B-Thinking-2507 --tp 8 --context-length 32768
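If vLLM is used instead, a roughly equivalent invocation combines a reduced context window with 8-way tensor parallelism; pointing it at the FP8 checkpoint lowers memory further. This is a sketch only: the FP8 repository name Qwen/Qwen3-235B-A22B-Thinking-2507-FP8 is assumed here and should be verified against the official model card before use:

vllm serve Qwen/Qwen3-235B-A22B-Thinking-2507-FP8 --tensor-parallel-size 8 --max-model-len 32768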
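For CPU offloading with llama.cpp, only some layers are kept on the GPU and the rest run from system RAM. The following is a sketch: the GGUF file name is a placeholder for whatever quantized conversion is actually used, and the --n-gpu-layers value should be tuned to the available VRAM:

./llama-server -m Qwen3-235B-A22B-Thinking-2507-Q4_K_M.gguf --n-gpu-layers 40 --ctx-size 32768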
This answer comes from the article "Qwen3-235B-A22B-Thinking-2507: A large-scale language model to support complex reasoning".