Environment constraints
HRM relies on custom CUDA extensions by default, but it can be run on AMD or Intel hardware using one of the following workarounds:
- Option 1: CPU mode
  - Install the CPU-only build of PyTorch: `pip install torch --index-url https://download.pytorch.org/whl/cpu`
  - Replace all `.cuda()` calls in the code with `.cpu()`
  - Set the environment variable: `export CUDA_VISIBLE_DEVICES=-1`
  - Note: inference is roughly 10x slower
- Option 2: ROCm port
  - Install the ROCm build of PyTorch
  - Enable automatic optimization with `torch.compile()`
  - Rewrite the CUDA kernels as HIP code
- Option 3: Cloud services
  - Deploy to Azure ML via ONNX Runtime
  - Convert the model with TensorRT-LLM
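For Options 1 and 2, the scattered `.cuda()` edits can be avoided by selecting the device once at startup. ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` API, so a single check covers NVIDIA, AMD, and the CPU fallback. The sketch below is a minimal illustration of that pattern, not HRM's actual code:

```python
import torch

# ROCm builds of PyTorch report AMD GPUs through torch.cuda, so this
# one check covers NVIDIA (CUDA), AMD (ROCm), and the CPU fallback.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move the model and inputs once instead of sprinkling .cuda() calls.
model = torch.nn.Linear(16, 4).to(device)
x = torch.randn(2, 16, device=device)
out = model(x)
print(tuple(out.shape))  # (2, 4)
```

With this pattern, the same script runs unmodified on all three backends; only the installed PyTorch wheel differs.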
Performance comparison
| Setup | Relative speed | Memory footprint |
|---|---|---|
| RTX 4090 | 100% | 8GB |
| AMD MI250 | 85% | 11GB |
| Intel Xeon | 12% | 32GB |
This answer comes from the article *HRM: Hierarchical Reasoning Model for Complex Reasoning*.