Low-end Hardware Adaptation Plan
Optimization strategies for running Qwen2.5-VL on limited hardware:
- Model selection:
  - On devices with 8 GB of video memory, choose the 3B model (-model-size 3B).
  - With 6 GB of video memory or less, add -quantize bitsandbytes.
- Parameter tuning:
  - For image processing, set min_pixels=256,max_pixels=768 to limit the resolution.
  - For video analysis, use -fps 1 to extract one frame per second.
  - Load the model in half precision with -dtype float16, which saves memory at a small cost in precision.
- System optimization:
  - On Linux, enable continuous batching with vLLM.
  - On Windows/Mac, enable virtual video memory with the -disk-swap parameter.
  - Close other GPU applications so the model has exclusive use of video memory.
- Alternatives:
  - Call the 72B model remotely through the API of the Alibaba Cloud (AliCloud) PAI service.
  - Get temporary access to T4/V100 resources through Colab Pro.
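To see why the smaller model, half precision, and quantization matter on a 6 to 8 GB card, the weight footprint can be estimated as parameter count times bytes per parameter. The sketch below is only a back-of-the-envelope illustration: the nominal count of 3e9 parameters and the helper name `weight_memory_gib` are assumptions, and activations, the KV cache, and framework overhead are not included, so real usage will be higher.

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
# Assumption: a nominal 3e9 parameters for the "3B" model. Activations,
# KV cache and framework overhead are NOT counted here.

BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16": 2.0,
    "int8": 1.0,   # 8-bit quantization
    "int4": 0.5,   # 4-bit quantization
}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the model weights in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"3B @ {dtype}: {weight_memory_gib(3e9, dtype):.1f} GiB")
```

Under these assumptions the float16 weights alone take roughly 5.6 GiB, which is consistent with the advice to quantize once the card has 6 GB or less.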
Tested: the quantized 3B model on an RTX 3060 laptop can achieve 1) image recognition within 5 seconds and 2) parsing of a 1-minute short video.
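The frame-rate and resolution limits bound how much visual input the model has to process. The following sketch shows the arithmetic for a 1-minute clip. Both helper names are hypothetical and are not part of Qwen2.5-VL or any library, and reading max_pixels=768 as 768 patches of 28x28 pixels is an assumption about how the setting is meant.

```python
import math


def sample_frame_indices(total_frames: int, native_fps: float,
                         target_fps: float = 1.0) -> list[int]:
    """Pick frame indices so that about target_fps frames are kept per second."""
    if total_frames <= 0 or native_fps <= 0 or target_fps <= 0:
        return []
    duration_s = total_frames / native_fps
    num_kept = max(1, int(duration_s * target_fps))
    step = native_fps / target_fps  # source frames per kept frame
    return [min(total_frames - 1, int(i * step)) for i in range(num_kept)]


def fit_to_pixel_budget(width: int, height: int, max_pixels: int) -> tuple[int, int]:
    """Scale (width, height) down, keeping aspect ratio, so the area fits max_pixels."""
    if width * height <= max_pixels:
        return width, height
    scale = math.sqrt(max_pixels / (width * height))
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))


if __name__ == "__main__":
    # A 1-minute clip recorded at 30 fps, sampled at 1 fps.
    indices = sample_frame_indices(total_frames=1800, native_fps=30.0)
    print(len(indices), "frames kept")

    # Assumed budget: 768 patches of 28x28 pixels each.
    print(fit_to_pixel_budget(1920, 1080, max_pixels=768 * 28 * 28))
```

Sampling at 1 fps turns 1800 source frames into 60, and the pixel budget shrinks each 1920x1080 frame before it reaches the model, which is where most of the memory and latency savings come from.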
This answer comes from the article "Qwen2.5-VL: an open-source multimodal large model supporting image, video, and document parsing".