Preparation for installation
First, set up a Python 3.9 environment; using conda to manage virtual environments is recommended.
Installation steps
- Clone the repository: run `git clone https://github.com/OpenGVLab/InternVL.git` in the terminal, then enter the directory.
- Create a virtual environment: `conda create -n internvl python=3.9 -y`
- Install the base dependencies: `pip install -r requirements.txt`
Optional installations
- Install Flash-Attention to accelerate inference: `pip install flash-attn==2.3.6`
- Install MMDeploy for production deployment: `mim install mmdeploy`
Multimodal dialogue
After downloading a model (e.g. InternVL2_5-8B), you can run a multimodal dialogue with the following code:
```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

model = 'OpenGVLab/InternVL2_5-8B'
image = load_image('tiger.jpeg')
pipe = pipeline(model)
response = pipe(('描述这张图片', image))  # prompt means "Describe this image"
print(response.text)
```
Notes
The 8B model requires approximately 16 GB of GPU memory, and more may be needed when processing high-resolution images.
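As a rough sanity check on that figure, you can estimate the weight footprint yourself: an 8-billion-parameter model stored in 16-bit precision (fp16/bf16) needs about 2 bytes per parameter, before counting activations, the KV cache, or high-resolution image tiles. A minimal sketch (the helper function below is illustrative, not part of lmdeploy):

```python
# Estimate GPU memory needed just to hold model weights, in GiB.
# Activations and KV cache add on top of this, which is why ~15 GiB
# of weights translates to roughly 16 GB of GPU memory in practice.
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    return num_params * bytes_per_param / 1024**3

print(f"{weight_memory_gib(8e9):.1f} GiB")  # ~14.9 GiB for 8B params in bf16
```

This is why the 8B model fits on a single 24 GB card but is tight on 16 GB ones.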
This answer is based on the article "InternVL: Open Source Multimodal Large Model with Image, Video and Text Processing Support".