{"id":17679,"date":"2025-01-08T16:14:51","date_gmt":"2025-01-08T08:14:51","guid":{"rendered":"https:\/\/www.aisharenet.com\/?p=17679"},"modified":"2025-01-08T16:14:51","modified_gmt":"2025-01-08T08:14:51","slug":"dashinfer-vlmduoai","status":"publish","type":"post","link":"https:\/\/www.kdjingpai.com\/en\/dashinfer-vlmduoai\/","title":{"rendered":"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p>DashInfer-VLM is an inference architecture for vision-language multimodal large models (VLMs), specially optimized to accelerate inference for Qwen VL models. The biggest difference between DashInfer-VLM and other VLM inference acceleration frameworks is that it separates the ViT part from the LLM part and runs them in parallel, so they do not interfere with each other.<\/p>\n<p>As a result, image and video preprocessing and ViT feature extraction in a VLM do not interrupt LLM generation. This design can also be described as a ViT\/LLM-separated architecture, and this is currently its first use in an open-source VLM 
serving framework.<\/p>\n<p>In multi-GPU deployments, each GPU has its own ViT processing unit, which gives a very significant performance advantage in video and multi-image scenarios.<\/p>\n<p>In addition, the ViT part supports an in-memory cache, so ViT results do not need to be recomputed across multi-turn conversations.<\/p>\n<p>Below is the architecture diagram, along with the configuration for deploying a 72B model on 4 GPUs.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-17680\" title=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bea31f9f920ffe4.png\" alt=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" width=\"1080\" height=\"878\" srcset=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bea31f9f920ffe4.png 1080w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bea31f9f920ffe4-300x244.png 300w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bea31f9f920ffe4-1024x832.png 1024w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bea31f9f920ffe4-768x624.png 768w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bea31f9f920ffe4-15x12.png 15w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>The architecture diagram describes the flow and architecture:<\/p>\n<ul>\n<li>The ViT part can run on a variety of inference engines, such as TensorRT or 
onnxruntime (the framework exports the ViT part of the model to ONNX). TensorRT is currently supported by default.<\/li>\n<li>The LLM part uses DashInfer for inference.<\/li>\n<li>The cache part supports an in-memory cache for ViT results, a prefix cache for the LLM part, and a multimodal prefix cache for the LLM part (disabled by default).<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><strong>Code:<\/strong><\/p>\n<p>https:\/\/github.com\/modelscope\/dash-infer<\/p>\n<p><strong>Documentation:<\/strong><\/p>\n<p>https:\/\/dashinfer.readthedocs.io\/en\/latest\/vlm\/vlm_offline_inference_en.html<\/p>\n<p>&nbsp;<\/p>\n<h2>Best Practices<\/h2>\n<p>Try DashInfer on the free GPU compute provided by the ModelScope community:<\/p>\n<pre># First, install dashinfer-vlm and TensorRT.\r\n\r\n# Install the required packages first\r\nimport os\r\n\r\n# Download and install dashinfer 2.0.0rc2\r\n# If needed, use wget to download and extract the TensorRT package\r\n# pip install dashinfer 2.0.0rc2\r\n#!pip install https:\/\/github.com\/modelscope\/dash-infer\/releases\/download\/v2.0.0-rc2\/dashinfer-2.0.0rc2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl\r\n#!wget https:\/\/modelscope.oss-cn-beijing.aliyuncs.com\/releases\/TensorRT-10.6.0.26.Linux.x86_64-gnu.cuda-12.6.tar.gz\r\n#!tar -xvzf TensorRT-10.6.0.26.Linux.x86_64-gnu.cuda-12.6.tar.gz\r\n\r\n# Download locally, replacing the URL with the corresponding modelscope one\r\n# Install dashinfer; the package is large, so downloading it locally before installing is recommended\r\n#!wget 
https:\/\/modelscope.oss-cn-beijing.aliyuncs.com\/releases\/dashinfer-2.0.0rc3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl\r\n#!pip install .\/dashinfer-2.0.0rc3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl\r\n\r\n# Install dashinfer vlm\r\n#!pip install dashinfer-vlm\r\n\r\n# Install the OpenAI client\r\n#!pip install openai==1.56.2\r\n\r\n# Install the TensorRT Python package from the extracted archive\r\n#!pip install TensorRT-10.6.0.26\/python\/tensorrt-10.6.0-cp310-none-linux_x86_64.whl<\/pre>\n<p>&nbsp;<\/p>\n<p>TensorRT requires environment variable configuration:<\/p>\n<pre>import os\r\n\r\n# Path to the TensorRT runtime libraries\r\ntrt_runtime_path = os.getcwd() + \"\/TensorRT-10.6.0.26\/lib\/\"\r\n\r\n# Current value of the LD_LIBRARY_PATH environment variable\r\ncurrent_ld_library_path = os.environ.get('LD_LIBRARY_PATH', '')\r\n\r\n# Add the new path to the existing value\r\n# (assumption: the source listing ends at the if line; the two branches are a likely completion)\r\nif current_ld_library_path:\r\n    # If LD_LIBRARY_PATH is already set, append the TensorRT path to it\r\n    os.environ['LD_LIBRARY_PATH'] = current_ld_library_path + ':' + trt_runtime_path\r\nelse:\r\n    os.environ['LD_LIBRARY_PATH'] = trt_runtime_path<\/pre>\n<p>Once the environment is installed, start dashinfer vlm to run inference on the model and expose an OpenAI-compatible server. The model can be swapped for the 7B, 72B, or other variants.<\/p>\n<p>&nbsp;<\/p>\n<p>By default, it uses all the GPU memory available in the environment.<\/p>\n<pre>!dashinfer_vlm_serve --model qwen\/Qwen2-VL-2B-Instruct --port 8000 --host 127.0.0.1<\/pre>\n<p>This step initializes DashInfer and the external engine used for ViT (TensorRT here), and starts an OpenAI-compatible service.<\/p>\n<p>&nbsp;<\/p>\n<p>These logs indicate that TensorRT initialized successfully:<\/p>\n<p><img 
loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-17681\" title=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/b9ff5b0b84686f6.png\" alt=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" width=\"1080\" height=\"139\" srcset=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/b9ff5b0b84686f6.png 1080w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/b9ff5b0b84686f6-300x39.png 300w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/b9ff5b0b84686f6-1024x132.png 1024w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/b9ff5b0b84686f6-768x99.png 768w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/b9ff5b0b84686f6-18x2.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>These logs indicate that DashInfer initialized successfully:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-17682\" title=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bd0d015becfc41e.png\" alt=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" width=\"1080\" height=\"99\" srcset=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bd0d015becfc41e.png 1080w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bd0d015becfc41e-300x28.png 300w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bd0d015becfc41e-1024x94.png 1024w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bd0d015becfc41e-768x70.png 768w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/bd0d015becfc41e-18x2.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" 
\/><\/p>\n<p>&nbsp;<\/p>\n<p>These logs indicate that the OpenAI-compatible service initialized successfully:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-17683\" title=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/3bc6bc812b6c37e.png\" alt=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" width=\"1080\" height=\"75\" srcset=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/3bc6bc812b6c37e.png 1080w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/3bc6bc812b6c37e-300x21.png 300w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/3bc6bc812b6c37e-1024x71.png 1024w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/3bc6bc812b6c37e-768x53.png 768w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/3bc6bc812b6c37e-18x1.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>At this point everything has initialized successfully, and you can open another notebook to run the client and the benchmark.<\/p>\n<p><strong>Notebook:<\/strong> https:\/\/modelscope.cn\/notebook\/share\/ipynb\/6ea987c5\/vl-start-server.ipynb<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Image Understanding Demo<\/strong><\/p>\n<p>A demo of image understanding with multiple images:<\/p>\n<pre># Install the required OpenAI client version\r\n!pip install openai==1.56.2 # VL support requires a recent OpenAI client.\r\n\r\nfrom openai import OpenAI\r\n\r\n# Initialize the OpenAI client\r\nclient = OpenAI(\r\n    base_url=\"http:\/\/localhost:8000\/v1\",\r\n    api_key=\"EMPTY\"\r\n)\r\n\r\n# Prepare the API call for a chat completion\r\nresponse = 
client.chat.completions.create(\r\n    model=\"model\",\r\n    messages=[\r\n        {\r\n            \"role\": \"user\",\r\n            \"content\": [\r\n                {\"type\": \"text\", \"text\": \"Are these images different?\"},\r\n                {\r\n                    \"type\": \"image_url\",\r\n                    \"image_url\": {\r\n                        \"url\": \"https:\/\/farm4.staticflickr.com\/3075\/3168662394_7d7103de7d_z_d.jpg\",\r\n                    }\r\n                },\r\n                {\r\n                    \"type\": \"image_url\",\r\n                    \"image_url\": {\r\n                        \"url\": \"https:\/\/farm2.staticflickr.com\/1533\/26541536141_41abe98db3_z_d.jpg\",\r\n                    }\r\n                },\r\n            ],\r\n        }\r\n    ],\r\n    stream=True,\r\n    max_completion_tokens=1024,\r\n    temperature=0.1,\r\n)\r\n\r\n# Process the streamed response\r\nfull_response = \"\"\r\nfor chunk in response:\r\n    # Append the delta content to the full response; some chunks carry no content (None)\r\n    delta = chunk.choices[0].delta.content\r\n    if delta:\r\n        full_response += delta\r\n    print(\".\", end=\"\") # Print a dot for each chunk received\r\n\r\n# Print the full response\r\nprint(f\"\\nImage: Full Response:\\n{full_response}\")<\/pre>\n<p>&nbsp;<\/p>\n<p><strong>Video Understanding Demo<\/strong><\/p>\n<p>Because OpenAI does not define a standard video interface, this article provides a video_url content type, which automatically downloads the video, extracts frames, and analyzes them.<\/p>\n<pre># video example\r\n!pip install openai==1.56.2 # Ensure the OpenAI client supports video link features.\r\n\r\nfrom openai import OpenAI\r\n\r\n# Initialize the OpenAI client\r\nclient = OpenAI(\r\n    base_url=\"http:\/\/localhost:8000\/v1\",\r\n    api_key=\"EMPTY\"\r\n)\r\n\r\n# Create a chat completion request with a video URL\r\nresponse = client.chat.completions.create(\r\n    model=\"model\",\r\n    messages=[\r\n        {\r\n            \"role\": \"user\",\r\n            \"content\": [\r\n                {\r\n                    \"type\": \"text\",\r\n                    \"text\": \"Generate a compelling description that I can upload along with the video.\"\r\n                },\r\n                {\r\n                    \"type\": \"video_url\",\r\n                    \"video_url\": {\r\n                        \"url\": \"https:\/\/cloud.video.taobao.com\/vod\/JCM2awgFE2C2vsACpDESXZ3h5_iQ5yCZCypmjtEs2Ck.mp4\",\r\n                        \"fps\": 2\r\n                    }\r\n                }\r\n            ]\r\n        }\r\n    ],\r\n    max_completion_tokens=1024,\r\n    top_p=0.5,\r\n    temperature=0.1,\r\n    frequency_penalty=1.05,\r\n    stream=True,\r\n)\r\n\r\n# Process the streaming response\r\nfull_response = \"\"\r\nfor chunk in response:\r\n    # Append the delta content from the chunk to the full response; some chunks carry no content (None)\r\n    delta = chunk.choices[0].delta.content\r\n    if delta:\r\n        full_response += delta\r\n    print(\".\", end=\"\") # Indicate progress with dots\r\n\r\n# Print the complete response\r\nprint(f\"\\nFull Response: \\n{full_response}\")<\/pre>\n<p>&nbsp;<\/p>\n<p><strong>Benchmark<\/strong><\/p>\n<p>Using the image understanding example above, run a simple multi-concurrency test to measure throughput.<\/p>\n<pre># benchmark\r\n!pip install openai==1.56.2\r\nimport time\r\nimport concurrent.futures\r\nfrom openai import OpenAI\r\n\r\n# Initialize the OpenAI client\r\nclient = OpenAI(\r\n    base_url=\"http:\/\/localhost:8000\/v1\",\r\n    api_key=\"EMPTY\"\r\n)\r\n\r\n# Request parameters\r\nmodel = \"model\"\r\nmessages = [\r\n    {\r\n        \"role\": \"user\",\r\n        \"content\": [\r\n            {\"type\": \"text\", \"text\": \"Are these images different?\"},\r\n            {\r\n                \"type\": \"image_url\",\r\n                \"image_url\": {\r\n                    \"url\": \"https:\/\/farm4.staticflickr.com\/3075\/3168662394_7d7103de7d_z_d.jpg\",\r\n                }\r\n            },\r\n            {\r\n                \"type\": \"image_url\",\r\n                \"image_url\": {\r\n                    \"url\": \"https:\/\/farm2.staticflickr.com\/1533\/26541536141_41abe98db3_z_d.jpg\",\r\n                }\r\n            },\r\n        ],\r\n    }\r\n]\r\n\r\n# Function run by each worker: sends one request and returns its latency\r\ndef send_request():\r\n    start_time = time.time()\r\n    response = client.chat.completions.create(\r\n        model=model,\r\n        messages=messages,\r\n        stream=False,\r\n        max_completion_tokens=1024,\r\n        temperature=0.1,\r\n    )\r\n    end_time = time.time()\r\n    latency = end_time - start_time\r\n    return latency\r\n\r\n# 
Benchmark function\r\ndef benchmark(num_requests, num_workers):\r\n    latencies = []\r\n    start_time = time.time()\r\n\r\n    with concurrent.futures.ThreadPoolExecutor(max_workers=num_workers) as executor:\r\n        futures = [executor.submit(send_request) for _ in range(num_requests)]\r\n        for future in concurrent.futures.as_completed(futures):\r\n            latencies.append(future.result())\r\n\r\n    end_time = time.time()\r\n    total_time = end_time - start_time\r\n    qps = num_requests \/ total_time\r\n    average_latency = sum(latencies) \/ len(latencies)\r\n    throughput = num_requests * 1024 \/ total_time # Assumes each response is 1024 bytes\r\n\r\n    print(f\"Total Time: {total_time:.2f} seconds\")\r\n    print(f\"QPS: {qps:.2f}\")\r\n    print(f\"Average Latency: {average_latency:.2f} seconds\")\r\n\r\n# Main entry point\r\nif __name__ == \"__main__\":\r\n    num_requests = 100 # Total number of requests\r\n    num_workers = 10 # Number of concurrent worker threads\r\n    benchmark(num_requests, num_workers)<\/pre>\n<p>&nbsp;<\/p>\n<p>Test results:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-17684\" title=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/830c0850580a865.jpg\" alt=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" width=\"586\" height=\"100\" srcset=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/830c0850580a865.jpg 586w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/830c0850580a865-300x51.jpg 300w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/830c0850580a865-18x3.jpg 18w\" sizes=\"auto, (max-width: 586px) 100vw, 586px\" 
\/><\/p>\n<p><strong>Notebook:<\/strong> https:\/\/modelscope.cn\/notebook\/share\/ipynb\/5560603a\/vl-test-and-benchmark.ipynb<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Comprehensive performance comparison with vLLM:<\/strong><\/p>\n<p>For a more comprehensive and accurate comparison with vLLM, we ran single-concurrency, multi-concurrency, and multi-turn conversation benchmarks on models of different sizes using OpenGVLab\/InternVL-Chat-V1-2-SFT-Data. See the link for the detailed reproduction scripts. The results are as follows:<\/p>\n<p>DashInfer shows a performance advantage in every scenario, and the advantage is most pronounced in multi-turn conversations.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-17685\" title=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/52b67256c354476.png\" alt=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" width=\"1080\" height=\"375\" srcset=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/52b67256c354476.png 1080w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/52b67256c354476-300x104.png 300w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/52b67256c354476-1024x356.png 1024w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/52b67256c354476-768x267.png 768w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/52b67256c354476-18x6.png 18w\" sizes=\"auto, (max-width: 1080px) 100vw, 1080px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-17686\" 
title=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/1b41f076e560c0a.jpg\" alt=\"DashInfer-VLM: SOTA multimodal inference performance, surpassing vLLM!-1\" width=\"670\" height=\"428\" srcset=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/1b41f076e560c0a.jpg 670w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/1b41f076e560c0a-300x192.jpg 300w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2025\/01\/1b41f076e560c0a-18x12.jpg 18w\" sizes=\"auto, (max-width: 670px) 100vw, 670px\" \/><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction DashInfer-VLM is an inference architecture for vision-language multimodal large models (VLMs), specially optimized to accelerate inference for Qwen VL models. The biggest difference between DashInfer-VLM and other VLM inference acceleration frameworks is that 
it separates the ViT part from the LLM part, and the ViT and L&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[46],"tags":[],"class_list":["post-17679","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/posts\/17679","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/comments?post=17679"}],"version-history":[{"count":0,"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/posts\/17679\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/media?parent=17679"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/categories?post=17679"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/en\/wp-json\/wp\/v2\/tags?post=17679"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}