
How do you handle multimodal inputs (e.g. image + text) and function calls in AIRouter?

2025-08-21


AIRouter supports multimodal inputs and function calls by extending the API as follows:

Multimodal inputs:
1. Convert the image to Base64, for example:
import base64
with open("image.jpg", "rb") as f: img_base64 = base64.b64encode(f.read()).decode()
2. Call the generate_mm method and specify a model that supports multimodal input (e.g., GPT-4o):
LLM_Wrapper.generate_mm(model_name="gpt4o_mini", prompt="Describe the image", img_base64=img_base64)
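
The Base64 step can be sketched as a small helper. This is a minimal illustration using only the Python standard library; the helper name encode_image_to_base64 and the throwaway sample file are assumptions made for the example, not part of AIRouter. The generate_mm call is left as a comment because it needs the library and configured credentials.

```python
import base64
import tempfile
from pathlib import Path


def encode_image_to_base64(path: str) -> str:
    """Read a file as raw bytes and return its Base64 encoding as an ASCII string.

    Hypothetical helper for illustration; not an AIRouter function.
    """
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


# Demonstrate on a throwaway file so the sketch runs without a real image.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "image.jpg"
    sample_path.write_bytes(b"\xff\xd8\xff\xe0 not a real JPEG")
    img_base64 = encode_image_to_base64(str(sample_path))

# Decoding returns the original bytes, so the encoding is lossless.
decoded = base64.b64decode(img_base64)

# With AIRouter installed and configured, the call from step 2 would follow:
# LLM_Wrapper.generate_mm(model_name="gpt4o_mini", prompt="Describe the image", img_base64=img_base64)
```

Wrapping the encoding in a function keeps file handling in one place, which helps when several images are sent in sequence.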
Function calls:
1. Define a list of tools (e.g., a weather query function), each with a name, a description, and parameters.
2. Trigger the call with the function_calling method, for example:
LLM_Wrapper.function_calling(model_name="gpt4o_mini", prompt="Weather in Beijing", tools=tools)
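
The answer does not show what the tools list looks like. The sketch below assumes AIRouter accepts the widely used OpenAI-style tool schema (a "function" entry with a name, a description, and JSON Schema parameters); that format is an assumption and should be checked against the AIRouter documentation. The names get_weather, TOOL_REGISTRY, and dispatch_tool_call are hypothetical and exist only for this example.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Weather lookup requested for {city}"


# Assumption: OpenAI-style tool definition. AIRouter's expected format may differ.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. Beijing"},
                },
                "required": ["city"],
            },
        },
    }
]

# Map each tool name to the local function that implements it.
TOOL_REGISTRY = {"get_weather": get_weather}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the local function for a tool call described by a name and JSON arguments."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    kwargs = json.loads(arguments_json)
    return TOOL_REGISTRY[name](**kwargs)


# Simulate the kind of call a model might request for the prompt "Weather in Beijing".
result = dispatch_tool_call("get_weather", '{"city": "Beijing"}')

# With AIRouter installed and configured, the tools list would be passed as in step 2:
# LLM_Wrapper.function_calling(model_name="gpt4o_mini", prompt="Weather in Beijing", tools=tools)
```

How function_calling reports the model's chosen tool and arguments is not described in the answer, so the dispatch step here takes a plain name and a JSON string rather than any specific response object.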

Note: Make sure the selected model supports the feature you are using (e.g., GPT-4o supports multimodal input); otherwise an error will be returned.

This answer comes from the article "AIRouter: Intelligent Routing Tool for Calling Multiple Models with Unified API Interface".
