There are two main ways to get an AI response from Gemini-CLI-2-API:
Basic Chat Request
Send a POST request to the /v1/chat/completions endpoint, for example:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-key" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [
      {"role": "system", "content": "You are a translation assistant"},
      {"role": "user", "content": "Translate this Chinese sentence into English"}
    ]
  }'
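Because the service exposes an OpenAI-compatible API, the reply should come back as a standard chat.completion object. The sketch below is illustrative only; the id, message content, and token counts are placeholder values:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gemini-2.5-pro",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Please paste the Chinese sentence you would like translated."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 24, "completion_tokens": 12, "total_tokens": 36}
}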
Streaming Response Requests
To get the response in real time, set "stream": true:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-key" \
  -d '{
    "model": "gemini-2.5-pro",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a poem about spring"}
    ]
  }'
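With "stream": true, an OpenAI-compatible server returns Server-Sent Events: each data: line carries a chat.completion.chunk whose delta field holds the next slice of text, and the stream ends with data: [DONE]. Adding curl's -N (--no-buffer) flag prints chunks as they arrive. The chunk bodies below are placeholders, assuming the server mirrors OpenAI's streaming format:

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Spring"},"finish_reason":null}]}

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" rain whispers"},"finish_reason":null}]}

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]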
In addition, the list of supported models can be queried via the /v1/models endpoint (see the example below). Note that the request format follows the OpenAI API specification exactly, which makes it easy to integrate with existing tools.
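A minimal models query, assuming the same Bearer-key authentication as the examples above:

curl http://localhost:8000/v1/models \
  -H "Authorization: Bearer sk-your-key"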
This answer comes from the article Gemini-CLI-2-API: Converting the Gemini CLI to an OpenAI-compatible Native API Service.