Google recently announced that its first Gemini Embedding text model (gemini-embedding-001) is now generally available to developers through the Gemini API and Vertex AI. Since its experimental launch in March 2025, the model has consistently ranked at the top of the MTEB (Massive Text Embedding Benchmark) multilingual leaderboard, signaling superior performance in cross-domain, mission-critical tasks.
Significance of MTEB Performance Benchmarking
MTEB is an authoritative benchmark of the overall capabilities of text embedding models, covering a wide range of tasks from information retrieval to text classification. gemini-embedding-001's sustained leadership on this benchmark means the model does not merely excel in specific scenarios; it delivers a unified, robust semantic representation across diverse domains such as science, law, finance, and programming. According to reports published by Google, gemini-embedding-001 not only outperforms Google's own older models but also beats other commercially available models on several dimensions.
Core Technology: Matryoshka and Cost Flexibility
A core technique behind the model is Matryoshka Representation Learning (MRL), which allows developers to flexibly scale down the output dimension of the embedding vectors to fit actual needs; the default dimension is 3072.
This design is extremely developer friendly. In scenarios that demand the highest precision, such as exact matching of financial or legal documents, the full 3072 dimensions can be used. In more cost- and storage-sensitive applications, the output can be scaled down to 1536, 768, or even fewer dimensions to strike the best balance between quality and operating cost. This flexibility, combined with support for over 100 languages and an input limit of up to 2048 tokens, makes it a highly versatile model.
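As a minimal sketch of how dimension scaling might look in practice, assuming the google-genai Python SDK and its output_dimensionality option for gemini-embedding-001 (verify the parameter against the current API reference):

# Hedged sketch: requesting a reduced embedding dimension via the
# google-genai SDK (output_dimensionality assumed available for this model).
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

result = client.models.embed_content(
    model="gemini-embedding-001",
    contents="Quarterly revenue grew 12% year over year.",
    config=types.EmbedContentConfig(output_dimensionality=768),
)

[embedding] = result.embeddings
print(len(embedding.values))  # expected: 768 instead of the default 3072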
Developer Ecosystem and Pricing Strategy
Google provides developers with a clear access path and competitive pricing. gemini-embedding-001 is priced at $0.15 per 1 million input tokens, and free usage credits are available to facilitate experimentation and prototyping.
Developers can call the model through the Gemini API, and it remains compatible with the existing embed_content endpoint.
from google import genai

# The client picks up the GEMINI_API_KEY environment variable by default.
client = genai.Client()

# Embed a single string with the new model.
result = client.models.embed_content(
    model="gemini-embedding-001",
    contents="What is the meaning of life?",
)

# result.embeddings holds one embedding vector per input.
print(result.embeddings)
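The same call can embed several inputs in one request, which is convenient when indexing a corpus. A small, hedged sketch, assuming contents also accepts a list of strings as the SDK documents for embed_content:

from google import genai

client = genai.Client()

documents = [
    "Gemini Embedding supports more than 100 languages.",
    "The default output dimension is 3072.",
]

# One request, one embedding per input string.
result = client.models.embed_content(
    model="gemini-embedding-001",
    contents=documents,
)

for doc, emb in zip(documents, result.embeddings):
    print(doc[:40], "->", len(emb.values), "dimensions")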
Meanwhile, Google also announced a deprecation plan for its older models: embedding-001 will cease to be supported on August 14, 2025, and text-embedding-004 will be retired on January 14, 2026. Google explicitly recommends that developers migrate their projects to the latest gemini-embedding-001 model.
This move is not just a technical iteration; it is an important step for Google in solidifying its AI ecosystem. A robust, flexible, and cost-controlled embedding model is key to building advanced RAG (Retrieval-Augmented Generation) systems and other applications, and by providing developers with such a foundational tool, Google is strengthening its competitiveness in the AI infrastructure space. In addition, Google previewed upcoming support for a Batch API that processes data asynchronously at a much lower cost, which will further lower the barrier to large-scale adoption.
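To make the RAG connection concrete, here is a minimal, hedged sketch of the retrieval step: embed a small corpus and a query with gemini-embedding-001, then rank documents by cosine similarity. The helper name and corpus below are illustrative, not part of Google's announcement.

import numpy as np
from google import genai

client = genai.Client()

corpus = [
    "Matryoshka Representation Learning lets embeddings be truncated.",
    "The Batch API processes requests asynchronously at lower cost.",
    "gemini-embedding-001 supports inputs of up to 2048 tokens.",
]
query = "How can I shrink embedding vectors?"

def embed(texts):
    # Returns an (n, d) array with one embedding vector per input text.
    result = client.models.embed_content(
        model="gemini-embedding-001",
        contents=texts,
    )
    return np.array([e.values for e in result.embeddings])

doc_vecs = embed(corpus)
query_vec = embed([query])[0]

# Cosine similarity between the query and each document.
scores = doc_vecs @ query_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
)
print(corpus[int(np.argmax(scores))])  # best-matching passage for the query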