# Embeddings with Gemini

Generate text embeddings with Gemini's embedding model for semantic search, clustering, and retrieval-augmented generation (RAG) pipelines.
## Text Embeddings
Embeddings convert text into numerical vectors that capture semantic meaning. Similar texts produce vectors that are close in vector space, enabling semantic search, clustering, and retrieval-augmented generation (RAG).
## Gemini Embedding Models
| Model | Dimensions | Best For |
|---|---|---|
| text-embedding-004 | 768 | General text embedding, semantic search |
| embedding-001 | 768 | Legacy; superseded by text-embedding-004 |
## Task Types
Gemini embeddings support task-type hints that optimize the vector for specific use cases:
| Task Type | Use When |
|---|---|
| RETRIEVAL_DOCUMENT | Embedding documents to be stored in an index |
| RETRIEVAL_QUERY | Embedding queries against a document index |
| SEMANTIC_SIMILARITY | Comparing similarity between pairs of texts |
| CLASSIFICATION | Classifying texts into categories |
| CLUSTERING | Grouping similar texts together |
## Batch Embedding
For large datasets, use batchEmbedContents() to embed multiple texts in one API call — much more efficient than individual calls.
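Batch calls are subject to a per-request cap on the number of items (check the current API limits for the exact figure), so very large corpora must be split into multiple batch calls. A minimal sketch of a chunking helper (`chunk` is an illustrative name, not an SDK function):

```typescript
// Split an array into batches of at most `size` items, so each batch
// can be sent as one batchEmbedContents() call.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch can then be embedded with its own `batchEmbedContents()` call and the resulting vectors concatenated in order.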
## Cosine Similarity

The standard similarity metric for embedding vectors: 1.0 means the vectors point in the same direction (maximally similar), 0 means orthogonal (unrelated), and -1 means opposite.
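Concretely, for vectors **a** and **b** the metric is the dot product normalized by the vector magnitudes:

```latex
\cos(\theta) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i} a_i b_i}{\sqrt{\sum_{i} a_i^2} \; \sqrt{\sum_{i} b_i^2}}
```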
## Building a Semantic Search System
The typical flow for embedding-based search:

1. Embed each document with task type RETRIEVAL_DOCUMENT.
2. Store the vectors in a vector database (pgvector, ChromaDB, Pinecone).
3. At query time, embed the query with task type RETRIEVAL_QUERY.
4. Find the k nearest vectors using cosine similarity.
5. Return the matching documents as context for the LLM.
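In production, a vector database handles the nearest-neighbour lookup, but the ranking logic itself is simple. A minimal in-memory sketch (the helper names here are illustrative, not part of the SDK):

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored vectors by similarity to the query vector and
// return the indices of the k best matches.
function topK(query: number[], vectors: number[][], k: number): number[] {
  return vectors
    .map((vec, i) => ({ i, score: cosine(query, vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(x => x.i);
}
```

The returned indices map back to the original documents, which are then passed to the LLM as context.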
## Example
```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const embedModel = genAI.getGenerativeModel({ model: "text-embedding-004" });

// --- Single embedding ---
const docResult = await embedModel.embedContent({
  content: { parts: [{ text: "Gemini has a 1 million token context window." }], role: "user" },
  taskType: "RETRIEVAL_DOCUMENT",
});
console.log("Vector dimensions:", docResult.embedding.values.length); // 768

// --- Batch embedding for a document corpus ---
const documents = [
  "Gemini 1.5 Pro supports 1M token context",
  "Claude API uses the messages.create() method",
  "GPT-4o is OpenAI's multimodal flagship model",
  "Vector databases store embeddings for similarity search",
];
const batchResult = await embedModel.batchEmbedContents({
  requests: documents.map(text => ({
    content: { parts: [{ text }], role: "user" },
    taskType: "RETRIEVAL_DOCUMENT",
  })),
});
const vectors = batchResult.embeddings.map(e => e.values);

// --- Cosine similarity ---
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, v) => sum + v * v, 0));
  const magB = Math.sqrt(b.reduce((sum, v) => sum + v * v, 0));
  return dot / (magA * magB);
}

// --- Query ---
const queryResult = await embedModel.embedContent({
  content: { parts: [{ text: "What model has the largest context window?" }], role: "user" },
  taskType: "RETRIEVAL_QUERY",
});
const queryVec = queryResult.embedding.values;

const scores = vectors.map((vec, i) => ({
  document: documents[i],
  score: cosineSimilarity(queryVec, vec),
}));
scores.sort((a, b) => b.score - a.score);
console.log("Top result:", scores[0].document);
// "Gemini 1.5 Pro supports 1M token context"
```