Advanced Features

Embeddings with Gemini

Generate text embeddings with Gemini's embedding model for semantic search, clustering, and RAG pipelines.

Text Embeddings

Embeddings convert text into numerical vectors that capture semantic meaning. Similar texts produce vectors that are close in vector space, enabling semantic search, clustering, and retrieval-augmented generation (RAG).
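As an illustrative sketch of "close in vector space" (using tiny made-up 3-dimensional vectors — real Gemini embeddings have 768 dimensions), semantically similar texts should map to vectors with a small distance between them:

```typescript
// Hypothetical 3-dimensional "embeddings" for illustration only.
const catVec = [0.9, 0.1, 0.2];       // "a small cat"
const kittenVec = [0.85, 0.15, 0.25]; // "a tiny kitten"
const stockVec = [0.1, 0.9, 0.8];     // "quarterly stock report"

// Euclidean distance between two vectors of equal length.
function euclideanDistance(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0));
}

console.log(euclideanDistance(catVec, kittenVec)); // small: similar meaning
console.log(euclideanDistance(catVec, stockVec));  // large: unrelated meaning
```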

Gemini Embedding Models

| Model | Dimensions | Best For |
| --- | --- | --- |
| text-embedding-004 | 768 | General text embedding, semantic search |
| embedding-001 | 768 | Legacy; prefer text-embedding-004 |

Task Types

Gemini embeddings support task-type hints that optimize the vector for specific use cases:

| Task Type | Use When |
| --- | --- |
| RETRIEVAL_DOCUMENT | Embedding documents to be stored in an index |
| RETRIEVAL_QUERY | Embedding queries against a document index |
| SEMANTIC_SIMILARITY | Comparing similarity between pairs of texts |
| CLASSIFICATION | Classifying texts into categories |
| CLUSTERING | Grouping similar texts together |

Batch Embedding

For large datasets, use batchEmbedContents() to embed multiple texts in one API call — much more efficient than individual calls.
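Note that the API caps how many requests a single batch call may contain (assumed here to be 100 — check the current limits). A small generic helper, sketched below, splits a corpus into appropriately sized chunks before batching:

```typescript
// Split an array into chunks of at most `size` items, so each
// batchEmbedContents() call stays under the per-request limit.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

console.log(chunk([1, 2, 3, 4, 5], 2)); // [[1, 2], [3, 4], [5]]
```

Each chunk can then be passed to `batchEmbedContents()` in its own call, as shown in the example at the end of this section.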

Cosine Similarity

The standard similarity metric for embedding vectors. A value of 1.0 means the vectors point in the same direction (maximally similar), 0 means orthogonal (unrelated), and -1 means they point in opposite directions.
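A minimal TypeScript implementation, with the three reference points from above:

```typescript
// Cosine similarity: dot product divided by the product of magnitudes.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const magA = Math.hypot(...a);
  const magB = Math.hypot(...b);
  return dot / (magA * magB);
}

console.log(cosineSimilarity([1, 2, 3], [1, 2, 3])); // ≈ 1 (same direction)
console.log(cosineSimilarity([1, 0], [0, 1]));       // 0 (orthogonal)
console.log(cosineSimilarity([1, 2], [-1, -2]));     // ≈ -1 (opposite)
```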

Building a Semantic Search System

The typical flow for embedding-based search:

  1. Embed each document with task type RETRIEVAL_DOCUMENT
  2. Store vectors in a vector database (pgvector, ChromaDB, Pinecone)
  3. At query time, embed the query with task type RETRIEVAL_QUERY
  4. Find the k nearest vectors using cosine similarity
  5. Return the original documents as context for the LLM
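Steps 4–5 above can be sketched as a small in-memory top-k search. The documents and 2-dimensional vectors here are hypothetical stand-ins for a real index (real embeddings have 768 dimensions, and at scale a vector database would do this search):

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  return dot / (Math.hypot(...a) * Math.hypot(...b));
}

// Return the k documents whose vectors are most similar to the query vector.
function topK(
  queryVec: number[],
  docs: { text: string; vector: number[] }[],
  k: number,
): { text: string; score: number }[] {
  return docs
    .map(d => ({ text: d.text, score: cosineSimilarity(queryVec, d.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Toy 2-dimensional index:
const index = [
  { text: "doc about cats", vector: [1, 0] },
  { text: "doc about dogs", vector: [0.9, 0.1] },
  { text: "doc about tax law", vector: [0, 1] },
];

console.log(topK([1, 0.05], index, 2).map(r => r.text));
```

The top-k texts would then be passed to the LLM as retrieval context.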

Example

```typescript
import { GoogleGenerativeAI, TaskType } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const embedModel = genAI.getGenerativeModel({ model: "text-embedding-004" });

// --- Single embedding ---
const docResult = await embedModel.embedContent({
  content: { parts: [{ text: "Gemini has a 1 million token context window." }], role: "user" },
  taskType: TaskType.RETRIEVAL_DOCUMENT,
});
console.log("Vector dimensions:", docResult.embedding.values.length); // 768

// --- Batch embedding for a document corpus ---
const documents = [
  "Gemini 1.5 Pro supports 1M token context",
  "Claude API uses the messages.create() method",
  "GPT-4o is OpenAI's multimodal flagship model",
  "Vector databases store embeddings for similarity search",
];

const batchResult = await embedModel.batchEmbedContents({
  requests: documents.map(text => ({
    content: { parts: [{ text }], role: "user" },
    taskType: TaskType.RETRIEVAL_DOCUMENT,
  })),
});

const vectors = batchResult.embeddings.map(e => e.values);

// --- Cosine similarity ---
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, v) => sum + v * v, 0));
  const magB = Math.sqrt(b.reduce((sum, v) => sum + v * v, 0));
  return dot / (magA * magB);
}

// --- Query ---
const queryResult = await embedModel.embedContent({
  content: { parts: [{ text: "What model has the largest context window?" }], role: "user" },
  taskType: TaskType.RETRIEVAL_QUERY,
});

const queryVec = queryResult.embedding.values;
const scores = vectors.map((vec, i) => ({
  document: documents[i],
  score: cosineSimilarity(queryVec, vec),
}));

scores.sort((a, b) => b.score - a.score);
console.log("Top result:", scores[0].document);
// "Gemini 1.5 Pro supports 1M token context"
```