# Long Context: 1 Million Token Window
Leverage Gemini's 1M token context window to process entire codebases, books, and long documents in a single API call.
## The Context Window Advantage
Gemini 1.5 Pro and Flash support up to 1 million tokens of context — roughly 750,000 words, or an entire codebase. This is a genuine architectural differentiator. From our [model comparison article](/blog/claude-vs-gpt-vs-gemini):
> "Gemini's standout feature for coding is the massive context window — you can feed it an entire codebase (hundreds of files) and ask it to reason about cross-file dependencies."
## What 1 Million Tokens Enables
| Content | Approx. capacity at 1M tokens |
|---|---|
| Words of text | ~750,000 |
| Lines of code | ~30,000+ |
| Pages of text | ~1,500 |
| Hours of audio | ~9.5 |
| Minutes of video | ~60 |
| Pages of PDF | ~300 |
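You can sanity-check a payload against these budgets before sending it. A common rule of thumb for English text and code is roughly four characters per token; this is a heuristic for pre-flight checks, not an API guarantee, and the hypothetical helpers below are only a sketch:

```typescript
// Heuristic pre-flight check: ~4 characters per token for English
// text and code. This is an approximation only; for exact counts,
// call the SDK's model.countTokens() before sending a large payload.
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

export function fitsInContext(text: string, limit = 1_000_000): boolean {
  return estimateTokens(text) <= limit;
}
```

For the authoritative number, the Node SDK exposes `await model.countTokens(text)`, which returns the exact token count the API will bill for.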
## Key Use Cases
### Codebase Analysis
Feed an entire repository and ask: "Find all database calls that are missing error handling."
### Document QA
Load a full technical specification and ask precise questions without chunking or RAG.
### Long Conversation Memory
Maintain thousands of conversation turns in a single call without external memory management.
### Book/Report Summarization
Summarize entire books or research papers without losing cross-chapter context.
## Context Caching
For repeated queries against the same large document, Gemini offers context caching — upload the document once and reuse the cached version across multiple queries. This reduces cost and latency significantly.
## When to Use RAG Instead
For very large collections (many documents totaling more than 1M tokens), or for fresh/real-time data that must stay current, RAG remains the right architecture. Long context is best when the entire corpus fits in a single call.
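That decision rule can be made explicit in code. A hypothetical helper encoding it (the names and the boolean freshness flag are illustrative, not a library API):

```typescript
type Architecture = "long-context" | "rag";

// Prefer RAG when the corpus exceeds the context budget or must stay
// fresh; otherwise a single long-context call is the simpler design.
export function chooseArchitecture(
  totalCorpusTokens: number,
  requiresFreshData: boolean,
  contextLimit = 1_000_000
): Architecture {
  if (requiresFreshData) return "rag";
  if (totalCorpusTokens > contextLimit) return "rag";
  return "long-context";
}
```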
## Example
```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";
import * as fs from "fs";
import * as path from "path";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-pro" });

// Recursively load an entire codebase into a single string,
// skipping dependency and build directories.
function loadCodebase(dir: string, ext: string[] = [".ts", ".tsx", ".js"]): string {
  let combined = "";
  const entries = fs.readdirSync(dir, { withFileTypes: true });
  for (const entry of entries) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory() && !["node_modules", ".git", "dist"].includes(entry.name)) {
      combined += loadCodebase(fullPath, ext);
    } else if (entry.isFile() && ext.some((e) => entry.name.endsWith(e))) {
      combined += `\n// === ${fullPath} ===\n`;
      combined += fs.readFileSync(fullPath, "utf-8");
    }
  }
  return combined;
}

const codebase = loadCodebase("./src");

// Ask complex cross-file questions against the full codebase
const result = await model.generateContent([
  `Here is the complete source code of our application:\n\n${codebase}`,
  "Identify all functions that make database queries but do not handle errors. List them with file paths and line references.",
]);
console.log(result.response.text());

// Inspect token usage reported by the API
const usage = result.response.usageMetadata;
console.log(`Tokens used — prompt: ${usage?.promptTokenCount}, output: ${usage?.candidatesTokenCount}`);
```