Comparisons 13 min read February 12, 2025

Vector Databases Explained: Pinecone vs Weaviate vs ChromaDB

Compare the top vector databases for AI applications. Real performance benchmarks, pricing breakdown, and hands-on code examples for each platform.

DevForge Team

AI Development Educators


What is a Vector Database?

A vector database stores high-dimensional vectors — mathematical representations of semantic content — and enables fast similarity search over them.

When you embed a piece of text (or an image, audio, etc.) using a model like OpenAI's text-embedding-3-small, you get a vector of floating-point numbers: 1,536 of them for that model, fewer or more for others. This vector captures the *meaning* of the content, not just the literal text.

A vector database lets you find the most semantically similar vectors to a query vector — which is the foundation of RAG systems, semantic search, recommendation engines, and many other AI applications.
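To make "most semantically similar" concrete, here is a minimal sketch of what a similarity search computes, using cosine similarity and an exhaustive scan. The names (`cosineSimilarity`, `topK`, `toyStore`) and the three-dimensional toy vectors are invented for illustration. Real vector databases avoid scanning every vector by using approximate indexes, so treat this as the idea rather than how any of the products below are implemented.

```typescript
// Cosine similarity between two equal-length vectors (1 = same direction, 0 = unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface StoredVector {
  id: string;
  values: number[];
}

interface ScoredMatch {
  id: string;
  score: number;
}

// Exhaustive search: score every stored vector against the query, keep the best k.
function topK(query: number[], stored: StoredVector[], k: number): ScoredMatch[] {
  return stored
    .map((v) => ({ id: v.id, score: cosineSimilarity(query, v.values) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy 3-dimensional "embeddings"; real ones have hundreds or thousands of dimensions.
const toyStore: StoredVector[] = [
  { id: 'html-intro', values: [0.9, 0.1, 0.0] },
  { id: 'css-basics', values: [0.7, 0.7, 0.0] },
  { id: 'cooking-tips', values: [0.0, 0.1, 0.9] },
];

const nearest = topK([1, 0, 0], toyStore, 2);
console.log(nearest.map((m) => m.id)); // → ['html-intro', 'css-basics']
```

The scan costs one similarity computation per stored vector, which is fine for a few thousand vectors and is the cost that dedicated indexes are designed to avoid.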

The Contenders

We'll compare three popular options:

Pinecone — Managed, serverless, battle-tested
Weaviate — Open-source with managed cloud option
ChromaDB — Open-source, developer-friendly, great for local dev

Plus we'll mention pgvector (Postgres extension) as the pragmatic option many teams choose.

Pinecone

Pinecone was one of the first purpose-built vector databases to gain wide adoption. It's fully managed, offers a free tier, and is optimized for production use.

Setup and Indexing

typescript

import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

// Create an index
await pc.createIndex({
  name: 'devforge-docs',
  dimension: 1536,          // OpenAI text-embedding-3-small
  metric: 'cosine',
  spec: {
    serverless: {
      cloud: 'aws',
      region: 'us-east-1',
    }
  }
});

const index = pc.index('devforge-docs');

// Upsert vectors
await index.upsert([
  {
    id: 'doc-1',
    values: embedding,              // your embedding array
    metadata: {
      text: 'The original text chunk',
      source: 'tutorial-html.md',
      category: 'html',
    }
  }
]);
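The upsert above sends a single record. Ingestion jobs usually send many, and a common pattern is to split them into fixed-size batches. Below is a small sketch of that pattern. The helper name `toBatches` and the batch size of 100 are assumptions chosen for illustration, not documented Pinecone limits, so check the current request limits before settling on a number.

```typescript
// Split an array into consecutive batches of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size <= 0 || Math.floor(size) !== size) {
    throw new Error('Batch size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical record shape, mirroring the fields used in the upsert above.
interface VectorRecord {
  id: string;
  values: number[];
  metadata?: Record<string, string>;
}

// 250 placeholder records with tiny 2-dimensional vectors, just to show the split.
const records: VectorRecord[] = [];
for (let i = 0; i < 250; i++) {
  records.push({ id: `doc-${i}`, values: [i, i + 1] });
}

const batches = toBatches(records, 100); // 100 is an assumed batch size
console.log(batches.map((b) => b.length)); // → [100, 100, 50]

// With a real index, each batch would then be sent in turn, for example:
//   for (const batch of batches) { await index.upsert(batch); }
```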

Querying

typescript

const results = await index.query({
  vector: queryEmbedding,
  topK: 5,
  filter: { category: { $eq: 'html' } },  // metadata filtering
  includeMetadata: true,
});

for (const match of results.matches) {
  console.log(`Score: ${match.score?.toFixed(3)}`);
  console.log(`Text: ${match.metadata?.text}`);
}
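In a RAG pipeline, the matches are usually turned into a context string for the prompt rather than logged. The sketch below shows one way to do that, dropping weak matches first. The `QueryMatch` interface is a hypothetical shape that mirrors only the fields read in the loop above, and the 0.75 score cutoff is an arbitrary assumption, not a recommended value.

```typescript
// Hypothetical match shape: only the fields the loop above reads, plus `source`.
interface QueryMatch {
  id: string;
  score?: number;
  metadata?: { text?: string; source?: string };
}

// Keep matches at or above `minScore` that actually carry text, and number them.
function buildContext(matches: QueryMatch[], minScore: number): string {
  return matches
    .filter((m) => (m.score ?? 0) >= minScore && typeof m.metadata?.text === 'string')
    .map((m, i) => `[${i + 1}] (${m.metadata?.source ?? 'unknown source'}) ${m.metadata?.text}`)
    .join('\n');
}

const sampleMatches: QueryMatch[] = [
  { id: 'doc-1', score: 0.91, metadata: { text: 'The original text chunk', source: 'tutorial-html.md' } },
  { id: 'doc-2', score: 0.62, metadata: { text: 'A weakly related chunk', source: 'other.md' } },
  { id: 'doc-3', score: 0.88, metadata: { source: 'no-text.md' } }, // no text stored
];

const context = buildContext(sampleMatches, 0.75);
console.log(context); // → [1] (tutorial-html.md) The original text chunk
```

Storing the chunk text in metadata, as the upsert example does, is what makes this possible without a second lookup.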

Pinecone Verdict

Best for: Production applications needing reliability and scale
Pricing: Free tier (100K vectors, 1 index), then ~$0.096/hour per pod
Latency: P95 ~20ms for typical queries
Strengths: Zero ops, excellent documentation, namespace support
Weaknesses: Vendor lock-in, cost at scale, no SQL-style queries

Weaviate

Weaviate is open-source with a managed cloud option. Its killer feature is a GraphQL-style query API that's more expressive than what most vector databases offer.

Setup

typescript

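// Note: weaviate-ts-client is the v2 TypeScript client; the newer weaviate-client package uses a different API.
import weaviate, { WeaviateClient } from 'weaviate-ts-client';

const client: WeaviateClient = weaviate.client({
  scheme: 'https',
  host: 'your-cluster.weaviate.network',
  apiKey: new weaviate.ApiKey(process.env.WEAVIATE_API_KEY!),
  // text2vec-openai needs an OpenAI key to vectorize on your behalf
  headers: { 'X-OpenAI-Api-Key': process.env.OPENAI_API_KEY! },
});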

// Create a class (collection)
await client.schema.classCreator().withClass({
  class: 'Document',
  vectorizer: 'text2vec-openai',  // Weaviate can auto-vectorize!
  moduleConfig: {
    'text2vec-openai': {
      model: 'text-embedding-3-small',
    }
  },
  properties: [
    { name: 'content', dataType: ['text'] },
    { name: 'source', dataType: ['text'] },
    { name: 'category', dataType: ['text'] },
  ]
}).do();

Importing Data

typescript

// Weaviate can generate embeddings automatically
await client.data.creator()
  .withClassName('Document')
  .withProperties({
    content: 'HTML is the language of the web...',
    source: 'html-intro.md',
    category: 'html',
  })
  .do();

// Or import with pre-computed vectors
await client.data.creator()
  .withClassName('Document')
  .withProperties({ content: 'text here', source: 'source.md' })
  .withVector(precomputedEmbedding)
  .do();

Hybrid Search (Weaviate's Superpower)

typescript

const result = await client.graphql
  .get()
  .withClassName('Document')
  .withHybrid({
    query: 'how to use CSS flexbox',
    alpha: 0.75,  // 0 = pure keyword, 1 = pure vector
  })
  .withLimit(5)
  .withFields('content source category _additional { score }')
  .do();

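Hybrid search blends vector similarity with keyword (BM25) matching, which helps when a query contains exact terms such as a function name or an error code. It is Weaviate's standout feature.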
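To build intuition for what `alpha` does, here is a simplified sketch of weighted score fusion: each result set is rescaled to the 0 to 1 range, then blended with `alpha` as the weight on the vector side. This is an illustration under stated assumptions, not necessarily how Weaviate computes hybrid scores internally. The function names and sample scores are made up.

```typescript
type Scores = Record<string, number>;

// Rescale scores to [0, 1] so keyword and vector scores become comparable.
function minMaxNormalize(scores: Scores): Scores {
  const ids = Object.keys(scores);
  const values = ids.map((id) => scores[id]);
  const min = Math.min(...values);
  const max = Math.max(...values);
  const range = max - min;
  const normalized: Scores = {};
  for (const id of ids) {
    // If every score is identical, treat all results as equally good.
    normalized[id] = range === 0 ? 1 : (scores[id] - min) / range;
  }
  return normalized;
}

// alpha = 1 uses only vector scores, alpha = 0 uses only keyword scores.
function hybridScores(vector: Scores, keyword: Scores, alpha: number): Scores {
  if (alpha < 0 || alpha > 1) throw new Error('alpha must be between 0 and 1');
  const v = minMaxNormalize(vector);
  const k = minMaxNormalize(keyword);
  const combined: Scores = {};
  for (const id of Object.keys(v).concat(Object.keys(k))) {
    // A document missing from one result set contributes 0 from that side.
    combined[id] = alpha * (v[id] ?? 0) + (1 - alpha) * (k[id] ?? 0);
  }
  return combined;
}

// Document ids sorted from best to worst combined score.
function rank(scores: Scores): string[] {
  return Object.keys(scores).sort((x, y) => scores[y] - scores[x]);
}

const vectorScores: Scores = { a: 0.9, b: 0.8, c: 0.5 }; // e.g. cosine similarities
const keywordScores: Scores = { b: 12, c: 3, d: 7 };     // e.g. keyword relevance scores

console.log(rank(hybridScores(vectorScores, keywordScores, 0.75))); // → ['b', 'a', 'd', 'c']
```

Document `b` wins at `alpha: 0.75` because it scores well on both sides, while `a` only appears in the vector results. Setting `alpha` to 1 puts `a` back on top.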

Weaviate Verdict

Best for: Apps needing hybrid search, complex filtering, or self-hosting
Pricing: Free Sandbox, then $25/mo for small managed clusters
Strengths: Hybrid search, auto-vectorization, GraphQL API, self-hostable
Weaknesses: More complex setup, GraphQL learning curve

ChromaDB

ChromaDB is designed for simplicity. It's the easiest way to get a vector database running locally, making it perfect for development and smaller applications.

typescript

import { ChromaClient, OpenAIEmbeddingFunction } from 'chromadb';

// Connects to a running Chroma server (http://localhost:8000 by default)
const client = new ChromaClient();

const embedder = new OpenAIEmbeddingFunction({
  openai_api_key: process.env.OPENAI_API_KEY!,
  openai_model: 'text-embedding-3-small',
});

// Create or get a collection
const collection = await client.getOrCreateCollection({
  name: 'devforge-docs',
  embeddingFunction: embedder,
});

// Add documents (ChromaDB handles embeddings!)
await collection.add({
  ids: ['doc-1', 'doc-2', 'doc-3'],
  documents: [
    'HTML is the structure of web pages',
    'CSS styles HTML elements',
    'JavaScript adds interactivity',
  ],
  metadatas: [
    { subject: 'html', level: 'beginner' },
    { subject: 'css', level: 'beginner' },
    { subject: 'javascript', level: 'beginner' },
  ],
});

// Query
const results = await collection.query({
  queryTexts: ['how to make text bold on a webpage'],
  nResults: 3,
  where: { subject: 'html' },
});

console.log(results.documents);
console.log(results.distances);
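The logged `documents` and `distances` are nested: one inner array per query text, since a single call can carry several queries. The sketch below pairs each document with its distance for one query. `QueryResultLike` is a hypothetical, trimmed-down shape assumed for illustration, so compare it against the result type exported by your installed chromadb version.

```typescript
// Assumed, simplified result shape: outer array per query text, inner array per hit.
interface QueryResultLike {
  ids: string[][];
  documents: (string | null)[][];
  distances: number[][] | null;
}

interface Hit {
  id: string;
  document: string | null;
  distance: number | null;
}

// Flatten the parallel arrays for one query into a list of hit objects.
function hitsForQuery(result: QueryResultLike, queryIndex: number): Hit[] {
  const ids = result.ids[queryIndex] ?? [];
  return ids.map((id, i) => ({
    id,
    document: result.documents[queryIndex]?.[i] ?? null,
    distance: result.distances?.[queryIndex]?.[i] ?? null,
  }));
}

// Sample data with made-up distances, standing in for a real query result.
const sampleResult: QueryResultLike = {
  ids: [['doc-1', 'doc-2']],
  documents: [['HTML is the structure of web pages', 'CSS styles HTML elements']],
  distances: [[0.21, 0.48]],
};

for (const hit of hitsForQuery(sampleResult, 0)) {
  console.log(`${hit.id} (distance ${hit.distance}): ${hit.document}`);
}
```

Note that these are distances, so smaller means more similar, the opposite of the similarity scores in the Pinecone example.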

ChromaDB Verdict

Best for: Local development, prototyping, small applications
Pricing: Free and open source. ChromaDB Cloud available
Strengths: Incredibly simple API, works locally, great for dev
Weaknesses: Less battle-tested at scale, limited advanced features

pgvector: The Pragmatic Option

If you're already using PostgreSQL (or Supabase), pgvector is often the right choice:

sql

-- Enable extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Add embedding column to existing table
ALTER TABLE documents ADD COLUMN embedding vector(1536);

-- Create an approximate index (build it after loading data: ivfflat derives its clusters from existing rows)
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops);

-- Similarity search
SELECT *, 1 - (embedding <=> $1) AS similarity
FROM documents
WHERE 1 - (embedding <=> $1) > 0.7
ORDER BY embedding <=> $1
LIMIT 10;
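The `$1` parameter has to arrive in a form Postgres can read as a `vector`. pgvector accepts a bracketed text literal such as `[0.1,0.2,0.3]`, so from TypeScript one option is to serialize the embedding yourself, as sketched below. The helper names are invented for illustration. How the string is bound (and whether you need an explicit `$1::vector` cast) depends on your Postgres driver, which is left out here.

```typescript
// Serialize an embedding into pgvector's text format, e.g. "[0.1,-0.25,3]".
function toVectorLiteral(embedding: number[]): string {
  for (const value of embedding) {
    if (typeof value !== 'number' || !isFinite(value)) {
      throw new Error('Embedding contains a non-finite value');
    }
  }
  return `[${embedding.join(',')}]`;
}

// `<=>` returns cosine distance; the query above reports 1 - distance as similarity.
function distanceToSimilarity(cosineDistance: number): number {
  return 1 - cosineDistance;
}

console.log(toVectorLiteral([0.1, -0.25, 3])); // → [0.1,-0.25,3]
console.log(distanceToSimilarity(0.25));       // → 0.75
```

The second helper also explains the `> 0.7` filter in the SQL: it keeps rows whose cosine distance to the query is below 0.3.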

pgvector wins when: You're already on Postgres, want SQL-style queries, have existing data, or want to avoid a new service dependency.

Comparison Table

| Feature | Pinecone | Weaviate | ChromaDB | pgvector |
|---------|----------|----------|----------|---------|
| Self-hostable | No | Yes | Yes | Yes |
| Free tier | Yes | Yes | Yes | Yes |
| Managed cloud | Yes | Yes | Yes | Supabase |

My Recommendation

For a new production app: Supabase + pgvector. You probably already need Postgres for your application data, so adding vector search is just an extension. Supabase makes this straightforward and keeps the management overhead low.

For complex search requirements: Weaviate. Its hybrid search and filtering capabilities are genuinely superior.

For rapid prototyping: ChromaDB. Nothing is faster to get running.

For a pure vector workload at scale: Pinecone. When you need reliability and aren't on Postgres already, Pinecone's simplicity and battle-testing justify the cost.

The best vector database is the one that fits your existing stack. Don't add infrastructure complexity unless you have a specific reason to.
