Core API

Embeddings

Use OpenAI embeddings to convert text to vectors for semantic search and similarity.

What are Embeddings?

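An embedding is a vector (a list of floating-point numbers) that represents the meaning of a piece of text. Texts with similar meanings produce vectors that lie close together, so comparing two texts reduces to comparing their vectors.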

Use Cases

  • Semantic search: Find documents by meaning, not just keywords
  • Recommendation systems: Find similar items
  • Clustering: Group similar texts
  • Classification: Use embeddings as features for ML models
  • RAG (Retrieval-Augmented Generation): Find relevant context for LLMs

How to Use

  1. Convert your texts to embeddings using the API
  2. Store the embeddings (in a vector database or just in memory)
  3. When querying, embed the query with the same model and find the stored embeddings most similar to it
  4. Use cosine similarity to measure how similar two vectors are
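The steps above can be sketched as a small in-memory index that embeds each text once and reuses the stored vectors for every query. This is a minimal sketch, not an official pattern: `InMemoryIndex`, `normalize`, and the `embedFn` parameter are hypothetical names, and `embedFn` is assumed to be any async function that turns a string into an array of numbers (for example, a wrapper around the embeddings API). Vectors are scaled to length 1 when stored, so a plain dot product gives the cosine similarity at query time.

```javascript
// Hypothetical helper, not part of the OpenAI SDK: scales a vector to length 1.
function normalize(vector) {
  const magnitude = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  if (magnitude === 0) return vector.slice(); // avoid dividing by zero
  return vector.map(v => v / magnitude);
}

// Hypothetical in-memory index. `embedFn` is assumed to be an async function
// that maps a string to an array of numbers.
class InMemoryIndex {
  constructor(embedFn) {
    this.embedFn = embedFn;
    this.entries = []; // { text, vector } pairs; vectors are stored normalized
  }

  // Steps 1 and 2: embed the text once and keep the result.
  async add(text) {
    const vector = normalize(await this.embedFn(text));
    this.entries.push({ text, vector });
  }

  // Steps 3 and 4: embed the query, score every stored vector, return the top k.
  async search(query, k = 3) {
    const q = normalize(await this.embedFn(query));
    return this.entries
      .map(({ text, vector }) => ({
        text,
        // For unit-length vectors the dot product equals the cosine similarity.
        similarity: vector.reduce((sum, v, i) => sum + v * q[i], 0),
      }))
      .sort((a, b) => b.similarity - a.similarity)
      .slice(0, k);
  }
}
```

Scanning every entry is fine for a few thousand texts. Larger collections are where a vector database, as mentioned in step 2, becomes worthwhile.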

Example

javascript
import OpenAI from 'openai';

const openai = new OpenAI();  // reads the API key from the OPENAI_API_KEY environment variable

// Generate embeddings
async function embed(text) {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",  // or text-embedding-3-large
    input: text,
  });
  return response.data[0].embedding;  // array of 1536 floats by default for this model
}

// Cosine similarity
function cosineSimilarity(a, b) {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (magnitudeA * magnitudeB);
}

// Semantic search example
async function semanticSearch(query, documents) {
  const queryEmbedding = await embed(query);

  // Embed all documents
  const docEmbeddings = await Promise.all(
    documents.map(doc => embed(doc))
  );

  // Rank by similarity
  const similarities = docEmbeddings.map((docEmb, i) => ({
    document: documents[i],
    similarity: cosineSimilarity(queryEmbedding, docEmb),
  }));

  return similarities
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, 3);  // top 3 results
}

// Test
const docs = [
  "JavaScript is a programming language for web development",
  "Python is great for data science and machine learning",
  "React is a library for building user interfaces",
  "TensorFlow is used for deep learning",
  "CSS styles web pages",
];

const results = await semanticSearch("frontend web development", docs);
results.forEach(({ document, similarity }) => {
  console.log(`${similarity.toFixed(3)} - ${document}`);
});
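The example above sends one request per document. The embeddings endpoint also accepts an array of strings as `input`, so a batch of documents can be embedded in a single call. The sketch below shows one way to do that. `embedBatch` is a hypothetical helper name, and `client` is assumed to be an OpenAI client instance like `openai` in the example. Each item in the response carries an `index` field, which the helper uses to keep results aligned with the inputs.

```javascript
// Hypothetical helper: embeds many texts with a single request.
// `client` is assumed to be an OpenAI client instance, such as `openai` above.
async function embedBatch(client, texts, model = "text-embedding-3-small") {
  const response = await client.embeddings.create({ model, input: texts });
  // Sort by `index` defensively so that result[i] always corresponds to texts[i].
  return [...response.data]
    .sort((a, b) => a.index - b.index)
    .map(item => item.embedding);
}
```

In `semanticSearch`, the `Promise.all` block could then become `const docEmbeddings = await embedBatch(openai, documents);`. The API limits how much text one request may contain, so a very large collection would still need to be split into several batches.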