
RAG with LangChain

Build retrieval-augmented generation pipelines to answer questions from your documents.

LangChain makes RAG (Retrieval-Augmented Generation) easy with built-in components for each step:

  1. Load documents (PDF, web, CSV, etc.)
  2. Split into chunks
  3. Embed and store in a vector store
  4. Retrieve relevant chunks
  5. Generate answers with context

Key Components

  • Document Loaders: Load from 100+ sources (a loader sketch follows this list)
  • Text Splitters: Chunk documents intelligently
  • Embeddings: Convert text to vectors
  • Vector Stores: Store and search embeddings (Chroma, Pinecone, pgvector)
  • Retrievers: Query vector stores and filter results
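
The full example below fabricates its documents inline to stay self-contained; with a real source, the loading step looks roughly like this (the PDF path is hypothetical, and the pypdf package is required):

python
# pip install langchain-community pypdf
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("report.pdf")  # hypothetical local file
documents = loader.load()  # one Document per page, with source/page metadata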

Retrieval Strategies

  • Similarity search: Return the chunks whose embeddings are closest to the query
  • MMR (Maximal Marginal Relevance): Re-rank candidates to penalize redundancy and return diverse results (see the sketch after this list)
  • Self-query: An LLM converts the question into a structured query, including metadata filters
  • Contextual compression: Shrink retrieved documents down to the passages that are actually relevant to the query
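
These strategies map onto retriever configuration. A minimal sketch, assuming the vectorstore built in the example below:

python
# MMR: fetch fetch_k candidates, then keep the k that balance relevance
# against diversity (lambda_mult=1.0 is pure relevance, 0.0 is pure diversity)
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 3, "fetch_k": 10, "lambda_mult": 0.5},
)

# Similarity search with a score cutoff: chunks below the threshold are dropped
strict_retriever = vectorstore.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k": 3, "score_threshold": 0.5},
)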

Example

python
# pip install langchain-anthropic langchain-openai langchain-community langchain-text-splitters chromadb
from langchain_anthropic import ChatAnthropic
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document

# Sample documents (in practice, use document loaders)
documents = [
    Document(page_content="LangChain is a framework for building LLM applications. It supports Python and JavaScript.", metadata={"source": "docs"}),
    Document(page_content="LCEL (LangChain Expression Language) uses the pipe operator to chain components.", metadata={"source": "docs"}),
    Document(page_content="LangSmith is LangChain's observability platform for debugging and monitoring LLM apps.", metadata={"source": "docs"}),
    Document(page_content="LangChain supports 100+ LLM providers including OpenAI, Anthropic, and local models.", metadata={"source": "docs"}),
]

# Split documents
splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=50)
chunks = splitter.split_documents(documents)
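# Each sample document is shorter than chunk_size, so each stays a single
# chunk here; longer documents would be split with a 50-character overlap
print(f"Split {len(documents)} documents into {len(chunks)} chunks")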

# Create the vector store (OpenAIEmbeddings reads OPENAI_API_KEY from the environment)
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(chunks, embeddings)
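# Chroma runs in-memory by default; pass persist_directory="./chroma_db"
# to keep the index on disk between runs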

# Create retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3}
)

# RAG prompt
rag_prompt = ChatPromptTemplate.from_messages([
    ("system", """Answer the question based on the context below.
If you don't know the answer, say so.

Context:
{context}"""),
    ("human", "{question}")
])

llm = ChatAnthropic(model="claude-3-5-haiku-20241022")  # reads ANTHROPIC_API_KEY

# Join retrieved documents into a single context string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Build RAG chain
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)
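
# The dict literal is coerced to a RunnableParallel: the incoming question is
# sent to the retriever (then format_docs) to fill {context}, and passes
# through unchanged to fill {question}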

# Query
questions = [
    "What is LCEL?",
    "What observability tools does LangChain provide?",
    "How many LLM providers does LangChain support?",
]
for q in questions:
    print(f"Q: {q}")
    print(f"A: {rag_chain.invoke(q)}
")

# With sources
from langchain_core.runnables import RunnableParallel

# Note: retrieval runs twice here (once inside rag_chain, once for sources);
# fine for a demo
rag_with_sources = RunnableParallel(
    answer=rag_chain,
    sources=retriever,
)
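
# invoke returns a dict with both keys
result = rag_with_sources.invoke("What is LCEL?")
print(result["answer"])
print("Sources:", [doc.metadata["source"] for doc in result["sources"]])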