RAG with LangChain
Build retrieval-augmented generation pipelines to answer questions from your documents.
LangChain makes RAG (Retrieval-Augmented Generation) easy, with a built-in component for each step of the pipeline (a minimal loader sketch follows the list):
- Load documents (PDF, web, CSV, etc.)
- Split into chunks
- Embed and store in a vector store
- Retrieve relevant chunks
- Generate answers with context
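In practice the loading step comes from a document loader rather than hand-built `Document` objects. A minimal sketch, assuming `langchain-community`, `pypdf`, and `beautifulsoup4` are installed; `report.pdf` and the URL are placeholders:

```python
from langchain_community.document_loaders import PyPDFLoader, WebBaseLoader

# Each loader returns a list of Document objects with page_content + metadata.
pdf_docs = PyPDFLoader("report.pdf").load()              # placeholder path
web_docs = WebBaseLoader("https://example.com/").load()  # placeholder URL

print(len(pdf_docs), len(web_docs))
```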
Key Components
- Document Loaders: Load from 100+ sources
- Text Splitters: Chunk documents intelligently
- Embeddings: Convert text to vectors (a quick sketch follows this list)
- Vector Stores: Store and search embeddings (Chroma, Pinecone, pgvector)
- Retrievers: Query vector stores and filter results
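To make the embedding step concrete, you can embed a single string directly; this sketch assumes an `OPENAI_API_KEY` is set in the environment:

```python
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# embed_query returns one vector; embed_documents handles a batch.
vector = embeddings.embed_query("What is LangChain?")
print(len(vector))  # 1536 dimensions for text-embedding-3-small
```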
Retrieval Strategies
- Similarity search: return the chunks closest to the query embedding
- MMR (Maximum Marginal Relevance): balance relevance against diversity (see the sketch after this list)
- Self-query: an LLM turns the question into a structured query with metadata filters
- Contextual compression: trim retrieved docs to only the passages relevant to the query (also sketched below)
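The first two strategies are arguments to `as_retriever`, while contextual compression wraps an existing retriever. A sketch, reusing the `vectorstore` and `llm` defined in the example below:

```python
# MMR re-ranks fetch_k candidates for diversity; lambda_mult trades
# relevance (1.0) against diversity (0.0).
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 3, "fetch_k": 10, "lambda_mult": 0.5},
)

# Contextual compression: an LLM extracts only the passages relevant
# to the query from each retrieved document.
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

compression_retriever = ContextualCompressionRetriever(
    base_compressor=LLMChainExtractor.from_llm(llm),
    base_retriever=vectorstore.as_retriever(),
)
```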
Example
```python
# pip install langchain langchain-openai langchain-anthropic langchain-community langchain-text-splitters chromadb
from langchain_anthropic import ChatAnthropic
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document
# Sample documents (in practice, use document loaders)
documents = [
Document(page_content="LangChain is a framework for building LLM applications. It supports Python and JavaScript.", metadata={"source": "docs"}),
Document(page_content="LCEL (LangChain Expression Language) uses the pipe operator to chain components.", metadata={"source": "docs"}),
Document(page_content="LangSmith is LangChain's observability platform for debugging and monitoring LLM apps.", metadata={"source": "docs"}),
Document(page_content="LangChain supports 100+ LLM providers including OpenAI, Anthropic, and local models.", metadata={"source": "docs"}),
]
# Split documents
splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=50)
chunks = splitter.split_documents(documents)
# Create vector store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(chunks, embeddings)
# Create retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3},
)
# RAG prompt
rag_prompt = ChatPromptTemplate.from_messages([
("system", """Answer the question based on the context below.
If you don't know the answer, say so.
Context:
{context}"""),
("human", "{question}")
])
llm = ChatAnthropic(model="claude-3-5-haiku-20241022")
# Format documents helper
def format_docs(docs):
    # Join the retrieved chunks into a single context string
    return "\n\n".join(doc.page_content for doc in docs)
# Build RAG chain
rag_chain = (
{"context": retriever | format_docs, "question": RunnablePassthrough()}
| rag_prompt
| llm
| StrOutputParser()
)
# Query
questions = [
"What is LCEL?",
"What observability tools does LangChain provide?",
"How many LLM providers does LangChain support?",
]
for q in questions:
print(f"Q: {q}")
print(f"A: {rag_chain.invoke(q)}
")
# With sources
from langchain_core.runnables import RunnableParallel
rag_with_sources = RunnableParallel(
    answer=rag_chain,
    sources=retriever,
)
```
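Invoking the parallel runnable returns both the generated answer and the retrieved `Document` objects, so responses can cite their sources:

```python
result = rag_with_sources.invoke("What is LCEL?")
print(result["answer"])
for doc in result["sources"]:
    print("-", doc.metadata["source"], "->", doc.page_content[:60])
```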