Build a Multi-Document RAG System with Embeddings + Claude
So you've played with basic RAG (Retrieval-Augmented Generation), but now you're ready to level up. In this tutorial, you’ll learn how to build a smarter RAG system that can:
- Ingest multiple documents (PDF or TXT)
- Use embeddings for fast, relevant search
- Generate responses using Claude based on the most relevant content
This is perfect for anyone creating a smart knowledge assistant, internal wiki search, or custom Q&A bot.
What We'll Use
| Tool | Purpose |
|---|---|
| Python | Core logic |
| Streamlit | User interface |
| Claude API | Answer generation |
| OpenAI | Embeddings |
| FAISS | Efficient vector search |
| PyMuPDF | PDF text extraction |
Step 1: Install the Tools
```bash
pip install streamlit faiss-cpu openai python-dotenv PyMuPDF anthropic tiktoken
```
Create a .env file with your API keys:
```
OPENAI_API_KEY=your-openai-api-key
ANTHROPIC_API_KEY=your-claude-api-key
```
---
Step 2: Define Helper Functions
```python
import os
import fitz  # PyMuPDF
import faiss
import numpy as np
from dotenv import load_dotenv
from openai import OpenAI  # OpenAI 1.x client

load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Text splitter: overlapping word-based chunks
def split_text(text, chunk_size=500, overlap=100):
    tokens = text.split()
    chunks = []
    for i in range(0, len(tokens), chunk_size - overlap):
        chunk = " ".join(tokens[i:i + chunk_size])
        chunks.append(chunk)
    return chunks

# PDF extraction: pull the text out of every page
def extract_text_from_pdf(pdf_file):
    doc = fitz.open(stream=pdf_file.read(), filetype="pdf")
    return "\n".join(page.get_text() for page in doc)

# Get an embedding vector for a piece of text
def get_embedding(text):
    response = client.embeddings.create(
        input=[text],
        model="text-embedding-ada-002"
    )
    return response.data[0].embedding

# Build a FAISS index over all chunks, keeping the raw text as metadata
def build_index(chunks):
    dim = len(get_embedding("sample"))
    index = faiss.IndexFlatL2(dim)
    metadata = []
    vectors = []
    for chunk in chunks:
        embedding = get_embedding(chunk)
        vectors.append(embedding)
        metadata.append(chunk)
    index.add(np.array(vectors).astype("float32"))
    return index, metadata
```
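Before wiring up the UI, it can help to sanity-check these helpers in a plain Python session. This is just a quick sketch, not part of the app itself; it assumes the functions above are defined in the same file (or pasted into a REPL) and that your OpenAI key is loaded:

```python
import numpy as np

# Index a small piece of text and retrieve the chunks closest to a query
sample_text = "FAISS performs efficient similarity search over dense vectors. " * 40
chunks = split_text(sample_text, chunk_size=100, overlap=20)
index, metadata = build_index(chunks)

query_vec = np.array([get_embedding("What does FAISS do?")]).astype("float32")
scores, indices = index.search(query_vec, k=2)
for i in indices[0]:
    print(metadata[i][:120], "...")
```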
---
Step 3: Create the Streamlit App
```python
import os
import requests
import numpy as np
import streamlit as st

# Claude API call via the Messages endpoint
def ask_claude(question, context):
    prompt = f"""
You are a helpful assistant. Use the following context to answer the question:

Context:
{context}

Question:
{question}
"""
    headers = {
        "x-api-key": os.getenv("ANTHROPIC_API_KEY"),
        "anthropic-version": "2023-06-01",
        "Content-Type": "application/json"
    }
    body = {
        "model": "claude-3-haiku-20240307",
        "temperature": 0.6,
        "max_tokens": 800,
        "messages": [{"role": "user", "content": prompt}]
    }
    res = requests.post("https://api.anthropic.com/v1/messages", headers=headers, json=body)
    # The Messages API returns a list of content blocks; take the text of the first one
    return res.json()["content"][0]["text"]

st.title("Multi-Doc RAG System with Claude")

uploaded_files = st.file_uploader("Upload PDF or TXT files", type=["pdf", "txt"], accept_multiple_files=True)
query = st.text_input("What would you like to ask?")

if uploaded_files and query:
    # Read and chunk every uploaded document
    all_chunks = []
    for f in uploaded_files:
        if f.name.endswith(".pdf"):
            text = extract_text_from_pdf(f)
        else:
            text = f.read().decode("utf-8")
        chunks = split_text(text)
        all_chunks.extend(chunks)

    # Embed the chunks and find the ones closest to the query
    index, metadata = build_index(all_chunks)
    query_embedding = np.array([get_embedding(query)]).astype("float32")
    scores, indices = index.search(query_embedding, k=3)
    relevant_chunks = [metadata[i] for i in indices[0]]
    context = "\n---\n".join(relevant_chunks)

    # Ask Claude with the retrieved context
    response = ask_claude(query, context)
    st.subheader("Claude's Answer:")
    st.markdown(response)
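One practical tweak you may want: the app above re-embeds every chunk on each Streamlit rerun. If your uploads don't change between questions, you can cache the index. The sketch below is optional and `build_cached_index` is a hypothetical helper, not part of the app above; it just wraps `build_index` from Step 2:

```python
# Optional: cache the FAISS index so embeddings aren't recomputed on every rerun.
@st.cache_resource
def build_cached_index(chunk_tuple):
    # Streamlit hashes the arguments, so pass the chunks as a tuple
    return build_index(list(chunk_tuple))

# Usage inside the `if uploaded_files and query:` block:
# index, metadata = build_cached_index(tuple(all_chunks))
```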
---
Step 4: Run the App
Save the code from Steps 2 and 3 in a single app.py file, then run:
```bash
streamlit run app.py
```
Test It With:
- Company policies, SOPs, or handbooks
- Research papers or manuals
- Multiple PDFs of notes or project docs
Next-Level Ideas
- Show source filenames or page numbers in the response (see the sketch after this list)
- Add thumbnail previews of each document
- Save conversation history
- Add a "re-ask" button to refine questions
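For the first idea, here is one possible direction, a sketch rather than a drop-in change: it assumes you pass (chunk, filename) pairs into the index instead of bare chunks, and `build_index_with_sources` is a hypothetical variant of `build_index` from Step 2:

```python
# Sketch: keep the source filename alongside each chunk so answers can cite it.
def build_index_with_sources(chunk_source_pairs):
    dim = len(get_embedding("sample"))
    index = faiss.IndexFlatL2(dim)
    metadata = []
    vectors = []
    for chunk, source in chunk_source_pairs:
        vectors.append(get_embedding(chunk))
        metadata.append({"text": chunk, "source": source})
    index.add(np.array(vectors).astype("float32"))
    return index, metadata

# After retrieval, include the filename in the context sent to Claude:
# hits = [metadata[i] for i in indices[0]]
# context = "\n---\n".join(f"[{m['source']}]\n{m['text']}" for m in hits)
```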
✅ You Did It!
You now have a working multi-document RAG system using embeddings and Claude. This architecture is perfect for internal knowledge tools, smart wikis, chatbots, and assistants that “know your stuff.”
✨ Stay tuned for a follow-up tutorial: “How to Deploy Your Claude RAG Assistant to the Web”