A full-stack Retrieval-Augmented Generation (RAG) application that enables intelligent document querying and answer generation. Upload PDFs, build semantic search indexes, and get AI-powered answers grounded in your documents.
RAGify is a modern RAG prototype that combines document management, semantic search, and generative AI. It allows users to:
- Upload PDF documents through an intuitive web interface
- Index documents using embeddings for semantic similarity search
- Query documents with natural language questions
- Generate answers using Google Gemini, grounded in retrieved document content
The application consists of a FastAPI backend for document processing and retrieval, and a Next.js frontend for user interaction.
- 📄 PDF Upload & Processing: Extract and chunk text from PDF documents automatically
- 🔍 Semantic Search: Find relevant document excerpts using FAISS vector similarity
- 🤖 AI-Powered Answers: Generate contextual answers using Google Gemini API
- 💾 Persistent Indexing: FAISS indexes are saved and loaded from disk
- 🌐 Modern Web UI: Next.js frontend with responsive design using Tailwind CSS
- 📊 Index Statistics: Monitor indexed documents and embeddings metadata
- ⚡ REST API: Complete API for programmatic access to all features
- 🔐 CORS Support: Cross-origin requests enabled for frontend-backend communication
```
┌─────────────────────────────────────────────────────────────┐
│                     Frontend (Next.js)                      │
│         React UI for uploads, search, and results           │
└────────────────────────┬────────────────────────────────────┘
                         │ HTTP/REST
                         ↓
┌─────────────────────────────────────────────────────────────┐
│                      Backend (FastAPI)                      │
├─────────────────────────────────────────────────────────────┤
│ • PDF Ingestion & Text Extraction (pdfplumber)              │
│ • Text Chunking (LangChain RecursiveCharacterTextSplitter)  │
│ • Embeddings Generation (HuggingFace all-MiniLM-L6-v2)      │
│ • Vector Storage (FAISS)                                    │
│ • Semantic Search & Retrieval                               │
│ • Answer Generation (Google Gemini API)                     │
└─────────────────────────────────────────────────────────────┘
                         │
        ┌────────────────┼────────────────┐
        ↓                ↓                ↓
  uploaded_pdfs     faiss_index      Gemini API
     (PDFs)          (Indexes)      (Generation)
```
- Document Upload: User uploads PDF → Backend stores file & extracts text
- Indexing: Text is chunked → Each chunk is embedded → Embeddings stored in FAISS
- Search: User query is embedded → Similar chunks retrieved from FAISS
- Generation: Retrieved chunks + query → Gemini API generates answer
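The four steps above can be sketched end to end. This is a toy illustration, not the backend's actual code: a seeded random unit vector stands in for an all-MiniLM-L6-v2 embedding (384-dim in the real backend), a plain Python list with cosine scoring stands in for the FAISS index, and the final prompt string is what would be sent to the Gemini API.

```python
import zlib
import numpy as np

def embed(text, dim=16):
    # Stand-in for a real sentence embedding: a deterministic,
    # seeded random unit vector per input text.
    rng = np.random.default_rng(zlib.crc32(text.encode("utf-8")))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

index = []  # (vector, chunk_text, source); FAISS plays this role for real

def store(chunks, source):
    # Step 2 (Indexing): embed each chunk and add it to the index.
    for c in chunks:
        index.append((embed(c), c, source))

def search(query, top_k=5):
    # Step 3 (Search): embed the query and rank chunks by cosine
    # similarity (unit vectors, so the dot product is the cosine).
    q = embed(query)
    scored = sorted(((float(q @ v), c, s) for v, c, s in index), reverse=True)
    return scored[:top_k]

def build_prompt(query, matches):
    # Step 4 (Generation): retrieved chunks + query form the prompt
    # that grounds the model's answer.
    context = "\n\n".join(c for _, c, _ in matches)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {query}"

store(["RAG combines retrieval with generation.",
       "FAISS performs vector similarity search."], "document.pdf")
top = search("What is RAG?", top_k=1)
prompt = build_prompt("What is RAG?", top)
```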
- Python 3.9+ (recommended 3.10 or 3.11)
- Node.js 18+ with npm or yarn
- Google Gemini API Key (from Google AI Studio or Cloud Console)
- Git (optional, for cloning the repository)
- Disk Space: At least 1GB for dependencies and indexes
- RAM: Minimum 2GB (4GB+ recommended for larger document sets)
- Internet: Required for API calls and initial dependency downloads
- Create a Python virtual environment:

  ```bash
  cd backend
  python -m venv venv
  ```

- Activate the virtual environment:

  - Windows (CMD):

    ```cmd
    venv\Scripts\activate
    ```

  - Windows (PowerShell):

    ```powershell
    venv\Scripts\Activate.ps1
    ```

  - macOS/Linux:

    ```bash
    source venv/bin/activate
    ```

- Install Python dependencies:

  ```bash
  pip install -r requirements.txt
  ```

  This installs:

  - `fastapi`: web framework
  - `uvicorn`: ASGI server
  - `pdfplumber`: PDF text extraction
  - `langchain`: LLM framework
  - `faiss-cpu`: vector database
  - `sentence-transformers`: embeddings
  - `requests`: HTTP client
  - and additional supporting libraries
- Navigate to the frontend directory:

  ```bash
  cd frontend
  ```

- Install Node dependencies:

  ```bash
  npm install
  # or
  yarn install
  ```

  This installs Next.js, React, Tailwind CSS, and ESLint.
Environment variables can be set in a `.env` file in the `backend/` directory:

```bash
# Google Gemini API Configuration
GEMINI_API_KEY=your-api-key-here
GEMINI_MODEL=models/gemini-2.5-flash
LLM_PROVIDER=gemini

# FastAPI Settings (optional)
UPLOAD_DIR=uploaded_pdfs
FAISS_INDEX_DIR=faiss_index
```

To get a Gemini API key:

- Go to Google AI Studio
- Click "Get API key" → "Create API key in new project"
- Copy your API key
- Add it to `.env`:

  ```bash
  GEMINI_API_KEY=your-key-here
  ```
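How `main.py` might pick these variables up can be sketched with the standard library. This is an assumption about the implementation, not the project's actual code (the real backend may use python-dotenv or pydantic settings instead); the variable names and defaults mirror the configuration above.

```python
import os

# Documented defaults for the optional settings.
DEFAULTS = {
    "LLM_PROVIDER": "gemini",
    "UPLOAD_DIR": "uploaded_pdfs",
    "FAISS_INDEX_DIR": "faiss_index",
}

def load_settings(env=None):
    # Optional settings fall back to their defaults; the API key and
    # model name have no safe defaults and must come from .env.
    env = os.environ if env is None else env
    settings = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    settings["GEMINI_API_KEY"] = env.get("GEMINI_API_KEY", "")
    settings["GEMINI_MODEL"] = env.get("GEMINI_MODEL", "")
    return settings

# Example with an explicit mapping instead of the process environment:
settings = load_settings({"GEMINI_API_KEY": "demo-key", "UPLOAD_DIR": "pdfs"})
```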
Edit `backend/main.py` to adjust:

```python
CHUNK_SIZE = 500      # Characters per chunk
CHUNK_OVERLAP = 100   # Overlap between chunks
```

Smaller chunks: more precise retrieval but more API calls. Larger chunks: broader context but may contain irrelevant information.
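The effect of these two settings can be sketched with a plain character-window splitter. This is a simplification of LangChain's `RecursiveCharacterTextSplitter`, which additionally prefers to break at paragraph and sentence boundaries rather than at a fixed character count.

```python
CHUNK_SIZE = 500
CHUNK_OVERLAP = 100

def split_characters(text, size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    # Each chunk starts (size - overlap) characters after the previous
    # one, so consecutive chunks share `overlap` characters of context.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

text = "".join(str(i % 10) for i in range(1200))
chunks = split_characters(text)
# 1200 characters -> chunks of 500, 500, and 400 characters,
# with 100 characters repeated between neighbours.
```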
Default: `all-MiniLM-L6-v2` (384 dimensions, fast, efficient)

To change it in `backend/main.py`:

```python
EMBEDDING_MODEL = "all-mpnet-base-v2"  # Better quality, slower
```

The backend URL is configured in the Next.js app. Default: `http://localhost:8000`

To change the API endpoint, update the frontend API calls in:

- `frontend/app/page.js`: main chat interface
- `frontend/app/visualize/page.js`: visualization page (if it exists)
With the virtual environment activated:
```bash
cd backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

Expected output:

```
INFO: Uvicorn running on http://0.0.0.0:8000
INFO: Application startup complete
```
Endpoints available at: http://localhost:8000
Interactive API docs: http://localhost:8000/docs
In a new terminal:
```bash
cd frontend
npm run dev
# or
yarn dev
```

Expected output:

```
▲ Next.js 16.1.6
- Local: http://localhost:3000
```
UI available at: http://localhost:3000
- Open http://localhost:3000 in your browser
- Upload a PDF document
- Ask questions about the document
- Get AI-powered answers
Base URL: `http://localhost:8000`

GET `/`: Health check endpoint.

Response:

```json
{
  "message": "Welcome to the FastAPI backend!"
}
```

POST `/upload-pdf/`: Upload and process a PDF file (returns chunks for preview).
Request:
- Form data with file upload
Response:
```json
{
  "filename": "document.pdf",
  "content": "Full extracted text...",
  "chunks": [
    {
      "id": 0,
      "text": "Chunk content...",
      "length": 450
    }
  ],
  "total_chunks": 12,
  "chunk_settings": {
    "chunk_size": 500,
    "overlap": 100
  }
}
```

POST `/store-embeddings/`: Upload a PDF and store its embeddings in the FAISS index.
Request:
- Form data with file upload
Response:
```json
{
  "message": "Embeddings stored successfully",
  "filename": "document.pdf",
  "chunks_stored": 12,
  "embedding_model": "all-MiniLM-L6-v2"
}
```

POST `/search/`: Semantic search across indexed documents.
Request:
```json
{
  "query": "What is RAG?",
  "top_k": 5
}
```

Response:
```json
{
  "query": "What is RAG?",
  "results": [
    {
      "text": "Retrieval-Augmented Generation...",
      "source": "document.pdf",
      "similarity_score": 0.87
    }
  ],
  "total_results": 1
}
```

POST `/answer/`: Retrieve relevant chunks and generate an answer using Gemini.
Request:
```json
{
  "query": "What is RAG?",
  "top_k": 5
}
```

Response:
```json
{
  "answer": "RAG (Retrieval-Augmented Generation) is...",
  "matches": [
    {
      "text": "Relevant excerpt...",
      "source": "document.pdf",
      "score": 0.87
    }
  ]
}
```

GET `/index-stats/`: Get statistics about the FAISS index.
Response (when indexed):
```json
{
  "indexed": true,
  "total_documents": 45,
  "embedding_model": "all-MiniLM-L6-v2",
  "embedding_dimensions": 384
}
```

Response (when empty):
```json
{
  "indexed": false,
  "message": "No documents indexed yet"
}
```

```
RAGify/
├── README.md                 # This file
├── backend/
│   ├── main.py               # FastAPI application & route handlers
│   ├── requirements.txt      # Python dependencies
│   ├── uploaded_pdfs/        # Directory for uploaded PDF files
│   ├── faiss_index/          # Directory for FAISS indexes
│   │   ├── index.faiss       # Vector index (created on first upload)
│   │   └── index.pkl         # Metadata (created on first upload)
│   └── __pycache__/          # Python cache
├── frontend/
│   ├── app/
│   │   ├── layout.js         # Root layout
│   │   ├── page.js           # Main chat interface
│   │   ├── globals.css       # Global styles
│   │   ├── architecture/
│   │   │   └── page.js       # Architecture documentation page
│   │   └── visualize/        # Visualization (if implemented)
│   ├── public/               # Static assets
│   ├── package.json          # Node.js dependencies
│   ├── next.config.mjs       # Next.js configuration
│   ├── jsconfig.json         # JavaScript/TypeScript config
│   ├── postcss.config.mjs    # PostCSS configuration
│   ├── eslint.config.mjs     # ESLint rules
│   └── tailwind.config.js    # Tailwind CSS config
└── pages/                    # Additional pages
```
| Technology | Purpose | Version |
|---|---|---|
| FastAPI | Web framework & API | Latest |
| Uvicorn | ASGI server | Latest |
| pdfplumber | PDF text extraction | Latest |
| LangChain | LLM framework & utilities | Latest |
| FAISS | Vector similarity search | CPU version |
| Sentence-Transformers | Embeddings generation (384-dim) | Latest |
| Google Generative AI | Answer generation | Latest |
| Technology | Purpose | Version |
|---|---|---|
| Next.js | React framework | 16.1.6 |
| React | UI library | 19.2.3 |
| Tailwind CSS | Styling | 4.0 |
| ESLint | Code linting | 9.0 |
- Start the backend with auto-reload:

  ```bash
  cd backend
  source venv/bin/activate   # or venv\Scripts\activate on Windows
  uvicorn main:app --reload
  ```

- Start the frontend with hot-reload:

  ```bash
  cd frontend
  npm run dev
  ```
```bash
# Test health check
curl http://localhost:8000

# Upload PDF for preview
curl -X POST -F "file=@document.pdf" http://localhost:8000/upload-pdf/

# Store embeddings
curl -X POST -F "file=@document.pdf" http://localhost:8000/store-embeddings/

# Search
curl -X POST http://localhost:8000/search/ \
  -H "Content-Type: application/json" \
  -d '{"query": "What is RAG?", "top_k": 5}'

# Get answer
curl -X POST http://localhost:8000/answer/ \
  -H "Content-Type: application/json" \
  -d '{"query": "What is RAG?", "top_k": 5}'

# Get index stats
curl http://localhost:8000/index-stats/
```

- Multiple embedding models support
- Batch document processing
- Query result visualization
- Document metadata management
- Authentication & user sessions
- Support for other LLM providers (OpenAI, Claude, Llama)
- PDF visualization and highlighting
- Document versioning
- Advanced search filters
- Export results functionality