ConversePDF is an AI-powered document Q&A app that lets users upload PDFs and ask questions about them. It uses FastAPI for the backend, Streamlit for the UI, LlamaIndex to process documents and orchestrate RAG, Qdrant as a vector database to store document embeddings, and Inngest to handle background processing jobs asynchronously.


ConversePDF πŸ€–πŸ“„

AI-powered document Q&A application that lets users upload PDFs and ask questions about them using Retrieval-Augmented Generation (RAG)


🌟 Features

  • PDF Upload & Processing: Upload any PDF document for analysis
  • Intelligent Q&A: Ask natural language questions about your documents
  • Semantic Search: Retrieves relevant chunks using vector embeddings
  • RAG Architecture: Combines retrieval with Google Gemini LLM for accurate answers
  • Asynchronous Processing: Background jobs for efficient document processing
  • Vector Database: Persistent storage using Qdrant for embeddings
  • Modern UI: Clean, intuitive interface built with Streamlit

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Streamlit  β”‚  ← User Interface
β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
       β”‚
β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
β”‚   FastAPI   β”‚  ← Backend API
β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
       β”‚
   β”Œβ”€β”€β”€β”΄β”€β”€β”€β”
   β”‚       β”‚
β”Œβ”€β”€β–Όβ”€β”€β”€β” β”Œβ”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚Qdrantβ”‚ β”‚ LlamaIndexβ”‚  ← Vector DB & RAG
β””β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
           β”‚
      β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”
      β”‚ Gemini  β”‚  ← LLM (gemini-2.5-flash)
      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ› οΈ Tech Stack

Backend

  • FastAPI: RESTful API framework
  • LlamaIndex: RAG orchestration and document processing
  • Qdrant: Vector database for embeddings storage
  • Google Gemini 2.5 Flash: Large Language Model
  • PyMuPDF: PDF parsing and text extraction
  • uvloop: High-performance async event loop

Frontend

  • Streamlit: Interactive web interface
  • Requests: HTTP client for API communication

Infrastructure

  • Docker: Qdrant containerization
  • Python 3.12: Core language
  • Virtual Environment: Dependency isolation

πŸ“‹ Prerequisites

  • Python 3.12+
  • Docker & Docker Compose
  • Google Gemini API key (available from Google AI Studio)

πŸš€ Installation

1. Clone the Repository

git clone https://github.com/WillowsCosmic/ConversePDF.git
cd ConversePDF

2. Create Virtual Environment

python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

4. Set Up Environment Variables

Create a .env file in the root directory:

GEMINI_API_KEY=your_gemini_api_key_here
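The backend reads this key from the environment at startup. If you are not using a loader such as python-dotenv, a stdlib-only sketch of a .env reader looks like this (the function name load_env is illustrative, not part of the project):

```python
import os

def load_env(path: str = ".env") -> None:
    """Parse simple KEY=VALUE lines and export them as environment variables.

    Skips blank lines, comments, and lines without '='; existing environment
    variables are left untouched (setdefault).
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

After calling `load_env()`, the key is available via `os.getenv("GEMINI_API_KEY")`, which matches how data_loader.py reads it.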

5. Start Qdrant Vector Database

docker-compose up -d

This starts Qdrant on http://localhost:6333
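If you need to recreate or adjust the compose file, a minimal configuration for a persistent local Qdrant looks roughly like this (ports and volume path are assumptions matching the defaults above and the qdrant_storage/ directory in the project tree):

```yaml
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"   # REST API
      - "6334:6334"   # gRPC
    volumes:
      - ./qdrant_storage:/qdrant/storage
```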

πŸ’» Usage

Start the Backend Server

uvicorn main:app --reload --port 8000

The API will be available at http://localhost:8000

Start the Frontend

In a new terminal (with virtual environment activated):

streamlit run streamlit_app.py

The UI will open automatically at http://localhost:8501

Using the Application

  1. Upload a PDF: Click "Choose a PDF file" and select your document
  2. Wait for Processing: The system will chunk and embed your document
  3. Ask Questions: Type your question in the text input
  4. Adjust Retrieval: Use the slider to control how many chunks to retrieve (default: 5)
  5. Get Answers: Click "Ask" to receive AI-generated responses with source citations
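The question-asking flow above can also be scripted directly against the backend. A minimal standard-library sketch of a client for the POST /ask endpoint documented below (clamping top_k to 1-10 mirrors the UI slider and is a choice of this sketch, not an API requirement):

```python
import json
import urllib.request

API_URL = "http://localhost:8000"  # FastAPI backend from this README

def build_ask_payload(question: str, top_k: int = 5) -> dict:
    """Build the JSON body for POST /ask, clamping top_k to the UI's 1-10 range."""
    return {"question": question, "top_k": max(1, min(10, top_k))}

def ask(question: str, top_k: int = 5) -> dict:
    """POST a question to /ask and return the parsed JSON response."""
    body = json.dumps(build_ask_payload(question, top_k)).encode()
    req = urllib.request.Request(
        f"{API_URL}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

With the server running, `ask("What is an involute?")` returns a dict with the `answer` and `sources` fields shown in the API section below.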

πŸ“ Project Structure

ConversePDF/
β”œβ”€β”€ main.py                 # FastAPI backend application
β”œβ”€β”€ streamlit_app.py        # Streamlit frontend
β”œβ”€β”€ data_loader.py          # PDF processing & embedding logic
β”œβ”€β”€ vector_db.py            # Qdrant vector database operations
β”œβ”€β”€ custom_types.py         # Type definitions
β”œβ”€β”€ requirements.txt        # Python dependencies
β”œβ”€β”€ docker-compose.yml      # Qdrant container configuration
β”œβ”€β”€ .gitignore             # Git ignore rules
β”œβ”€β”€ .python-version        # Python version specification
β”œβ”€β”€ pyproject.toml         # Project metadata
β”œβ”€β”€ uv.lock                # Dependency lock file
β”œβ”€β”€ qdrant_storage/        # Qdrant data persistence
└── notes.md               # Development notes

πŸ”§ Configuration

Chunking Parameters

In data_loader.py, you can adjust:

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,        # Characters per chunk
    chunk_overlap=200,      # Overlap between chunks
    length_function=len,
)

Retrieval Settings

In streamlit_app.py, modify the slider range:

top_k = st.slider("How many chunks to retrieve", 1, 10, 5)

LLM Model

In data_loader.py, change the Gemini model:

client = Gemini(api_key=os.getenv("GEMINI_API_KEY"), model="gemini-2.5-flash")

πŸ§ͺ API Endpoints

POST /upload

Upload a PDF file for processing

Request: multipart form-data with a PDF file

Response:

{
  "message": "PDF uploaded successfully",
  "filename": "document.pdf",
  "job_id": "abc123"
}

POST /ask

Ask a question about the uploaded PDF

Request:

{
  "question": "What is an involute?",
  "top_k": 5
}

Response:

{
  "answer": "An involute is a curve traced by...",
  "sources": ["/path/to/document.pdf"]
}

πŸ› Troubleshooting

Qdrant Connection Issues

# Check if Qdrant is running
docker ps

# Restart Qdrant
docker-compose down
docker-compose up -d

Import Errors

# Reinstall dependencies
pip install --upgrade -r requirements.txt

API Key Errors

Ensure your .env file exists and contains:

GEMINI_API_KEY=your_actual_key


πŸ“š Key Technologies Explained

RAG (Retrieval-Augmented Generation)

Combines document retrieval with LLM generation:

  1. Chunk: Split documents into manageable pieces
  2. Embed: Convert chunks to vector representations
  3. Store: Save embeddings in vector database
  4. Retrieve: Find most relevant chunks for user query
  5. Generate: LLM creates answer using retrieved context
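The five steps above can be sketched end-to-end in a few lines. This is purely illustrative: letter-frequency vectors stand in for a real embedding model, and the final LLM generation step is omitted.

```python
import math

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Step 1: split text into overlapping character chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str) -> list[float]:
    """Step 2: toy embedding — a normalized letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query: str, store: list[tuple[str, list[float]]], top_k: int = 2) -> list[str]:
    """Step 4: rank stored chunks by cosine similarity to the query embedding."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: -sum(a * b for a, b in zip(q, item[1])))
    return [text for text, _ in ranked[:top_k]]

# Step 3: store (text, embedding) pairs — Qdrant plays this role in ConversePDF.
docs = chunk("An involute is a curve traced by unwinding a taut string from a circle.")
store = [(c, embed(c)) for c in docs]

# Step 5 would pass these retrieved chunks to the LLM as context for the answer.
context = retrieve("involute curve", store)
```

In the real pipeline, LlamaIndex handles chunking and querying, an embedding model replaces the letter-frequency vectors, and Gemini generates the answer from the retrieved context.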

Vector Database (Qdrant)

Stores document embeddings and enables semantic search based on similarity rather than keyword matching.

LlamaIndex

Orchestrates the RAG pipeline, handling document loading, chunking, embedding, and querying.

πŸ™ Acknowledgments

  • Built as a learning project to understand RAG architecture
  • Powered by Google's Gemini LLM
  • Uses open-source tools: FastAPI, Streamlit, LlamaIndex, Qdrant
