An AI-powered document Q&A application that lets users upload PDFs and ask questions about them using Retrieval-Augmented Generation (RAG).
- PDF Upload & Processing: Upload any PDF document for analysis
- Intelligent Q&A: Ask natural language questions about your documents
- Semantic Search: Retrieves relevant chunks using vector embeddings
- RAG Architecture: Combines retrieval with Google Gemini LLM for accurate answers
- Asynchronous Processing: Background jobs for efficient document processing
- Vector Database: Persistent storage using Qdrant for embeddings
- Modern UI: Clean, intuitive interface built with Streamlit
```
┌─────────────┐
│  Streamlit  │  ← User Interface
└──────┬──────┘
       │
┌──────▼──────┐
│   FastAPI   │  ← Backend API
└──────┬──────┘
       │
   ┌───┴────┐
   │        │
┌──▼───┐  ┌─▼─────────┐
│Qdrant│  │ LlamaIndex│  ← Vector DB & RAG
└──────┘  └─┬─────────┘
            │
       ┌────▼────┐
       │ Gemini  │  ← LLM (gemini-2.5-flash)
       └─────────┘
```
- FastAPI: RESTful API framework
- LlamaIndex: RAG orchestration and document processing
- Qdrant: Vector database for embeddings storage
- Google Gemini 2.5 Flash: Large Language Model
- PyMuPDF: PDF parsing and text extraction
- uvloop: High-performance async event loop
- Streamlit: Interactive web interface
- Requests: HTTP client for API communication
- Docker: Qdrant containerization
- Python 3.12: Core language
- Virtual Environment: Dependency isolation
- Python 3.12+
- Docker & Docker Compose
- Google Gemini API key (available from Google AI Studio)
```bash
git clone https://github.com/WillowsCosmic/ConversePDF.git
cd ConversePDF
```

```bash
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

Create a .env file in the root directory:

```
GEMINI_API_KEY=your_gemini_api_key_here
```

```bash
docker-compose up -d
```

This starts Qdrant on http://localhost:6333

```bash
uvicorn main:app --reload --port 8000
```

The API will be available at http://localhost:8000
In a new terminal (with virtual environment activated):
```bash
streamlit run streamlit_app.py
```

The UI will open automatically at http://localhost:8501
- Upload a PDF: Click "Choose a PDF file" and select your document
- Wait for Processing: The system will chunk and embed your document
- Ask Questions: Type your question in the text input
- Adjust Retrieval: Use the slider to control how many chunks to retrieve (default: 5)
- Get Answers: Click "Ask" to receive AI-generated responses with source citations
```
ConversePDF/
├── main.py              # FastAPI backend application
├── streamlit_app.py     # Streamlit frontend
├── data_loader.py       # PDF processing & embedding logic
├── vector_db.py         # Qdrant vector database operations
├── custom_types.py      # Type definitions
├── requirements.txt     # Python dependencies
├── docker-compose.yml   # Qdrant container configuration
├── .gitignore           # Git ignore rules
├── .python-version      # Python version specification
├── pyproject.toml       # Project metadata
├── uv.lock              # Dependency lock file
├── qdrant_storage/      # Qdrant data persistence
└── notes.md             # Development notes
```
In data_loader.py, you can adjust the chunking parameters:
```python
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,      # Characters per chunk
    chunk_overlap=200,    # Overlap between chunks
    length_function=len,
)
```

In streamlit_app.py, modify the slider range:
```python
top_k = st.slider("How many chunks to retrieve", 1, 10, 5)
```

In data_loader.py, change the Gemini model:
```python
client = Gemini(api_key=os.getenv("GEMINI_API_KEY"), model="gemini-2.5-flash")
```

Upload a PDF file for processing.
Request: multipart form-data with a PDF file.

Response:
```json
{
  "message": "PDF uploaded successfully",
  "filename": "document.pdf",
  "job_id": "abc123"
}
```

Ask a question about the uploaded PDF.
Request:
```json
{
  "question": "What is an involute?",
  "top_k": 5
}
```

Response:
```json
{
  "answer": "An involute is a curve traced by...",
  "sources": ["/path/to/document.pdf"]
}
```
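For reference, here is a minimal sketch of calling the backend directly with the requests library, the same HTTP client the Streamlit UI uses. The /upload and /ask paths and the "file" form field are assumptions for illustration only; check the route decorators in main.py for the actual endpoint names.

```python
import requests

API_URL = "http://localhost:8000"  # FastAPI backend started earlier

# Upload a PDF (hypothetical /upload route and "file" field name; see main.py)
with open("document.pdf", "rb") as f:
    upload = requests.post(f"{API_URL}/upload", files={"file": f})
print(upload.json())  # e.g. {"message": "PDF uploaded successfully", ...}

# Ask a question about the uploaded document (hypothetical /ask route)
ask = requests.post(
    f"{API_URL}/ask",
    json={"question": "What is an involute?", "top_k": 5},
)
print(ask.json()["answer"])
```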
```bash
# Check if Qdrant is running
docker ps

# Restart Qdrant
docker-compose down
docker-compose up -d
```

```bash
# Reinstall dependencies
pip install --upgrade -r requirements.txt
```

Ensure your .env file exists and contains:
```
GEMINI_API_KEY=your_actual_key
```
Combines document retrieval with LLM generation (see the sketch after this list):
- Chunk: Split documents into manageable pieces
- Embed: Convert chunks to vector representations
- Store: Save embeddings in vector database
- Retrieve: Find most relevant chunks for user query
- Generate: LLM creates answer using retrieved context
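A minimal sketch of this flow using LlamaIndex with Qdrant and Gemini is shown below. The collection name, file name, and embedding model are placeholder assumptions; the project's own pipeline lives in data_loader.py and may differ in its details.

```python
import os

import qdrant_client
from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.llms.gemini import Gemini
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Configure Gemini for answer generation and embeddings (model names assumed)
Settings.llm = Gemini(api_key=os.getenv("GEMINI_API_KEY"), model="gemini-2.5-flash")
Settings.embed_model = GeminiEmbedding(api_key=os.getenv("GEMINI_API_KEY"))

# 1-2. Chunk & Embed: load the PDF and split it into embedded nodes
documents = SimpleDirectoryReader(input_files=["document.pdf"]).load_data()

# 3. Store: persist the embeddings in a Qdrant collection
client = qdrant_client.QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="converse_pdf")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# 4-5. Retrieve & Generate: fetch the top-k chunks and answer with the LLM
query_engine = index.as_query_engine(similarity_top_k=5)
print(query_engine.query("What is an involute?"))
```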
Stores document embeddings and enables semantic search based on similarity rather than keyword matching.
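For intuition, a raw similarity query against Qdrant looks roughly like the sketch below (the collection name and vector size are placeholders, and the real pipeline queries through LlamaIndex rather than calling Qdrant directly):

```python
from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")

# The query vector would be the embedding of the user's question, produced by
# the same embedding model used at indexing time (dummy 768-dim vector here).
query_vector = [0.0] * 768

hits = client.search(
    collection_name="converse_pdf",  # placeholder collection name
    query_vector=query_vector,
    limit=5,                         # top-k most similar chunks
)
for hit in hits:
    print(hit.score, hit.payload)
```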
Orchestrates the RAG pipeline, handling document loading, chunking, embedding, and querying.
- Built as a learning project to understand RAG architecture
- Powered by Google's Gemini LLM
- Uses open-source tools: FastAPI, Streamlit, LlamaIndex, Qdrant