πŸ“„ RAG System with Query Expansion (Ollama + ChromaDB)

This project is a Retrieval-Augmented Generation (RAG) application built with Streamlit, Ollama, and ChromaDB.
It allows users to upload PDF documents, store them as embeddings, expand user queries using an LLM, and generate citation-backed answers.


πŸš€ Features

  • πŸ“‚ Upload and process PDF documents
  • βœ‚οΈ Chunking with RecursiveCharacterTextSplitter
  • πŸ”Ž Query Expansion using LLM
  • πŸ“š Vector storage with ChromaDB (persistent)
  • πŸ€– Answer generation with citations
  • 🧠 Local LLM inference using Ollama
  • πŸ–₯️ Interactive Streamlit UI

πŸ› οΈ Tech Stack

  • Python
  • Streamlit
  • LangChain
  • Ollama
  • ChromaDB
  • all-MiniLM embeddings
  • Gemma / LLaMA models

πŸ“‚ Project Structure

.
β”œβ”€β”€ app.py
β”œβ”€β”€ data/ # Uploaded PDFs
β”œβ”€β”€ chromadb/ # Persistent vector store
β”œβ”€β”€ requirements.txt
└── README.md

βš™οΈ Prerequisites

1️⃣ Install Ollama

Download and install Ollama:
πŸ‘‰ https://ollama.com

Pull required models:

ollama pull gemma3
ollama pull all-minilm
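
You can confirm both models are available with:

ollama list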

2️⃣ Install dependencies

pip install -r requirements.txt

▢️ How to Run the App

streamlit run app.py
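
Streamlit serves the app locally, at http://localhost:8501 by default.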

πŸ§ͺ How to Use

1. Upload PDF files from the sidebar
2. Click Initialize System
3. Click Process Documents
4. Enter a question
5. Click Search
6. View answers with citations and sources
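
Under the hood, the Search step follows the standard RAG query path: expand the query with the LLM, retrieve matching chunks from ChromaDB, then generate a grounded answer. A minimal sketch of that flow, assuming the same `ollama` and `chromadb` clients as above; the prompts, function name, and collection name are illustrative, not the app's actual code:

```python
# Illustrative query sketch -- prompts and names are assumptions.
import chromadb
import ollama

def answer_question(question: str, n_results: int = 5) -> str:
    # 1. Query expansion: ask the local LLM for alternative phrasings.
    expansion = ollama.chat(
        model="gemma3",
        messages=[{"role": "user",
                   "content": f"Rewrite this question three different ways:\n{question}"}],
    )["message"]["content"]

    # 2. Retrieve chunks for the original and expanded queries.
    client = chromadb.PersistentClient(path="chromadb")
    collection = client.get_or_create_collection("pdf_chunks")
    queries = [question] + expansion.splitlines()
    docs, metas = [], []
    for q in (q for q in queries if q.strip()):
        emb = ollama.embeddings(model="all-minilm", prompt=q)["embedding"]
        hits = collection.query(query_embeddings=[emb], n_results=n_results)
        docs += hits["documents"][0]
        metas += hits["metadatas"][0]

    # 3. Generate a citation-backed answer from the retrieved context.
    context = "\n\n".join(f"[{m['source']} #{m['chunk']}] {d}"
                          for d, m in zip(docs, metas))
    answer = ollama.chat(
        model="gemma3",
        messages=[{"role": "user",
                   "content": f"Answer using only this context and cite the "
                              f"[source] tags.\n\n{context}\n\nQuestion: {question}"}],
    )
    return answer["message"]["content"]
```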
