pdf-tutor 🎓

A RAG-based generative AI tutor for interactive document assistance. Ask questions about your documents, research papers, or technical manuals and get answers generated locally on your own machine.

Features

Context-Aware Learning (RAG): Retrieves the most relevant passages from the provided document through a vector database and grounds each answer in them.
Vector Persistence: Caches document embeddings locally using FAISS, so later runs on the same document skip re-embedding and start much faster.
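
The caching idea above can be sketched as a "load if present, otherwise build and save" helper. This README does not show how main.py implements it, so treat the following as a minimal stdlib-only illustration: `load_or_build` and `build_index` are hypothetical names, a JSON file stands in for the FAISS index, and keying the cache on a hash of the PDF contents is an assumption rather than the project's confirmed behavior.

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path("vector_db_cache")  # directory name taken from this README


def load_or_build(pdf_path, build_index, cache_dir=CACHE_DIR):
    """Return (index, was_cached).

    `build_index` is a hypothetical callable that does the expensive work
    (parse, chunk, embed) and returns JSON-serializable data.
    """
    pdf_path = Path(pdf_path)
    # Assumption: key the cache on file contents so an edited PDF is rebuilt.
    digest = hashlib.sha256(pdf_path.read_bytes()).hexdigest()[:16]
    cache_file = Path(cache_dir) / f"{pdf_path.stem}-{digest}.json"

    if cache_file.exists():
        # Cache hit: skip the expensive embedding step entirely.
        return json.loads(cache_file.read_text()), True

    index = build_index(pdf_path)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(index))
    return index, False
```

The first call for a given PDF pays the full embedding cost; every later call with unchanged file contents only reads the cached file.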

Tech Stack

LLM: Ollama (running Phi-3)
Orchestration: LangChain
Vector DB: FAISS
Embeddings: all-MiniLM-L6-v2 (HuggingFace Transformers)
Document Parsing: PyPDF

Local Setup

1. Prerequisites

Python 3
Ollama installed on your machine.

2. Install AI Models

Open your terminal and pull the model you want to use. I currently use phi3 because it is fast on my MacBook with 8GB of RAM. You can pick any other model available on Ollama, for example Llama 3.1; if you do, replace the model name in main.py with 'llama3.1':

ollama pull phi3

# Or pull a different model
ollama pull the_model_name_of_your_choice

3. Clone and Install Dependencies

git clone https://github.com/kbismark/pdf-tutor.git
cd pdf-tutor

# Create a virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install required packages
pip install -r requirements.txt

4. Folder Structure

Ensure your directory looks like this:

pdf-tutor/
├── files/
│   └── your_document_name.pdf  <-- Place your PDF here. The demo file "sql.pdf" is already in this folder.
├── vector_db_cache/       <-- Generated automatically on first run
├── main.py
└── requirements.txt

Usage

Place the PDF you want to learn from inside the files/ folder.
Open main.py and update the PDF_PATH variable with your filename.
Run the tutor:

python3 main.py
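
A mistyped filename is the most likely thing to go wrong in the steps above, so a small guard before the pipeline starts can save a confusing stack trace. The sketch below is only a suggestion: `check_pdf` is a hypothetical helper, and the example value of `PDF_PATH` assumes the variable holds a path relative to the repository root, which this README does not confirm.

```python
from pathlib import Path

# Assumption: PDF_PATH is a path relative to the repo root.
# "sql.pdf" is the demo file mentioned in this README.
PDF_PATH = "files/sql.pdf"


def check_pdf(path):
    """Return the path as a Path object, or raise a clear error."""
    p = Path(path)
    if p.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {p.name}")
    if not p.is_file():
        raise FileNotFoundError(f"No such file: {p} (is it inside files/?)")
    return p
```

Calling `check_pdf(PDF_PATH)` at the top of a script fails fast with a readable message instead of an error from deep inside the PDF loader.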

How it works under the hood:

Ingestion: The app loads your PDF and splits it into chunks of 1000 characters.
Embedding: Each chunk is converted into a 384-dimensional vector using the all-MiniLM-L6-v2 embedding model.
Storage: Vectors are saved locally to vector_db_cache so they can be reloaded without re-embedding the document.
Retrieval: When you ask a question, the system embeds it and finds the sections of the PDF whose vectors are most similar.
Generation: The AI tutor receives the question plus the retrieved text and streams a structured response.
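
The steps above can be sketched end to end in plain Python. This is a toy illustration under stated assumptions, not the code in main.py: `toy_embed` is a hypothetical word-hashing stand-in for all-MiniLM-L6-v2 (it only matches the 384-dimension size), a sorted list stands in for the FAISS search, the 100-character overlap is an assumed value, and the prompt wording is invented for the example.

```python
import hashlib
import math

DIM = 384  # matches the vector size quoted above


def chunk_text(text, size=1000, overlap=100):
    """Split text into windows of `size` characters.

    The overlap between neighboring chunks is an assumption; the README
    only states the 1000-character chunk size.
    """
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def toy_embed(text):
    """Hypothetical stand-in for a real embedding model.

    Hashes each word into one of DIM buckets and normalizes to unit length.
    """
    vec = [0.0] * DIM
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question.

    Vectors are unit length, so the dot product equals cosine similarity.
    """
    q = toy_embed(question)
    scored = [(sum(a * b for a, b in zip(q, toy_embed(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(question, context_chunks):
    """Combine retrieved text and the question into one prompt string."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In the real pipeline the embedding model captures meaning rather than exact word overlap, and FAISS replaces the linear scan in `retrieve`, but the data flow (chunk, embed, rank, prompt) is the same.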
