Interact with PDF documents using natural language! This project leverages local large language models (LLMs) and embedding-based vector search to answer questions about PDF files efficiently and privately.
- Ask natural language questions about the content of your PDFs.
- Local inference using `llama3:8b` via Ollama, so no data leaves your machine.
- Fast and lightweight vector search with `DocArrayInMemorySearch`.
- Embeddings powered by `nomic-embed-text` for semantic understanding.
- LLM: `llama3:8b` via Ollama
- PDF Loader: `PyPDFLoader` from LangChain
- Embeddings: `nomic-embed-text`
- Vector Store: `DocArrayInMemorySearch`
- Framework: Python + LangChain
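To make the vector store's role concrete: at its core, `DocArrayInMemorySearch` performs nearest-neighbor search over embedding vectors. The dependency-free toy below illustrates just that idea — term-frequency vectors stand in for real `nomic-embed-text` embeddings, and a sorted scan stands in for the actual store; it is an illustration, not the project's code.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in embedding: a term-frequency vector.
    # (The real project uses nomic-embed-text via Ollama.)
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    # Rank chunks by similarity to the query and return the top k —
    # the same idea a vector store implements efficiently.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "The mitochondria is the powerhouse of the cell.",
    "LangChain wires loaders, embeddings, and vector stores together.",
    "Ollama runs large language models locally.",
]
print(retrieve("Which tool runs models locally?", chunks))
```

The retrieved chunk is what gets handed to the LLM as context, which is why embedding quality directly affects answer quality.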
 
- Python 3.10+
- Ollama installed and running
- `llama3` model pulled via Ollama
- Required Python packages installed (see `requirements.txt` or the instructions in the Usage section)
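A likely setup sequence, assuming the standard Ollama CLI and pip (the model names match the stack listed above; exact `requirements.txt` contents are the project's own):

```shell
# One-time setup (Ollama must already be installed and running)
ollama pull llama3:8b          # the chat model
ollama pull nomic-embed-text   # the embedding model

# Install the Python dependencies
pip install -r requirements.txt
```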
- Load a PDF document using `PyPDFLoader`.
- Generate embeddings with `nomic-embed-text`.
- Store and search using `DocArrayInMemorySearch`.
- Query using `llama3:8b` for context-aware responses.
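The steps above can be wired together roughly as follows. This is a hedged sketch, not the project's actual source: it assumes the `langchain`, `langchain-community`, and `pypdf` packages plus a running Ollama server, and the import paths follow recent `langchain_community` releases, so they may differ in your installed version.

```python
def build_chain(pdf_path: str):
    """Wire loader -> embeddings -> vector store -> LLM into a Q&A callable.

    Sketch only: imports are local so the module loads even without the
    optional dependencies installed.
    """
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.vectorstores import DocArrayInMemorySearch
    from langchain_community.chat_models import ChatOllama
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser

    # 1. Load and chunk the PDF.
    docs = PyPDFLoader(pdf_path).load_and_split()

    # 2.-3. Embed the chunks and index them in the in-memory vector store.
    embeddings = OllamaEmbeddings(model="nomic-embed-text")
    store = DocArrayInMemorySearch.from_documents(docs, embeddings)
    retriever = store.as_retriever()

    # 4. Answer questions with llama3:8b, grounded in retrieved context.
    prompt = ChatPromptTemplate.from_template(
        "Answer using only this context:\n{context}\n\nQuestion: {question}"
    )
    llm = ChatOllama(model="llama3:8b")

    def answer(question: str) -> str:
        context = "\n\n".join(d.page_content for d in retriever.invoke(question))
        chain = prompt | llm | StrOutputParser()
        return chain.invoke({"context": context, "question": question})

    return answer
```

Usage would look like `ask = build_chain("my.pdf"); print(ask("What is this document about?"))`, with Ollama serving both models locally.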
- Add a simple web UI using Streamlit or Gradio
- Enable support for querying multiple PDFs
- Add a persistent vector store option (e.g., FAISS or Chroma)
- Improve context retention and memory in conversations
 
This project is licensed under the MIT License. See the LICENSE file for details.