A Streamlit-based chatbot powered by Retrieval-Augmented Generation (RAG) and OpenAI. Upload your PDFs and chat with them! This app leverages LangChain, FAISS, and OpenAI’s GPT models to extract and query document content with metadata-aware answers.
- 🔍 Upload multiple PDFs and query across all of them
 - 📄 Metadata-rich answers with filename and page references
 - 🧠 Uses LangChain + FAISS for semantic search
 - 🤖 Streamlit Chat UI for natural conversation
 - 💾 OpenAI API support with streaming responses
 
.
├── .gitignore
├── LICENSE
├── README.md             # ← You're reading it
├── app.py                # Main Streamlit app
├── brain.py              # PDF parsing and vector index logic
├── compare medium.gif    # Optional UI illustration
├── requirements.txt      # Python dependencies
└── thumbnail.webp        # Preview image
git clone https://github.com/aimaster-dev/chatbot-using-rag-and-langchain.git
cd chatbot-using-rag-and-langchain
pip install -r requirements.txt

Create a .streamlit/secrets.toml file with:

OPENAI_API_KEY = "your-openai-key"

Or export it via an environment variable:

export OPENAI_API_KEY="your-openai-key"

streamlit run app.py

- Upload PDFs via the UI
 - Each PDF is parsed using PyPDF2 and chunked via LangChain’s RecursiveCharacterTextSplitter
 - Chunks are embedded using OpenAI Embeddings
 - Stored in a FAISS vector store for semantic similarity search
 - Queries are matched to the most similar PDF chunks, which are passed to the OpenAI GPT model as context
 - Answers include file name and page number metadata for citation
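
The steps above can be sketched with the standard library alone. This is a rough illustration, not the code in brain.py: a fixed-size overlapping character window stands in for RecursiveCharacterTextSplitter, a toy bag-of-words vector stands in for OpenAI Embeddings, and a linear cosine-similarity scan stands in for FAISS. The names `Chunk`, `chunk_pages`, `embed`, `cosine`, and `top_chunks` are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    filename: str
    page: int  # 1-based page number, kept so answers can cite it


def chunk_pages(pages, filename, chunk_size=200, overlap=40):
    """Split each page's text into overlapping character windows.

    Assumption: a simplified stand-in for a real text splitter.
    """
    chunks = []
    step = chunk_size - overlap
    for page_number, text in enumerate(pages, start=1):
        for start in range(0, len(text), step):
            piece = text[start:start + chunk_size].strip()
            if piece:
                chunks.append(Chunk(piece, filename, page_number))
    return chunks


def embed(text):
    """Toy bag-of-words vector; a real app would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(query, chunks, k=3):
    """Return the k chunks most similar to the query (linear scan)."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine(query_vec, embed(chunk.text)),
        reverse=True,
    )
    return ranked[:k]


# Made-up page texts, one string per PDF page.
pages = [
    "The introduction outlines the goals of the project.",
    "Installation requires Python and a handful of packages.",
]
chunks = chunk_pages(pages, "example.pdf")
best = top_chunks("What does the introduction say?", chunks, k=1)[0]
print(best.filename, best.page)
```

Keeping the file name and page number on every chunk is what makes the citation step possible later: whatever is retrieved already carries its source.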
 
- Streamlit – UI framework
 - LangChain – PDF chunking and retrieval
 - FAISS – Vector search backend
 - OpenAI GPT – LLM-based answer generation
 - PyPDF2 – PDF parsing
 
"What are the main points from the introduction?"
Answer: The introduction highlights... (example.pdf, page 1)
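
One way the citation in an answer like this could be produced is by tagging each retrieved chunk with its source before it goes into the prompt. The sketch below is an assumption about how that might look, not the app's actual prompt: `format_citation` and `build_prompt` are hypothetical helpers, the instruction wording is invented, and the chunk text is made up for illustration.

```python
def format_citation(filename, page):
    """Render a source reference in the style shown in the example answer."""
    return f"({filename}, page {page})"


def build_prompt(question, retrieved):
    """Assemble a prompt from (text, filename, page) tuples and a question."""
    context_lines = [
        f"{text} {format_citation(filename, page)}"
        for text, filename, page in retrieved
    ]
    context = "\n".join(context_lines)
    return (
        "Answer the question using only the context below. "
        "Cite the file name and page for each claim.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical retrieved chunk; real chunks would come from the vector store.
retrieved = [("The introduction highlights the project goals.", "example.pdf", 1)]
prompt = build_prompt("What are the main points from the introduction?", retrieved)
print(prompt)
```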
This project is licensed under the MIT License.
Made with ❤️ by aimaster-dev. Contributions welcome!