A Retrieval-Augmented Generation (RAG) chatbot designed to serve as an expert knowledge worker for tech companies. The project provides accurate, cost-effective question answering by combining document retrieval with large language models.
The RAG pipeline builds a chatbot that answers questions from your company's knowledge base: vector embeddings retrieve the most relevant documents, and the LLM uses them as context to produce grounded, accurate answers.
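In sketch form, retrieval grounds generation roughly like this (a minimal illustration, assuming a `vectorstore` and `llm` configured as described later in this README; the question string is a stand-in):

```python
# Conceptual RAG flow: retrieve relevant chunks, then answer from them only.
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})  # fetch top-4 chunks
docs = retriever.get_relevant_documents("What benefits do employees get?")
context = "\n\n".join(doc.page_content for doc in docs)
reply = llm.invoke(f"Answer only from this context:\n{context}\n\nQuestion: What benefits do employees get?")
print(reply.content)
```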
- Document Processing: Automatically processes documents from your knowledge base
- Vector Storage: Creates and manages vector embeddings for semantic search
- Interactive Chat: Web-based chat interface using Gradio
- Visualization: 2D/3D vector space visualization for understanding document relationships
- Conversation Memory: Maintains context across conversation turns
- Flexible Architecture: Supports multiple embedding and vector database options
- Python 3.8+
- OpenAI API key (or alternatives as described below)
- Clone the repository:
```bash
git clone https://github.com/Sameeh07/RAG_Chatbot.git
cd RAG_Chatbot
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
- Set up environment variables:
```bash
# Create .env file
echo "OPENAI_API_KEY=your_openai_api_key_here" > .env
```
- Create your knowledge base structure:
```
knowledge-base/
├── company/
│   └── *.md files
├── products/
│   └── *.md files
├── employees/
│   └── *.md files
└── contracts/
    └── *.md files
```
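The notebook reads these folders at startup. Below is a minimal sketch of that loading step, assuming the layout above; the `doc_type` metadata key is illustrative, showing how documents can be tagged for the color-coded visualizations mentioned later:

```python
import glob
import os
from dotenv import load_dotenv
from langchain.document_loaders import DirectoryLoader, TextLoader

load_dotenv()  # reads OPENAI_API_KEY from the .env file created above

documents = []
for folder in glob.glob("knowledge-base/*"):
    doc_type = os.path.basename(folder)  # "company", "products", ...
    loader = DirectoryLoader(folder, glob="**/*.md", loader_cls=TextLoader)
    for doc in loader.load():
        doc.metadata["doc_type"] = doc_type  # used later for color coding
        documents.append(doc)
```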
- Open the Jupyter notebook:
```bash
jupyter notebook RAG.ipynb
```
- Run all cells to:
  - Load and process your documents
  - Create vector embeddings
  - Set up the conversation chain
  - Launch the Gradio chat interface
- Access the chat interface in your browser and start asking questions!
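For orientation, the Gradio wiring amounts to a few lines. This is a hedged sketch, assuming a `conversation_chain` like the one shown in the architecture section below; the exact cell contents in `RAG.ipynb` may differ:

```python
import gradio as gr

# Pass the user's message through the retrieval chain and return the answer.
# Gradio manages the visible history; the chain keeps its own memory.
def chat(message, history):
    result = conversation_chain.invoke({"question": message})
    return result["answer"]

gr.ChatInterface(chat).launch(inbrowser=True)
```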
Documents → Text Splitting → Embeddings → Vector Store → Retrieval → LLM → Response
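A condensed sketch of that flow, assuming `documents` have already been loaded and using the same legacy LangChain imports that appear elsewhere in this README:

```python
from langchain.text_splitter import CharacterTextSplitter
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_chroma import Chroma

# Documents -> chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)

# Chunks -> embeddings -> vector store
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings(), persist_directory="vector_db")

# Retriever + memory + LLM -> conversational chain
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
conversation_chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(temperature=0.7, model_name="gpt-4o-mini"),
    retriever=vectorstore.as_retriever(),
    memory=memory,
)
```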
The system supports multiple embedding options:

```python
# Option 1: OpenAI embeddings (default)
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
```

```python
# Option 2: Free sentence-transformers model via Hugging Face
from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
```

```python
# Option 3: BERT-based model via Hugging Face
from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="bert-base-uncased")
```

```python
# Option 4: Local model via llama.cpp
from langchain.embeddings import LlamaCppEmbeddings

embeddings = LlamaCppEmbeddings(model_path="path/to/your/model.bin")
```

Choose from multiple vector database options:
```python
# Option 1: Chroma (default; persists to a local directory)
from langchain_chroma import Chroma

vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="vector_db"
)
```

```python
# Option 2: FAISS (fast, in-memory similarity search)
from langchain.vectorstores import FAISS

vectorstore = FAISS.from_documents(chunks, embeddings)
# Save for persistence
vectorstore.save_local("faiss_index")
```

```python
# Option 3: Pinecone (managed cloud vector database)
import pinecone
from langchain.vectorstores import Pinecone

pinecone.init(api_key="your_pinecone_api_key", environment="your_env")
vectorstore = Pinecone.from_documents(chunks, embeddings, index_name="your_index")
```
For the chat model itself, the system works with OpenAI's API or with local models served through Ollama's OpenAI-compatible endpoint:

```python
# Option 1: OpenAI
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0.7, model_name="gpt-4o-mini")
```

```python
# Option 2: Local model via Ollama (OpenAI-compatible endpoint)
llm = ChatOpenAI(
    temperature=0.7,
    model_name='llama3.2',
    base_url='http://localhost:11434/v1',
    api_key='ollama'
)
```

The notebook includes visualization capabilities to understand how your documents are embedded in vector space:
- 2D Visualization: t-SNE reduction for document clustering analysis
- 3D Visualization: Enhanced spatial understanding of document relationships
- Color Coding: Documents grouped by type for easy identification
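As an illustration, the 2D case can be reproduced along these lines, assuming a Chroma store built as above (`_collection` is a private attribute, so treat this as an inspection trick rather than a stable API; the notebook's actual plotting code may differ):

```python
import numpy as np
import plotly.express as px
from sklearn.manifold import TSNE

# Pull the raw vectors and metadata back out of the Chroma collection
data = vectorstore._collection.get(include=["embeddings", "metadatas"])
vectors = np.array(data["embeddings"])
doc_types = [m["doc_type"] for m in data["metadatas"]]

# Reduce to 2D with t-SNE and color points by document type
reduced = TSNE(n_components=2, random_state=42).fit_transform(vectors)
fig = px.scatter(x=reduced[:, 0], y=reduced[:, 1], color=doc_types,
                 title="2D projection of the document vector space")
fig.show()
```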
```txt
langchain
langchain-openai
langchain-chroma
chromadb
gradio
python-dotenv
numpy
matplotlib
plotly
scikit-learn
```
- Embedding Models: Sentence-transformers models are faster but may be less accurate than OpenAI
- Vector Databases: FAISS is faster for large datasets; Pinecone offers cloud scalability
- Chunk Size: Balance between context richness (larger chunks) and precision (smaller chunks)
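For reference, that trade-off is set through the splitter parameters; the values below are illustrative defaults, not tuned recommendations:

```python
from langchain.text_splitter import CharacterTextSplitter

# Larger chunk_size -> more context per retrieved passage;
# smaller chunk_size -> more precise matches.
# chunk_overlap preserves continuity across chunk boundaries.
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
```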
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- LangChain for the RAG framework
- Chroma for the vector database
- OpenAI for embeddings and language models
- Gradio for the chat interface
Note: This RAG chatbot is designed for internal company use. Ensure your knowledge base doesn't contain sensitive information that shouldn't be processed by external APIs when using cloud-based models.