A comprehensive Retrieval-Augmented Generation (RAG) chatbot built with Google's Gemini API, featuring advanced ML algorithms, semantic caching, document reranking, and a beautiful Streamlit interface.
- Multi-format Document Support: PDF, DOCX, TXT, MD, HTML, CSV, JSON, XML
- Intelligent Document Processing: Semantic chunking, metadata extraction, quality scoring
- Advanced Vector Storage: ChromaDB, FAISS, and hybrid storage options
- Semantic Search: High-quality embeddings with similarity search
- Query Analysis & Classification: Intent detection, sentiment analysis, complexity scoring
- Query Expansion: Automatic query enhancement using T5 and synonym expansion
- Document Reranking: Cross-encoder models for improved relevance
- Semantic Caching: Intelligent caching based on semantic similarity
- Performance Monitoring: Real-time metrics and optimization recommendations
- Multi-model Support: Gemini Pro, with extensibility for other LLMs
- Streaming Responses: Real-time response generation
- Error Handling: Comprehensive error handling and recovery
- Health Monitoring: System health checks and diagnostics
- Analytics Dashboard: Performance visualization and insights
- Modern Streamlit UI: Clean, responsive design with dark/light themes
- Chat Interface: Interactive chat with message history
- Document Management: Easy file upload and processing
- Real-time Analytics: Performance metrics and visualizations
- System Health Dashboard: Component status and diagnostics
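The semantic caching feature above can be sketched as follows. This is a simplified illustration, not the project's actual API: the `SemanticCache` class, the `0.9` threshold, and the toy embeddings are all assumptions; the real cache persists to SQLite and uses sentence-transformer embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Toy semantic cache: return a stored response when a new
    query embedding is similar enough to a cached one."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def get(self, embedding):
        best, best_sim = None, 0.0
        for emb, response in self.entries:
            sim = cosine(embedding, emb)
            if sim >= self.threshold and sim > best_sim:
                best, best_sim = response, sim
        return best  # None on a cache miss

    def put(self, embedding, response):
        self.entries.append((embedding, response))

cache = SemanticCache(threshold=0.9)
cache.put([1.0, 0.0, 0.1], "cached answer")
hit = cache.get([0.99, 0.01, 0.12])   # near-duplicate query -> hit
miss = cache.get([0.0, 1.0, 0.0])     # unrelated query -> miss
```

The key design point is that the lookup key is an embedding, not the raw query string, so paraphrased questions can reuse an earlier answer.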
advanced-rag-chatbot/
├── requirements.txt       # Python dependencies
├── config.py              # Configuration management
├── vector_store.py        # Vector database implementations
├── ml_utils.py            # ML algorithms and utilities
├── document_processor.py  # Document processing pipeline
├── rag_engine.py          # Main RAG engine
├── streamlit_app.py       # Streamlit web interface
├── main.py                # Entry point and CLI
├── .env.example           # Environment variables template
├── README.md              # This file
├── data/                  # Data storage
│   ├── chromadb/          # ChromaDB persistence
│   ├── faiss_index/       # FAISS indices
│   └── semantic_cache.db  # SQLite cache
├── models/                # ML model storage
├── logs/                  # Application logs
└── temp_uploads/          # Temporary file storage
- Python 3.8+
- Google Gemini API key
- 8GB+ RAM recommended
- 2GB+ disk space
git clone https://github.com/yourusername/advanced-rag-chatbot.git
cd advanced-rag-chatbot

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

pip install -r requirements.txt

# Download spaCy model
python -m spacy download en_core_web_sm

# Download NLTK data (automatic on first run)
python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords'); nltk.download('wordnet')"

cp .env.example .env
# Edit .env and add your API keys

python main.py --setup

- Gemini API Key: Get from Google AI Studio
- Optional Keys: OpenAI, Cohere, Anthropic, HuggingFace for extended functionality
Edit the .env file with your configuration:
# Required
GEMINI_API_KEY=your_gemini_api_key_here
# Optional for enhanced features
OPENAI_API_KEY=your_openai_key
HUGGINGFACE_TOKEN=your_hf_token
# Performance tuning
CHUNK_SIZE=512
SIMILARITY_THRESHOLD=0.7
ENABLE_RERANKING=true
ENABLE_CACHING=true

python main.py --mode web

Then open http://localhost:8501 in your browser.
# CLI mode
python main.py --mode cli

# Run self-tests
python main.py --mode test

# Or launch Streamlit directly
streamlit run streamlit_app.py

- Enter your Gemini API key in the sidebar
- Configure model parameters (temperature, max tokens, etc.)
- Enable desired ML features
- Click "Initialize RAG Engine"
- Go to the "Documents" tab
- Upload files (PDF, DOCX, TXT, etc.)
- Click "Process and Add Documents"
- Wait for processing to complete
- Go to the "Chat" tab
- Ask questions about your documents
- View sources and confidence scores
- Explore response metadata
- Check the "Analytics" tab for performance metrics
- View response time trends and confidence distributions
- Monitor system health in the "System Health" tab
- Semantic Chunking: Context-aware text splitting
- Quality Scoring: Multi-factor document quality assessment
- Metadata Extraction: Automatic title, topic, and entity extraction
- Language Detection: Automatic language identification
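The semantic chunking step above can be sketched as a sentence-aware splitter that respects the configured `chunk_size` and `chunk_overlap`. This is a minimal sketch, not the pipeline in document_processor.py: the `chunk_text` name is illustrative, and a single sentence longer than `chunk_size` is not subdivided here.

```python
import re

def chunk_text(text, chunk_size=512, overlap=50):
    """Split text on sentence boundaries into chunks of at most
    chunk_size characters, carrying `overlap` trailing characters
    into the next chunk for context."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > chunk_size:
            chunks.append(current)
            current = current[-overlap:]  # keep trailing context
        current = (current + " " + sentence).strip()
    if current:
        chunks.append(current)
    return chunks

chunks = chunk_text("First sentence. " * 40, chunk_size=120, overlap=30)
```

Splitting at sentence boundaries rather than fixed offsets keeps each chunk semantically coherent, which tends to improve retrieval quality.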
- Intent Classification: Categorizes user queries by intent
- Query Expansion: Enhances queries with synonyms and paraphrases
- Complexity Analysis: Measures query complexity and difficulty
- Entity Recognition: Extracts named entities from queries
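Query expansion as listed above can be illustrated with a dependency-free sketch. The hand-rolled `SYNONYMS` table and `expand_query` name are placeholders: the project uses NLTK's WordNet for synonyms and T5 for paraphrases.

```python
# Illustrative synonym table; the real system draws synonyms from WordNet.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "fast": ["quick", "rapid"],
    "buy": ["purchase"],
}

def expand_query(query, max_variants=3):
    """Return the original query plus up to max_variants variants,
    each with one word swapped for a synonym."""
    words = query.lower().split()
    variants = [query.lower()]
    for i, word in enumerate(words):
        for syn in SYNONYMS.get(word, []):
            if len(variants) >= 1 + max_variants:
                return variants
            variants.append(" ".join(words[:i] + [syn] + words[i + 1:]))
    return variants

variants = expand_query("buy a fast car")
```

Running all variants through retrieval and merging the results raises recall for queries whose wording differs from the indexed documents.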
- Hybrid Search: Combines multiple retrieval strategies
- Document Reranking: Cross-encoder models for relevance scoring
- Semantic Caching: Caches responses for similar queries
- Adaptive Retrieval: Adjusts retrieval based on query type
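One common way to combine multiple retrieval strategies, as in the hybrid search above, is Reciprocal Rank Fusion. This sketch merges a vector-similarity ranking with a keyword ranking; the document ids are toy data, and the actual engine additionally reranks with a cross-encoder.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids with Reciprocal
    Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["d3", "d1", "d2"]  # order from vector similarity
keyword = ["d1", "d4", "d3"]   # order from keyword (BM25-style) match
fused = reciprocal_rank_fusion([semantic, keyword])
```

RRF needs no score normalization across the two retrievers, only their rank orders, which makes it robust when the underlying scores live on different scales.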
- Response Time Monitoring: Tracks and optimizes response times
- Memory Management: Efficient memory usage and garbage collection
- Caching Strategies: Multi-level caching for improved performance
- Load Balancing: Distributes processing across resources
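Response-time monitoring like the above can be wired in with a small timing decorator. A minimal sketch, assuming an in-process latency list; the names `timed`, `LATENCIES`, and `answer` are illustrative, not the project's actual instrumentation.

```python
import time
from functools import wraps

LATENCIES = []  # collected response times in seconds

def timed(fn):
    """Record each call's wall-clock duration for later analysis."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            LATENCIES.append(time.perf_counter() - start)
    return wrapper

@timed
def answer(query):
    # Stand-in for the real RAG pipeline call.
    return f"answer to: {query}"

result = answer("what is RAG?")
```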
- Response time statistics (average, P95, P99)
- Confidence score distributions
- Cache hit rates and efficiency
- Error rates and failure analysis
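The P95/P99 statistics above can be computed with a nearest-rank percentile over collected response times. A self-contained sketch with made-up sample latencies; the `percentile` helper is illustrative.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile of a non-empty sample list (0 < p <= 100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

times = [0.2, 0.3, 0.25, 1.2, 0.4, 0.35, 0.5, 0.45, 2.0, 0.3]
avg = sum(times) / len(times)
median = percentile(times, 50)
p95 = percentile(times, 95)
p99 = percentile(times, 99)
```

Tail percentiles matter because a handful of slow responses (here the 1.2 s and 2.0 s outliers) barely move the average while dominating worst-case user experience.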
- Intent and question type distributions
- Complexity analysis over time
- Popular topics and entities
- User interaction patterns
- Component status monitoring
- Resource usage tracking
- Error logging and alerting
- Automated performance recommendations
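Component status monitoring as described above can be sketched as a health-check aggregator. The component names and lambda probes below are hypothetical placeholders, not the project's real diagnostics.

```python
def check_health(components):
    """Probe each component and aggregate an overall status.
    A probe returns True (healthy) or False (degraded)."""
    report = {}
    for name, probe in components.items():
        try:
            report[name] = "healthy" if probe() else "degraded"
        except Exception as exc:  # a failing probe must not crash the check
            report[name] = f"error: {exc}"
    report["overall"] = (
        "healthy" if all(s == "healthy" for s in report.values())
        else "degraded"
    )
    return report

components = {
    "vector_store": lambda: True,  # e.g. ping ChromaDB
    "llm_api": lambda: True,       # e.g. a cheap Gemini call
    "cache": lambda: False,        # e.g. is the SQLite cache reachable?
}
report = check_health(components)
```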
# ChromaDB (default)
vector_store_type = "chromadb"
# FAISS for high-performance
vector_store_type = "faiss"
# Hybrid for best of both
vector_store_type = "hybrid"

# Custom embedding model
embedding_model = "sentence-transformers/all-MiniLM-L6-v2"
# Custom reranking model
reranker_model = "cross-encoder/ms-marco-MiniLM-L-6-v2"
# Enable/disable features
enable_reranking = True
enable_query_expansion = True
enable_semantic_caching = True

# Chunk size optimization
chunk_size = 512
chunk_overlap = 50
# Retrieval parameters
similarity_threshold = 0.7
max_retrieval_docs = 10
# Response generation
temperature = 0.7
max_tokens = 2048

1. Import Errors
pip install -r requirements.txt
python -m spacy download en_core_web_sm

2. API Key Issues
- Verify your Gemini API key is correct
- Check API quotas and billing
- Ensure API key has proper permissions
3. Memory Issues
- Reduce chunk size in config.py
- Enable garbage collection
- Use FAISS instead of ChromaDB for large datasets
4. Slow Performance
- Enable semantic caching
- Reduce similarity threshold
- Use smaller embedding models
Check rag_chatbot.log for detailed error information:
tail -f rag_chatbot.log

# Run the test suite
python main.py --mode test
python -m pytest tests/ -v

# Load testing
python main.py --mode test
python -m locust -f tests/load_test.py

# Production web server
python main.py --mode web

# Docker
docker build -t rag-chatbot .
docker run -p 8501:8501 -e GEMINI_API_KEY=your_key rag-chatbot

- Streamlit Cloud: Direct deployment from GitHub
- AWS/GCP/Azure: Use containerization
- Kubernetes: Provided YAML configurations
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open a Pull Request
# Install development dependencies
pip install -r requirements-dev.txt
# Run code formatting
black .
isort .
# Run linting
flake8 .
# Run tests
pytest

This project is licensed under the MIT License - see the LICENSE file for details.
- Google for the Gemini API
- Streamlit for the amazing web framework
- ChromaDB and FAISS for vector storage
- Sentence Transformers for embeddings
- The open-source ML community
- Documentation: Wiki
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: support@yourproject.com
- Multi-modal support (images, audio)
- Advanced conversation memory
- Custom model fine-tuning
- Enhanced security features
- Collaborative chat features
- Advanced analytics dashboard
- Mobile application
- Enterprise SSO integration
- Distributed processing
- Real-time learning
- Advanced reasoning capabilities
- Multi-language support
Built with ❤️ by the Advanced RAG Team
Star ⭐ this repository if you find it helpful!

Quick start:
python main.py --mode web
./deploy.sh local
./deploy.sh docker