An AI-powered chatbot that resolves student queries using Retrieval-Augmented Generation (RAG). The system provides separate user and admin interfaces for dynamic query handling, AI-generated responses, and query management, with PDF documents stored in Cloudinary.
- RAG (Retrieval-Augmented Generation) for intelligent responses
- Google AI (Gemini) integration for natural language processing
- LangChain framework for advanced language processing
- Pinecone vector database for semantic search and embedding storage
- PDF document processing for knowledge base creation with Cloudinary integration
- Secure user signup and login
- JWT-based authentication with session management
- Password encryption using Bcrypt
- Admin authentication with role-based access
- Real-time AI-powered query responses
- Automatic query storage and categorization
- Historical query search and analytics
- Unanswered query tracking for admin review
- Dedicated admin login system
- Query analytics and statistics
- Manage unanswered queries with bulk operations
- Add and update responses manually
- PDF management with Cloudinary integration
- One-click embedding rebuilding for knowledge base updates
- Email notification system for critical queries
- Chat history management
- Automated email notifications
- Admin alerts for unanswered queries
- Test email functionality for system verification
- Cloudinary PDF storage for document management
- PDF preview with Google Docs viewer
- Original filename preservation
- Public/private access management
- Flask 2.3.3 - Modern Python web framework
- Flask-CORS - Cross-Origin Resource Sharing support
- Gunicorn - WSGI HTTP Server for production deployment
- MongoDB with PyMongo - NoSQL database for scalable data storage
- Google AI (Gemini) - Advanced language model integration
- LangChain - AI application framework
- Pinecone - Vector database for production-ready RAG systems
- Cloudinary - Cloud-based PDF storage and management
- Flask-Mail - Email service integration
- JWT - Secure token-based authentication
- React 18.3.1 - Modern UI library
- Vite - Fast build tool and dev server
- Tailwind CSS - Utility-first CSS framework
- ShadCN/UI - Modern component library
- React Router - Client-side routing
- Axios - HTTP client for API communication
- Lucide React - Beautiful icons
- Google AI API - Generative AI responses
- LangChain Community - Extended AI capabilities
- Pinecone - Managed vector database for production RAG deployments
- TextBlob - Text processing and analysis
- PyPDF2 - PDF document processing
- ReportLab - PDF generation for Q&A documents
- Cloudinary SDK - Cloud storage API integration
POST /api/signup # Register new user
POST /api/login # User login
POST /api/admin/login # Admin authentication
POST /api/query # Submit query (AI-powered response)
GET /api/chat-history # Retrieve user chat history
POST /api/add-response # Add manual response to query
GET /api/admin/stats # Get system statistics
GET /api/admin/chat-history # Get all chat history (admin)
GET /api/admin/query-analytics # Get query analytics
GET /api/unanswered-queries # Get pending queries
DELETE /api/delete-query/<id> # Delete specific query
GET /api/pdfs/ # List all PDFs
POST /api/pdfs/upload # Upload a PDF file
DELETE /api/pdfs/<public_id> # Delete a PDF
POST /api/pdfs/rebuild-embeddings # Rebuild embeddings from PDFs
GET /health # Health check
GET /debug/routes # List all available routes
GET /debug/email # Test email service
POST /debug/send-test-email # Send test email
- Python 3.8+ (recommended 3.10+)
- Node.js 16+ and npm
- MongoDB Atlas account or local MongoDB
- Google AI API key
- Gmail account for email services (with app password)
- LangSmith account (optional, for AI tracing)
- Clone the repository:
git clone https://github.com/Arnavkesari/chatbot-for-students-queries.git
cd chatbot-for-students-queries
- Navigate to backend directory:
cd backend
- Create and activate virtual environment:
# Windows
python -m venv venv
venv\Scripts\activate
# macOS/Linux
python3 -m venv venv
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables:
Create a .env file in the backend directory:
# Database Configuration
MONGO_URI=mongodb+srv://username:password@cluster.mongodb.net/chatbot?retryWrites=true&w=majority
# Google AI Configuration
GOOGLE_API_KEY=your_google_ai_api_key
# LangChain Configuration (Optional - for AI tracing)
LANGSMITH_API_KEY=your_langsmith_api_key
LANGCHAIN_TRACING_V2=true
LANGCHAIN_PROJECT=Faculty Chatbot
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
# Email Configuration (Gmail SMTP)
EMAIL_USER=your_email@gmail.com
EMAIL_PASS=your_gmail_app_password
# Security
SECRET_KEY=your_secure_random_secret_key
# Flask Configuration
FLASK_ENV=development
PORT=5000
# Admin Credentials
ADMIN_EMAIL=
ADMIN_PASSWORD=
- Add PDF documents to resources folder:
# Place your PDF documents in:
backend/resources/
- Start the backend server:
python app.py
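As a minimal sketch of how the backend might read the variables from the .env example above, the following helper validates and collects them. The function name `load_settings`, the choice of which variables are treated as required, and the returned dictionary keys are assumptions for illustration, not the project's actual config/config.py.

```python
import os

# Names follow the .env example; treating exactly these three as mandatory
# is an assumption made for this sketch.
REQUIRED_VARS = ("MONGO_URI", "GOOGLE_API_KEY", "SECRET_KEY")


def load_settings(env=None):
    """Return a settings dict, raising if a required variable is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "mongo_uri": env["MONGO_URI"],
        "google_api_key": env["GOOGLE_API_KEY"],
        "secret_key": env["SECRET_KEY"],
        # PORT and FLASK_ENV are optional here and fall back to safe defaults.
        "port": int(env.get("PORT", "5000")),
        "debug": env.get("FLASK_ENV", "production") == "development",
    }
```

Failing at startup when a variable is missing tends to be easier to diagnose than a connection error raised later by the first database or API call.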
- Navigate to frontend directory:
cd frontend
- Install dependencies:
npm install
- Configure environment variables:
# Create .env.local file
cp .env.local.example .env.local
# Edit .env.local to set the API base URL
# For development:
VITE_API_BASE_URL=http://localhost:5000
# For production (when deploying):
# VITE_API_BASE_URL=https://your-backend-url.onrender.com
- Start the development server:
npm run dev
The frontend will run on http://localhost:5173
chatbot-for-students-queries/
├── backend/
│   ├── app.py                 # Main Flask application
│   ├── requirements.txt       # Python dependencies
│   ├── config/
│   │   ├── config.py          # Configuration settings
│   │   └── database.py        # Database connection
│   ├── controllers/           # Business logic controllers
│   ├── middleware/            # Custom middleware
│   ├── models/                # Database models
│   ├── routes/                # API route definitions
│   ├── services/              # Business services
│   ├── utils/                 # Utility functions
│   └── resources/             # PDF documents for RAG
├── frontend/
│   ├── src/
│   │   ├── components/        # React components
│   │   ├── assets/            # Static assets
│   │   └── lib/               # Utility libraries
│   ├── package.json           # Frontend dependencies
│   └── vite.config.js         # Vite configuration
└── RAG/                       # RAG implementation notes
{
"_id": "ObjectId",
"username": "string",
"email": "string",
"password": "bcrypt_hashed_string",
"created_at": "datetime",
"last_login": "datetime"
}
{
"_id": "ObjectId",
"question": "string",
"answer": "string",
"answered": "boolean",
"user_id": "ObjectId",
"timestamp": "datetime",
"ai_generated": "boolean",
"similarity_score": "float",
"category": "string"
}
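The query schema above can be mirrored in Python so that documents are built consistently before they are written to MongoDB. This is a sketch only: the class name `QueryRecord`, the defaults, and the use of a plain string for `user_id` (the schema stores an ObjectId) are assumptions, and `_id` is omitted because MongoDB assigns it on insert.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class QueryRecord:
    """In-memory counterpart of the query document (hypothetical helper)."""

    question: str
    user_id: str  # placeholder type; the stored field is an ObjectId
    answer: Optional[str] = None
    answered: bool = False
    ai_generated: bool = False
    similarity_score: Optional[float] = None
    category: Optional[str] = None
    # Timezone-aware UTC timestamp taken when the record is created.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def to_document(self):
        """Return a plain dict with the same field names as the schema."""
        return asdict(self)
```

A record created with only a question and a user ID starts out with `answered` set to false, which matches the idea of tracking unanswered queries for admin review.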
- Google AI (Gemini): Powers intelligent response generation
- LangChain: Handles RAG pipeline and document processing
- FAISS: Vector database for semantic similarity search
- LangSmith: Optional AI tracing and monitoring
- Gmail SMTP integration for notifications
- Flask-Mail for email management
- Automated alerts for unanswered queries
- Bcrypt password hashing with salt rounds
- JWT token authentication with expiration
- Admin role-based access control
- Environment variable protection
- CORS middleware for secure API access
- Session timeout management
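To illustrate what "JWT token authentication with expiration" involves, here is a standard-library sketch of issuing and verifying an HS256-signed token. It is not the project's implementation, and a maintained JWT library is the better choice in production; the function names, the `sub` claim, and the one-hour default lifetime are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding and decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def issue_token(user_id, secret, ttl_seconds=3600, now=None):
    """Create a token whose `exp` claim is `ttl_seconds` after `now`."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": user_id, "exp": now + ttl_seconds}).encode())
    signature = _b64url(_sign(f"{header}.{payload}", secret))
    return f"{header}.{payload}.{signature}"


def verify_token(token, secret, now=None):
    """Return the claims, or raise ValueError if the token is invalid or expired."""
    now = int(time.time()) if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header, payload, signature = parts
    expected = _sign(f"{header}.{payload}", secret)
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] <= now:
        raise ValueError("token expired")
    return claims
```

The sketch checks only the signature and the expiry time. A full library also validates the header's algorithm and other registered claims.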
- View system statistics and analytics
- Manage unanswered queries
- Add manual responses to queries
- Delete inappropriate queries
- Monitor chat history across all users
- Receive email notifications for critical queries
- Development: FLASK_ENV=development (hot reloading enabled)
- Production: FLASK_ENV=production (optimized for performance)
- Testing: FLASK_ENV=testing (test configuration)
- Health Check: GET http://localhost:5000/health
- Email Service: GET http://localhost:5000/debug/email
- Send Test Email: POST http://localhost:5000/debug/send-test-email
- List Routes: GET http://localhost:5000/debug/routes
# Test user signup
curl -X POST http://localhost:5000/api/signup \
-H "Content-Type: application/json" \
-d '{"username":"testuser","email":"test@example.com","password":"password123"}'
# Test query submission
curl -X POST http://localhost:5000/api/query \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{"question":"What are the admission requirements?"}'
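The same two calls can be made from Python using only the standard library. This is a sketch under assumptions: the helper names `build_request` and `send` are hypothetical, and it assumes the endpoints return JSON bodies, which the curl examples do not confirm.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # development default from the setup steps


def build_request(path, payload, token=None):
    """Build a JSON POST request, adding a Bearer token when one is given."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request):
    """Send the request and decode the response body as JSON (assumed format)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


signup = build_request(
    "/api/signup",
    {"username": "testuser", "email": "test@example.com", "password": "password123"},
)
query = build_request(
    "/api/query",
    {"question": "What are the admission requirements?"},
    token="YOUR_JWT_TOKEN",
)
```

With the backend running, `send(signup)` and `send(query)` perform the calls. Building the request separately from sending it makes the headers and body easy to inspect without a live server.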
The system uses Retrieval-Augmented Generation to provide accurate, context-aware responses:
- Document Processing: PDF files in resources/ are processed and chunked
- Vector Embeddings: Text chunks are converted to vectors using Google AI
- Vector Storage: FAISS stores embeddings for fast similarity search
- Query Processing: User questions are embedded and matched against knowledge base
- Response Generation: Relevant context + user query sent to Google AI for response generation
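The steps above can be sketched end to end with a toy example. To stay runnable without API keys, it swaps in word-count vectors for the Google AI embeddings and a linear scan for the vector store, and it stops at building the prompt instead of calling a language model. All function names and parameters are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping windows of `size` words (size > overlap)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k (score, chunk) pairs most similar to the question."""
    query_vector = embed(question)
    scored = [(cosine(query_vector, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def build_prompt(question, hits):
    """Combine retrieved context and the question into a single prompt string."""
    context = "\n".join(chunk for _, chunk in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "Admission requires a valid entrance exam score.",
    "The library opens at nine in the morning.",
    "Hostel fees are paid each semester.",
]
hits = retrieve("When does the library open?", chunks, k=1)
prompt = build_prompt("When does the library open?", hits)
```

In the real pipeline the prompt would be sent to the language model, and the best similarity score could plausibly feed a field such as `similarity_score` in the query record, though the source does not say how that value is computed.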
- Upload PDF files through the admin dashboard
- Rebuild embeddings using the dedicated button
- System automatically processes and indexes new content
The backend is configured for deployment on Render.com:
- Render.yaml Configuration: The project includes a render.yaml file with the configuration needed for deployment on Render.
- Environment Variables: Set all required environment variables in the Render Dashboard:
  - MONGODB_URI
  - JWT_SECRET_KEY
  - GOOGLE_API_KEY
  - All email configuration variables
  - Cloudinary credentials
  - Pinecone API key and environment
- CORS Configuration: The backend is configured to accept requests from the frontend domain. Update the CORS_ALLOW_ORIGIN environment variable with your frontend URL.
- Build the frontend:
cd frontend
npm run build
- Environment Variables: Set VITE_API_BASE_URL to your deployed backend URL:
VITE_API_BASE_URL=https://your-backend-url.onrender.com
- Deploy: Use Vercel, Netlify, or any static hosting service to deploy the dist folder.
- Fork the repository
- Create feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Add tests for new functionality
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Follow PEP 8 for Python code
- Use ESLint configuration for JavaScript
- Add docstrings to all functions
- Write tests for new features
- Update README for significant changes
This project is licensed under the MIT License - see the LICENSE file for details.
- Email: sahayak.iiitdmj@gmail.com
- GitHub Issues: Create an issue
- Documentation: See the /docs folder for detailed documentation
Built with ❤️ for educational institutions and student success