An interactive chatbot that lets users run data analytics, visualization, and machine learning tasks from natural-language queries. It is powered by Google's Gemini 1.5 Flash and built with Pandas, Seaborn, Scikit-learn, and Streamlit.
- Project Description
- Key Features
- Tech Stack
- Installation
- Usage
- Project Structure
- Future Enhancements
- References
This AI-powered chatbot helps students, data enthusiasts, and analysts explore data, perform exploratory data analysis (EDA), build ML models, and generate insights from natural-language prompts. It simplifies data science workflows by combining Generative AI with established Python libraries for automated, conversational analysis.
- 📂 Upload & Preview Datasets: Upload data and instantly preview rows and columns.
- 📊 EDA Capabilities: Perform statistical summaries, correlation checks, and detect outliers.
- 📈 Interactive & Static Visualizations: Supports violin plots, scatterplots, boxplots, and more via Plotly, Matplotlib, and Seaborn.
- 📉 Statistical Analysis: Includes descriptive stats, hypothesis testing, and distribution analysis.
- 🧠 ML Model Assistance: Build supervised/unsupervised models with evaluation metrics like accuracy and classification reports.
- 🗣️ Gemini AI Integration: Ask questions in natural language and get conversational responses generated by Google's Gemini 1.5 Flash.
- 🧾 Regex-based Query Parsing: Understands structured queries using regex.
- 🖼️ PIL & Base64 Plot Rendering: Converts charts to images with PIL and Base64 encoding so they render consistently within Streamlit.
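
To illustrate the regex-based query parsing feature, here is a minimal sketch of how a structured prompt could be mapped to an intent. The patterns, the `parse_query` name, and the returned dictionary shape are all assumptions for illustration; the project's actual patterns are not shown in this README.

```python
import re

# Hypothetical patterns, not taken from the project's source.
PLOT_PATTERN = re.compile(
    r"\b(?P<kind>boxplot|scatterplot|violin plot)\b\s+of\s+(?P<x>\w+)"
    r"(?:\s+(?:vs|versus|by)\s+(?P<y>\w+))?",
    re.IGNORECASE,
)
CORR_PATTERN = re.compile(r"\bcorrelation\s+(?:matrix|heatmap)\b", re.IGNORECASE)


def parse_query(text):
    """Return a small intent dictionary for a user prompt."""
    match = PLOT_PATTERN.search(text)
    if match:
        return {
            "intent": "plot",
            "kind": match.group("kind").lower(),
            "x": match.group("x"),
            "y": match.group("y"),  # None when only one column is named
        }
    if CORR_PATTERN.search(text):
        return {"intent": "correlation"}
    # Anything the patterns do not recognize is left for another handler.
    return {"intent": "unknown"}
```

In a design like this, prompts that come back as `unknown` could be forwarded to the Gemini model instead of being handled locally, though how the app actually routes them is not stated here.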
- Languages: Python
- Libraries: numpy, pandas, seaborn, matplotlib, plotly, scikit-learn, scipy, regex, base64, io, PIL
- AI Model: Google Gemini 1.5 Flash
- Framework: Streamlit
- IDE: Visual Studio Code / PyCharm
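
The `base64` entry in the library list is typically used to carry a rendered chart as text. The sketch below shows one way image bytes could be turned into a data URI and back using only the standard library. The helper names `to_data_uri` and `from_data_uri` are hypothetical and are not taken from the project.

```python
import base64


def to_data_uri(image_bytes, mime="image/png"):
    """Encode raw image bytes as a base64 data URI string."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def from_data_uri(uri):
    """Recover the raw bytes from a data URI produced by to_data_uri."""
    _header, _sep, payload = uri.partition(",")
    return base64.b64decode(payload)
```

With Matplotlib, the input bytes would usually come from saving a figure into an `io.BytesIO` buffer via `fig.savefig(buf, format="png")` and then calling `buf.getvalue()`.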
- Python 3.9+
- pip
# 1. Clone the repo
git clone https://github.com/yourusername/ai-data-analytics-chatbot.git
cd ai-data-analytics-chatbot
# 2. Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Run the Streamlit app
streamlit run app.py
- Upload your dataset (CSV format).
- Ask a question like:
  - "Show me a correlation matrix"
  - "Build a decision tree classifier"
  - "Give me a boxplot of age vs income"
- Let the bot process the request and provide analytics or visual outputs.
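
As a rough idea of what a prompt like "Show me a correlation matrix" could translate to behind the scenes, here is a minimal pandas sketch. The `correlation_matrix` function and the sample data are hypothetical; the app's real implementation may differ.

```python
import pandas as pd


def correlation_matrix(df):
    """Pairwise Pearson correlations over the numeric columns only."""
    numeric = df.select_dtypes(include="number")
    return numeric.corr()


# Hypothetical stand-in for an uploaded CSV.
sample = pd.DataFrame(
    {
        "age": [25, 32, 47, 51],
        "income": [30000, 42000, 61000, 68000],
        "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    }
)
print(correlation_matrix(sample).round(2))
```

In a Streamlit app, the DataFrame would typically come from `pd.read_csv` applied to the file returned by `st.file_uploader`, rather than from an inline sample.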
Data Analytics Chatbot/
├── .streamlit/
│ └── secrets.toml
├── assets/
├── static_plots/
├── venv/
├── app.py
├── chatbot_logic.py
├── gemini_handler.py
├── utils.py
└── requirements.txt
- 💾 Local memory persistence (e.g., JSON/SQLite)
- 🔐 User authentication & role-based access
- 🧠 Semantic query understanding (LangChain)
- 📊 Autosuggest visualizations from data patterns
- 📤 Export chats as PDF/Excel
- 🧠 Voice command prompts
- 🤖 Integration of advanced ML/DL models
If you find this helpful, feel free to star ⭐ the repository and fork 🍴 it to enhance further. Pull requests are welcome!