Automated Profiler → EDA → AutoML → Verifier → Notebook Synthesizer → Gemini Insights
🚀 Live Demo (Render Deployment): https://multiagent-data-analyst.onrender.com/
Track: Enterprise Agents
Tech: Python, Streamlit, Gemini, MCP Tools, A2A Bus, Multi-Agent Architecture
The Multi-Agent Data Analyst is a fully automated, end-to-end data analysis pipeline powered by multiple specialized agents working together. Once you upload a dataset, the agents analyze it, build ML models, verify the results, generate a notebook, and produce final insights, all without any manual coding.
This project demonstrates:
✔ Multi-agent systems
✔ A2A (Agent-to-Agent) communication
✔ Tool-based agent execution (MCP Tools)
✔ Sessions & memory
✔ Context-aware notebook synthesis
✔ Gemini-powered explanations
✔ Streamlit multi-page application
Performing data analysis typically requires switching between tools, writing repetitive code, running models manually, validating outputs, and documenting everything.
For beginners, this is overwhelming.
For analysts, it's time-consuming.
For teams, it’s inconsistent.
Goal: Build an agentic system that automates the entire workflow, from raw data to verified insights and notebook generation.
Agents make the system:
-> Modular — each agent does one job
-> Autonomous — actions happen without the user triggering each step
-> Traceable — every step is observable
-> Composable — agents communicate over the A2A bus
-> Extensible — new agents (e.g., Gemini Reviewer) can be added anytime
Instead of one giant notebook, the intelligence is distributed:
-> Profiler Agent – inspects dataset, finds issues
-> EDA Agent – generates charts, summaries, anomalies
-> Model Agent (AutoML) – builds ML pipelines automatically
-> Verifier Agent – detects inconsistencies, bad models, missing columns
-> Notebook Synthesizer Agent – creates a clean notebook combining all outputs
-> Gemini Agent – explains the ML results in human-friendly language
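The hand-off between agents can be pictured as a small publish/subscribe bus. The following is a minimal sketch, not the project's actual bus: the `A2ABus` class, the `Message` shape, and the payload keys are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Message:
    sender: str
    recipient: str
    payload: Dict[str, Any] = field(default_factory=dict)


class A2ABus:
    """Hypothetical in-memory agent-to-agent bus (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Message], None]]] = defaultdict(list)
        self.log: List[Message] = []  # every message is recorded, so a run stays traceable

    def subscribe(self, agent_name: str, handler: Callable[[Message], None]) -> None:
        self._handlers[agent_name].append(handler)

    def send(self, message: Message) -> None:
        self.log.append(message)
        for handler in self._handlers[message.recipient]:
            handler(message)


bus = A2ABus()
received = []


def eda_handler(msg: Message) -> None:
    # The EDA step records the incoming profile, then forwards to the model step.
    received.append(("EDAAgent", msg.payload))
    bus.send(Message("EDAAgent", "ModelAgent", {"plots": 3}))


bus.subscribe("EDAAgent", eda_handler)
bus.subscribe("ModelAgent", lambda msg: received.append(("ModelAgent", msg.payload)))
bus.send(Message("ProfilerAgent", "EDAAgent", {"rows": 100}))
```

Because each agent only knows the name of the next recipient, a new agent can be added by subscribing one more handler, and `bus.log` gives the kind of trace an A2A console could display.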
1️⃣ ProfilerAgent
✔ Reads dataset
✔ Detects column types
✔ Finds missing values
✔ Sends message → EDAAgent
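As a rough illustration of the profiling step, the sketch below classifies each column and counts missing values with pandas. The helper name `profile_dataframe`, the four column kinds, and the sample data are assumptions made for this example, not the project's actual code.

```python
import pandas as pd


def profile_dataframe(df: pd.DataFrame) -> dict:
    """Summarise column kinds and missing values (hypothetical helper)."""
    profile = {}
    for name in df.columns:
        col = df[name]
        # Check bool first: pandas treats bool columns as numeric too.
        if pd.api.types.is_bool_dtype(col):
            kind = "boolean"
        elif pd.api.types.is_numeric_dtype(col):
            kind = "numeric"
        elif pd.api.types.is_datetime64_any_dtype(col):
            kind = "datetime"
        else:
            kind = "categorical"
        profile[name] = {"kind": kind, "missing": int(col.isna().sum())}
    return profile


# Made-up sample data for illustration.
df = pd.DataFrame({
    "age": [31, None, 45],
    "city": ["Pune", "Delhi", None],
    "signup": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01"]),
})
report = profile_dataframe(df)
```

A dictionary like `report` is small enough to pass along as a message payload to the next agent.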
2️⃣ EDAAgent
✔ Creates correlations, histograms, outlier analysis
✔ Saves all plots via MCP FileTools
✔ Sends message → ModelAgent
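Two of the EDA outputs listed above, correlations and outlier analysis, can be sketched in a few lines of pandas. This example assumes the common 1.5 × IQR rule for outliers, which may not be the rule the project uses; `iqr_outliers` is a hypothetical helper, the data is made up, and plotting is left out.

```python
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    mask = (series < q1 - k * iqr) | (series > q3 + k * iqr)
    return series[mask]


# Made-up sample data for illustration.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 4, 6, 8, 10, 12],      # exactly 2 * x, so perfectly correlated
    "z": [10, 11, 9, 10, 11, 95],   # one obvious outlier
})
corr = df.corr()                    # pairwise Pearson correlation matrix
outliers = iqr_outliers(df["z"])    # flags only the value 95
```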
3️⃣ ModelAgent
✔ Auto-detects task type (classification/regression)
✔ Builds full ML pipeline (imputation + scaling + encoding)
✔ Tunes models
✔ Saves best model
✔ Sends message → VerifierAgent
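The task detection and preprocessing steps above map naturally onto scikit-learn's `Pipeline` and `ColumnTransformer`. The sketch below is one plausible arrangement under stated assumptions: the cardinality heuristic in `detect_task`, the choice of imputation strategies, and the two baseline models are illustrative guesses, and hyperparameter tuning and model saving are omitted.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def detect_task(y: pd.Series, max_classes: int = 10) -> str:
    """Assumed heuristic: non-numeric or low-cardinality targets mean classification."""
    if not pd.api.types.is_numeric_dtype(y) or y.nunique() <= max_classes:
        return "classification"
    return "regression"


def build_pipeline(X: pd.DataFrame, task: str) -> Pipeline:
    numeric = X.select_dtypes(include="number").columns.tolist()
    categorical = [c for c in X.columns if c not in numeric]
    preprocess = ColumnTransformer([
        # Numeric columns: fill gaps with the median, then standardise.
        ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                          ("scale", StandardScaler())]), numeric),
        # Categorical columns: fill gaps with the mode, then one-hot encode.
        ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                          ("encode", OneHotEncoder(handle_unknown="ignore"))]), categorical),
    ])
    model = LogisticRegression(max_iter=1000) if task == "classification" else LinearRegression()
    return Pipeline([("preprocess", preprocess), ("model", model)])


# Made-up sample data with one missing value per column.
X = pd.DataFrame({
    "income": [30.0, 52.0, np.nan, 75.0, 41.0, 88.0],
    "segment": ["a", "b", "a", np.nan, "b", "a"],
})
y = pd.Series([0, 1, 0, 1, 0, 1])
task = detect_task(y)
pipeline = build_pipeline(X, task).fit(X, y)
predictions = pipeline.predict(X)
```

Keeping preprocessing inside the pipeline means the same imputation, scaling, and encoding are applied at prediction time, which is what makes a saved model reusable on new data.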
4️⃣ VerifierAgent
✔ Validates model quality
✔ Computes quality tag (“Good”, “Acceptable”, “Weak”)
✔ Sends message → NotebookSynthesizerAgent
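The quality tag can be thought of as a simple threshold mapping over a model score. The thresholds below (0.85 and 0.70) and the function name `quality_tag` are assumptions made for illustration; the source does not state the cut-offs the Verifier actually uses.

```python
def quality_tag(score: float, good: float = 0.85, acceptable: float = 0.70) -> str:
    """Map a score in [0, 1] (e.g. accuracy) to a tag. Thresholds are assumed."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score >= good:
        return "Good"
    if score >= acceptable:
        return "Acceptable"
    return "Weak"


tags = [quality_tag(s) for s in (0.91, 0.75, 0.40)]
```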
5️⃣ NotebookSynthesizerAgent
✔ Builds a full auto-generated Jupyter Notebook
✔ Embeds all results and images
✔ Saves notebook through FileTools
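Since the project lists nbformat for notebook generation, assembling a report notebook might look roughly like this. The function `build_report_notebook`, the cell contents, and the `data.csv` path are hypothetical placeholders; embedding images and saving through FileTools are not shown.

```python
import nbformat
from nbformat.v4 import new_code_cell, new_markdown_cell, new_notebook


def build_report_notebook(title: str, summary: str, code: str):
    """Assemble a three-cell notebook: title, summary, and one code cell."""
    nb = new_notebook()
    nb.cells = [
        new_markdown_cell(f"# {title}"),
        new_markdown_cell(summary),
        new_code_cell(code),
    ]
    nbformat.validate(nb)  # raises if the structure is not a valid notebook
    return nb


nb = build_report_notebook(
    "Auto-generated analysis",
    "Quality tag: Good",
    # 'data.csv' is a placeholder path, not a file from the project.
    "import pandas as pd\ndf = pd.read_csv('data.csv')\ndf.describe()",
)
notebook_json = nbformat.writes(nb)  # serialised .ipynb content as a string
```

The resulting string can be written to a `.ipynb` file and opened in Jupyter like any hand-written notebook.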
6️⃣ Gemini Integration
Gemini generates:
✔ Model explanations
✔ Recommendations
✔ Summaries
✔ Plain-English explanations for beginners
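One way to obtain these explanations is to turn the verified results into a prompt and send it to Gemini. The sketch below assumes the `google-generativeai` Python SDK and the model name `gemini-1.5-flash`; the prompt wording and both function names are hypothetical, and the API key is passed in explicitly rather than read from a particular environment variable.

```python
def build_explanation_prompt(task: str, metric_name: str, metric_value: float, quality: str) -> str:
    """Build a beginner-friendly prompt from model results (illustrative wording)."""
    return (
        "You are explaining machine-learning results to a beginner.\n"
        f"Task type: {task}\n"
        f"{metric_name}: {metric_value:.3f}\n"
        f"Quality tag: {quality}\n"
        "Explain what this means in plain English and suggest one next step."
    )


def explain_with_gemini(prompt: str, api_key: str, model_name: str = "gemini-1.5-flash") -> str:
    """Send the prompt to Gemini. Needs network access and a valid API key."""
    # Imported lazily so the prompt builder works without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    return model.generate_content(prompt).text


prompt = build_explanation_prompt("classification", "accuracy", 0.912, "Good")
```

Keeping prompt construction separate from the API call makes the prompt easy to inspect and test offline.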
7️⃣ Streamlit UI
A multi-page dashboard with:
✔ Dataset Explorer
✔ EDA Dashboard
✔ AutoML Dashboard
✔ Verifier & Notebook Builder
✔ A2A Communications Console
Clone Repo
git clone https://github.com/yourusername/multiagent-data-analyst
cd multiagent-data-analyst
pip install -r requirements.txt
Add Gemini API Key
Create a .env file containing your Gemini API key.
Run Streamlit
streamlit run streamlit_app/app.py
Category: Tools
🚀 Multi-Agent: Custom Agents, A2A Bus
🚀 LLM: Gemini 1.5 Flash
🚀 UI: Streamlit
🚀 ML: Scikit-Learn
🚀 Storage: Custom MemoryTools
🚀 Notebook: nbformat
🚀 Deployment: Render
🚀 Visualization: Plotly, Matplotlib, Seaborn
│
├── src/
│ ├── agents/
│ ├── core/
│ ├── tools/
│ │ ├── file_tools.py
│ │ ├── dataset_tools.py
│ │ ├── memory_tools.py
│ │ ├── model_tools.py
│ │ └── notebook_tools.py
│
├── streamlit_app/
│ ├── app.py
│ └── pages/
│   ├── AutoML.py
│   ├── Profiler.py
│   ├── EDA_Dashboard.py
│   ├── Notebook_Report.py
│   ├── Verifier.py
│   └── A2A_Dashboard.py
│
├── streamlit_app_storage/
│ ├── memory/
│ ├── uploads/
│ └── reports/
│
└── README.md
-> Add RAG-based “Data Question Answering Agent”
-> Add deployment on Google Cloud Run using Docker
-> Add Evaluation Agent for model fairness
-> Provide more AutoML models (XGBoost, LightGBM)
-> Add voice-based interaction mode
Built by Vaishnavi Sharma as part of Google x Kaggle – Agents Intensive