Multi-Agent AutoML Data Analyst (MCP + A2A + Gemini Powered)

Automated Profiler → EDA → AutoML → Verifier → Notebook Synthesizer → Gemini Insights

🚀 Live Demo (Render Deployment): https://multiagent-data-analyst.onrender.com/

Track: Enterprise Agents

Tech: Python, Streamlit, Gemini, MCP Tools, A2A Bus, Multi-Agent Architecture

Overview

The Multi-Agent Data Analyst is a fully automated, end-to-end data analysis pipeline powered by multiple specialized agents working together. You upload a dataset, and the agents analyze it, build ML models, verify the results, generate a notebook, and produce final insights, all without any manual coding.

This project demonstrates:

✔ Multi-agent systems

✔ A2A (Agent-to-Agent) communication

✔ Tool-based agent execution (MCP Tools)

✔ Sessions & memory

✔ Context-aware notebook synthesis

✔ Gemini-powered explanations

✔ Streamlit multi-page application

Problem Statement

Performing data analysis typically requires switching between tools, writing repetitive code, running models manually, validating outputs, and documenting everything.

For beginners, this is overwhelming.

For analysts, it's time-consuming.

For teams, it’s inconsistent.

Goal: Build an agentic system that automates the entire workflow — from raw data to verified insights and notebook generation.

Why Agents?

Agents make the system:

-> Modular: each agent does one job

-> Autonomous: actions happen without the user triggering each step

-> Traceable: every step is observable

-> Composable: agents communicate over the A2A bus

-> Extensible: new agents (e.g., a Gemini Reviewer) can be added at any time

Instead of one giant notebook, the intelligence is distributed across specialized agents.

Agent Roles

-> Profiler Agent – inspects dataset, finds issues

-> EDA Agent – generates charts, summaries, anomalies

-> Model Agent (AutoML) – builds ML pipelines automatically

-> Verifier Agent – detects inconsistencies, bad models, missing columns

-> Notebook Synthesizer Agent – creates a clean notebook combining all outputs

-> Gemini Agent – explains the ML results in human-friendly language

Each agent writes outputs to memory → A2A orchestrates → Next agent reacts.
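The memory-then-message flow described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `A2ABus` class, the `memory` dict, and the two handler functions are hypothetical names, and the real A2A bus and MemoryTools may work differently.

```python
from collections import defaultdict, deque


class A2ABus:
    """Hypothetical in-process message bus (an assumption, not the project's class)."""

    def __init__(self):
        self.handlers = defaultdict(list)  # recipient name -> list of callbacks
        self.queue = deque()               # pending (sender, recipient, payload)
        self.log = []                      # delivered (sender, recipient) pairs

    def register(self, name, handler):
        self.handlers[name].append(handler)

    def send(self, sender, recipient, payload):
        self.queue.append((sender, recipient, payload))

    def run(self):
        # Deliver messages in FIFO order until no agent has queued anything new.
        while self.queue:
            sender, recipient, payload = self.queue.popleft()
            self.log.append((sender, recipient))
            for handler in self.handlers[recipient]:
                handler(payload)


memory = {}  # shared store standing in for the project's MemoryTools
bus = A2ABus()


def profiler(payload):
    # Write the output to memory, then notify the next agent.
    memory["profile"] = {"rows": len(payload["rows"])}
    bus.send("ProfilerAgent", "EDAAgent", {"profile_key": "profile"})


def eda(payload):
    # React to the message by reading what the previous agent stored.
    profile = memory[payload["profile_key"]]
    memory["eda"] = {"summary": f"{profile['rows']} rows analyzed"}


bus.register("ProfilerAgent", profiler)
bus.register("EDAAgent", eda)
bus.send("User", "ProfilerAgent", {"rows": [{"x": 1}, {"x": 2}, {"x": 3}]})
bus.run()
```

Because messages carry only a memory key rather than the data itself, the bus log stays small and each hop remains easy to trace.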

Architecture

1️⃣ ProfilerAgent

✔ Reads dataset

✔ Detects column types

✔ Finds missing values

✔ Sends message → EDAAgent
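As a rough sketch of the profiling step, the function below classifies each column and counts missing values with pandas. The function name `profile_dataset`, the four type labels, and the sample CSV are assumptions made for illustration; the actual ProfilerAgent may report more or different fields.

```python
import io

import pandas as pd


def profile_dataset(df: pd.DataFrame) -> dict:
    """Return a per-column type label and missing-value count (illustrative only)."""
    columns = {}
    for name in df.columns:
        series = df[name]
        # Check bool before numeric, since pandas treats bool as numeric.
        if pd.api.types.is_bool_dtype(series):
            kind = "boolean"
        elif pd.api.types.is_numeric_dtype(series):
            kind = "numeric"
        elif pd.api.types.is_datetime64_any_dtype(series):
            kind = "datetime"
        else:
            kind = "categorical"
        columns[name] = {"kind": kind, "missing": int(series.isna().sum())}
    return {"n_rows": len(df), "n_columns": df.shape[1], "columns": columns}


# Tiny made-up dataset with one missing value in "age" and one in "city".
csv_text = "age,city,income\n34,Pune,52000\n,Delhi,61000\n29,,48000\n"
df = pd.read_csv(io.StringIO(csv_text))
report = profile_dataset(df)
```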

2️⃣ EDAAgent

✔ Creates correlations, histograms, outlier analysis

✔ Saves all plots via MCP FileTools

✔ Sends message → ModelAgent
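Two of the EDA steps listed above, correlations and outlier analysis, can be sketched with pandas alone. The IQR rule with a 1.5 multiplier is a common convention chosen here as an assumption; the README does not say which outlier method the EDAAgent uses, and the helper names are hypothetical. Plotting and saving through FileTools are left out.

```python
import pandas as pd


def correlation_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlations between numeric columns only."""
    return df.select_dtypes(include="number").corr()


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is an assumed default."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    mask = (series < q1 - k * iqr) | (series > q3 + k * iqr)
    return series[mask]


# Made-up data: the last score is far from the rest.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6],
    "score": [10, 20, 30, 40, 50, 400],
})
corr = correlation_matrix(df)
outliers = iqr_outliers(df["score"])
```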

3️⃣ ModelAgent

✔ Auto-detects task type (classification/regression)

✔ Builds full ML pipeline (imputation + scaling + encoding)

✔ Tunes models

✔ Saves best model

✔ Sends message → VerifierAgent
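The sketch below shows one way the task detection and the imputation + scaling + encoding pipeline could look in scikit-learn. The cardinality heuristic in `detect_task`, the choice of `LogisticRegression` and `LinearRegression`, and the sample data are all assumptions; the README does not state which rule or models the ModelAgent uses, and tuning and model saving are omitted here.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def detect_task(target: pd.Series, max_classes: int = 10) -> str:
    """Assumed heuristic: non-numeric or low-cardinality targets mean classification."""
    if not pd.api.types.is_numeric_dtype(target) or target.nunique() <= max_classes:
        return "classification"
    return "regression"


def build_pipeline(X: pd.DataFrame, task: str) -> Pipeline:
    numeric = X.select_dtypes(include="number").columns.tolist()
    categorical = [c for c in X.columns if c not in numeric]
    preprocess = ColumnTransformer([
        # Numeric columns: fill gaps with the median, then standardize.
        ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                          ("scale", StandardScaler())]), numeric),
        # Categorical columns: fill gaps with the mode, then one-hot encode.
        ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                          ("encode", OneHotEncoder(handle_unknown="ignore"))]), categorical),
    ])
    model = LogisticRegression(max_iter=1000) if task == "classification" else LinearRegression()
    return Pipeline([("preprocess", preprocess), ("model", model)])


# Made-up dataset with one missing value per feature column.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 51, 46, 38],
    "plan": ["basic", "pro", "pro", np.nan, "basic", "pro"],
    "churned": [0, 1, 1, 0, 0, 1],
})
X, y = df[["age", "plan"]], df["churned"]
task = detect_task(y)
pipeline = build_pipeline(X, task).fit(X, y)
predictions = pipeline.predict(X)
```

Keeping preprocessing inside the pipeline means the same imputation and encoding are applied at prediction time, which avoids leakage between training and evaluation data.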

4️⃣ VerifierAgent

✔ Validates model quality

✔ Computes quality tag (“Good”, “Acceptable”, “Weak”)

✔ Sends message → NotebookSynthesizerAgent
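A quality tag like the one above could be computed from a single score. The thresholds (0.85 and 0.70), the function names, and the shape of the `result` dict are illustrative assumptions; the README does not give the VerifierAgent's actual cut-offs or checks.

```python
def quality_tag(score: float, good: float = 0.85, acceptable: float = 0.70) -> str:
    """Map a score in [0, 1] (e.g. accuracy or R^2) to a tag. Thresholds are assumed."""
    if score >= good:
        return "Good"
    if score >= acceptable:
        return "Acceptable"
    return "Weak"


def verify(result: dict, dataset_columns: list) -> dict:
    """Tag the model and flag features the dataset no longer contains."""
    missing = [c for c in result["features"] if c not in dataset_columns]
    issues = [f"missing column: {c}" for c in missing]
    return {"quality": quality_tag(result["score"]), "issues": issues}


# Hypothetical model result checked against a dataset that lacks "plan".
report = verify({"features": ["age", "plan"], "score": 0.9}, ["age", "churned"])
```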

5️⃣ NotebookSynthesizerAgent

✔ Builds a full auto-generated Jupyter Notebook

✔ Embeds all results and images

✔ Saves notebook through FileTools
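Since the project lists nbformat among its tools, notebook assembly might look roughly like this. The `build_notebook` helper and its `(heading, text, code)` section format are hypothetical, and image embedding and saving through FileTools are not shown.

```python
import nbformat
from nbformat.v4 import new_code_cell, new_markdown_cell, new_notebook


def build_notebook(title: str, sections: list) -> nbformat.NotebookNode:
    """sections is a list of (heading, markdown_text, code) tuples; code may be empty."""
    cells = [new_markdown_cell(f"# {title}")]
    for heading, text, code in sections:
        cells.append(new_markdown_cell(f"## {heading}\n\n{text}"))
        if code:
            cells.append(new_code_cell(code))
    return new_notebook(cells=cells)


nb = build_notebook(
    "Auto-generated analysis",
    [
        ("Profile", "3 rows, 1 column with missing values.", "df.info()"),
        ("Verification", "Quality tag: Good", ""),
    ],
)
nbformat.validate(nb)                # raises if the structure is not valid nbformat v4
notebook_json = nbformat.writes(nb)  # serialized .ipynb content as a string
```

The resulting string can be written to a `.ipynb` file and opened in Jupyter.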

6️⃣ Gemini Integration

✔ Gemini generates:

-> Model explanations

-> Recommendations

-> Summaries

-> Plain-English explanations for beginners
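Before Gemini can explain anything, the pipeline's results have to be turned into a prompt. The sketch below shows one possible way to do that; the function name, its parameters, and the prompt wording are assumptions, and the call to the Gemini API itself is deliberately left out.

```python
def build_explanation_prompt(task: str, model_name: str, metrics: dict, quality: str) -> str:
    """Assemble a plain-text prompt from model results (hypothetical format)."""
    metric_lines = "\n".join(
        f"- {name}: {value:.3f}" for name, value in sorted(metrics.items())
    )
    return (
        "You are explaining machine learning results to a beginner.\n"
        f"Task type: {task}\n"
        f"Best model: {model_name}\n"
        f"Verifier quality tag: {quality}\n"
        "Metrics:\n"
        f"{metric_lines}\n"
        "In plain English, explain what these results mean, summarize the key "
        "findings, and recommend next steps."
    )


# Made-up results for illustration.
prompt = build_explanation_prompt(
    task="classification",
    model_name="LogisticRegression",
    metrics={"accuracy": 0.9123, "f1": 0.88},
    quality="Good",
)
```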

7️⃣ Streamlit UI

✔ Multi-page dashboard with:

-> Dataset Explorer

-> EDA Dashboard

-> AutoML Dashboard

-> Verifier & Notebook Builder

-> A2A Communications Console

Setup Instructions

Clone Repo

git clone https://github.com/yourusername/multiagent-data-analyst
cd multiagent-data-analyst
pip install -r requirements.txt

Add Gemini API Key

Create a .env file that holds your Gemini API key.

Run Streamlit

streamlit run streamlit_app/app.py

Demo (Screenshots)

Dataset Upload

EDA Dashboard

AutoML Results

Gemini Explanation

A2A Console

Notebook generated

Profiler Agent Output

Verifier Agent Output

Tools & Technologies Used

🚀 Multi-Agent: Custom Agents, A2A Bus

🚀 LLM: Gemini 1.5 Flash

🚀 UI: Streamlit

🚀 ML: Scikit-Learn

🚀 Storage: Custom MemoryTools

🚀 Notebook: nbformat

🚀 Deployment: Render

🚀 Visualization: Plotly, Matplotlib, Seaborn

🗂 Project Structure

multiagent-data-analyst/
│
├── src/
│   ├── agents/
│   ├── core/
│   ├── tools/
│   │   ├── file_tools.py
│   │   ├── dataset_tools.py
│   │   ├── memory_tools.py
│   │   ├── model_tools.py
│   │   └── notebook_tools.py
│
├── streamlit_app/
│   ├── app.py
│   └── pages/
│       ├── AutoML.py
│       ├── Profiler.py
│       ├── EDA_Dashboard.py
│       ├── Notebook_Report.py
│       ├── Verifier.py
│       └── A2A_Dashboard.py
│
├── streamlit_app_storage/
│   ├── memory/
│   ├── uploads/
│   └── reports/
│
└── README.md

Future Improvements

Add RAG-based “Data Question Answering Agent”
Add deployment on Google Cloud Run using Docker
Add Evaluation Agent for model fairness
Provide more AutoML models (XGBoost, LightGBM)
Add voice-based interaction mode

Credits

Built by Vaishnavi Sharma as part of Google x Kaggle – Agents Intensive

If you find this useful, ⭐ star the repo!


Repository: VaishnaviSh14/MultiAgent-Data-Analyst
