AI Data Science Team is a Python library of specialized agents for common data science workflows, plus a flagship app: AI Pipeline Studio. The Studio turns your work into a visual, reproducible pipeline, while the AI team handles data loading, cleaning, visualization, and modeling.
Status: Beta. Breaking changes may occur until 0.1.0.
Please ⭐ us on GitHub (it takes 2 seconds and means a lot).
AI Pipeline Studio is the main example of the AI Data Science Team in action.
Highlights:
- Pipeline-first workspace: Visual Editor, Table, Chart, EDA, Code, Model, Predictions, MLflow
- Manual + AI steps with lineage and reproducible scripts
- Multi-dataset handling and merge workflows
- Project saves: metadata-only or full-data
- Storage footprint controls and rehydrate workflows
Run it:
    streamlit run apps/ai-pipeline-studio-app/app.py

Full app docs: apps/ai-pipeline-studio-app/README.md
- Python 3.10+
- OpenAI API key (or Ollama for local models)
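The two model options above can be selected at runtime. The following is a minimal sketch, not part of the library: `choose_llm_backend` and `build_llm` are hypothetical helper names, and the rule "use OpenAI when `OPENAI_API_KEY` is set, otherwise fall back to Ollama" is an assumption made for illustration. The model names are the ones used elsewhere in this README.

```python
import os


def choose_llm_backend(env=None):
    """Pick a backend name and model from the environment (hypothetical helper).

    Assumption: an OPENAI_API_KEY variable means OpenAI should be used;
    otherwise a local Ollama model is used.
    """
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return "openai", "gpt-4.1-mini"
    return "ollama", "llama3.1:8b"


def build_llm(env=None):
    """Create a LangChain chat model for the chosen backend."""
    backend, model = choose_llm_backend(env)
    if backend == "openai":
        # Imported lazily so that only the package for the chosen backend is needed.
        from langchain_openai import ChatOpenAI

        return ChatOpenAI(model=model)

    from langchain_ollama import ChatOllama

    return ChatOllama(model=model)
```

Calling `build_llm()` returns an object that can be passed wherever the examples below use `llm`. The Ollama branch assumes `ollama serve` is already running and the model has been pulled.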
Clone the repo and install in editable mode:
    pip install -e .
    streamlit run apps/ai-pipeline-studio-app/app.py

The repository includes both the AI Pipeline Studio app and the underlying AI Data Science Team library. The library provides agent building blocks and multi-agent workflows for:
- Data loading and inspection
- Cleaning, wrangling, and feature engineering
- Visualization and EDA
- Modeling and evaluation (H2O + MLflow tools)
- SQL database interaction
Agent examples live in examples/. Notable agents:
- Data Loader Tools Agent
- Data Wrangling Agent
- Data Cleaning Agent
- Data Visualization Agent
- EDA Tools Agent
- Feature Engineering Agent
- SQL Database Agent
- H2O ML Agent
- MLflow Tools Agent
- Multi-agent workflows (e.g., Pandas Data Analyst, SQL Data Analyst)
- Supervisor Agent (oversees other agents)
- Custom tools for data science tasks
See all apps in apps/. Notable apps:
- AI Pipeline Studio: apps/ai-pipeline-studio-app/
- EDA Explorer App: apps/exploratory-copilot-app/
- Pandas Data Analyst App: apps/pandas-data-analyst-app/
Use an OpenAI model:

    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(
        model_name="gpt-4.1-mini",
    )

Or run a local model with Ollama. Start the server and pull a model:

    ollama serve
    ollama pull llama3.1:8b

Then create the chat model:

    from langchain_ollama import ChatOllama

    llm = ChatOllama(
        model="llama3.1:8b",
    )

Want to learn how to build AI agents and AI apps for real data science workflows? Join my next-gen AI workshop: https://learn.business-science.io/ai-register
