Data Analysis Assistant is an interactive, modular data science tool built with Streamlit. It lets users upload datasets, clean data, explore insights, create visualizations, engineer features, train ML models, and export reports, all inside a single dashboard.
This project is designed to give a Power BI–style experience within Python, focusing on:
Modular utilities for maintainability
Interactive dashboards
Automated EDA + visualizations
ML model building & evaluation
Data export options
Real-time insights
Whether you're new to data science or building production-grade applications, this tool provides a guided workflow from raw data → insights → ML predictions.
✨ Features
🔹 1. Dataset Upload & Overview
Supports CSV, Excel, and other tabular formats
Displays a dataset preview, shape, and data types
Column-level statistics
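The upload and overview step can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: `load_table` and `overview` are hypothetical helper names, and the in-memory CSV stands in for an uploaded file. In a Streamlit app, the object returned by `st.file_uploader` is file-like and exposes a `.name` attribute, so it could be passed in the same way.

```python
import io

import pandas as pd


def load_table(file_obj, filename: str) -> pd.DataFrame:
    """Read an uploaded file into a DataFrame, choosing a reader by extension.

    Hypothetical helper: the real project may dispatch differently.
    """
    name = filename.lower()
    if name.endswith(".csv"):
        return pd.read_csv(file_obj)
    if name.endswith((".xlsx", ".xls")):
        # Requires an Excel engine (for example openpyxl) to be installed.
        return pd.read_excel(file_obj)
    raise ValueError(f"Unsupported file type: {filename}")


def overview(df: pd.DataFrame) -> dict:
    """Collect the preview, shape, data types, and column-level statistics."""
    return {
        "preview": df.head(),
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "stats": df.describe(include="all"),
    }


# Stand-in for an uploaded file.
sample = io.StringIO("age,city\n31,Pune\n45,Delhi\n28,Pune\n")
df = load_table(sample, "sample.csv")
info = overview(df)
print(info["shape"])
print(info["dtypes"])
```

Keeping the loader separate from the overview makes it easy to add further tabular formats without touching the display logic.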
🔹 2. Data Cleaning Module
Handle missing values
Outlier detection
Duplicate handling
Label encoding & standard scaling
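As a rough sketch of how these cleaning steps could be chained with pandas and scikit-learn: the `clean` function, the median/mode imputation, and the 1.5 × IQR outlier rule are all assumptions made for illustration, since the README does not say which strategies the module uses.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler


def clean(df: pd.DataFrame, numeric_cols: list, categorical_cols: list) -> pd.DataFrame:
    """Hypothetical cleaning pipeline: dedupe, impute, flag outliers, encode, scale."""
    out = df.drop_duplicates().copy()

    # Missing values: median for numeric columns, most frequent value for categorical.
    for col in numeric_cols:
        out[col] = out[col].astype(float)
        out[col] = out[col].fillna(out[col].median())
    for col in categorical_cols:
        out[col] = out[col].fillna(out[col].mode().iloc[0])

    # Outlier detection: flag values outside 1.5 * IQR (assumed rule).
    for col in numeric_cols:
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[f"{col}_outlier"] = (out[col] < q1 - 1.5 * iqr) | (out[col] > q3 + 1.5 * iqr)

    # Label encoding: map each category to an integer code.
    for col in categorical_cols:
        out[col] = LabelEncoder().fit_transform(out[col])

    # Standard scaling: zero mean and unit variance per numeric column.
    out[numeric_cols] = StandardScaler().fit_transform(out[numeric_cols])
    return out


raw = pd.DataFrame(
    {
        "age": [25, 30, None, 30, 200],
        "city": ["Pune", "Delhi", "Pune", "Delhi", None],
    }
)
cleaned = clean(raw, numeric_cols=["age"], categorical_cols=["city"])
print(cleaned)
```

Outliers are flagged before scaling so that the thresholds are computed in the original units; whether to drop or cap the flagged rows is left to the caller.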
🔹 3. Exploratory Data Analysis (EDA)
Automated profiling
Correlation heatmaps
Distribution plots
Summary insights
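One way a summary insight could be derived from the correlation matrix is sketched below. The `top_correlations` helper and the 0.8 threshold are illustrative assumptions, not part of the project's documented API.

```python
import numpy as np
import pandas as pd


def top_correlations(df: pd.DataFrame, threshold: float = 0.8) -> pd.Series:
    """Return numeric column pairs whose absolute Pearson correlation meets the threshold."""
    corr = df.select_dtypes(include="number").corr()
    # Keep only the strict upper triangle so each pair appears once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    # NaN entries (masked cells) compare as False and are filtered out here.
    strong = pairs[pairs.abs() >= threshold]
    return strong.sort_values(key=lambda s: s.abs(), ascending=False)


data = pd.DataFrame(
    {
        "x": [1, 2, 3, 4, 5],
        "y": [2, 4, 6, 8, 10],
        "z": [5, 1, 4, 2, 3],
    }
)
strong = top_correlations(data)
print(strong)
```

The same `corr` matrix is what a heatmap would display; the helper simply turns it into a short, readable list of the strongest relationships.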
🔹 4. Visualization Dashboard
Multiple chart types (bar, line, scatter, heatmaps, boxplots)
Customizable parameters
High-quality Matplotlib/Seaborn charts
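A chart builder along these lines could back the dashboard. This is a sketch using Matplotlib only; `make_chart` and its parameters are hypothetical names, and the chart types are limited to a few of those listed above. In a Streamlit app the returned figure could be displayed with `st.pyplot(fig)`.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts and tests

import matplotlib.pyplot as plt
import pandas as pd


def make_chart(df: pd.DataFrame, kind: str, x: str, y: str = None, title: str = ""):
    """Build a Matplotlib figure for the requested chart type (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(6, 4))
    if kind == "bar":
        # Mean of y for each category in x.
        means = df.groupby(x)[y].mean()
        ax.bar(means.index.astype(str), means.to_numpy())
    elif kind == "line":
        ax.plot(df[x], df[y])
    elif kind == "scatter":
        ax.scatter(df[x], df[y])
    elif kind == "box":
        # A boxplot of a single numeric column; y is not used.
        ax.boxplot(df[x].dropna().to_numpy())
    else:
        raise ValueError(f"Unsupported chart type: {kind}")
    ax.set_xlabel(x)
    if y is not None and kind != "box":
        ax.set_ylabel(y)
    ax.set_title(title)
    fig.tight_layout()
    return fig


points = pd.DataFrame({"x": [1, 2, 3, 4], "y": [3, 1, 4, 2]})
fig = make_chart(points, "scatter", "x", "y", title="y vs x")
```

Returning the figure rather than calling `plt.show()` keeps the function usable both in a dashboard and in exported reports.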
🔹 5. Feature Engineering
Train-test split
Class weighting
Feature selection/encoding
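The split and class-weighting steps might look like the following with scikit-learn. The toy arrays, the 80/20 split, and the "balanced" weighting scheme are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight

# Toy imbalanced dataset: 15 samples of class 0 and 5 of class 1.
X = np.arange(40).reshape(20, 2)
y = np.array([0] * 15 + [1] * 5)

# Stratify so both splits keep the same class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# "balanced" weights are n_samples / (n_classes * class_count).
classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weight = dict(zip(classes.tolist(), weights.tolist()))
print(class_weight)
```

The resulting dictionary can be passed as the `class_weight` argument of scikit-learn estimators that support it, such as `LogisticRegression`.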
🔹 6. Machine Learning Module
Supports classification & regression models
Classification metrics: accuracy, precision, recall, F1
Regression metrics: MSE, RMSE, MAE
Confusion matrix & classification reports
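The metrics listed above can be computed with `sklearn.metrics`. The two helper functions below are hypothetical wrappers, and the labels and predictions are made-up values used only to show the calls.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    mean_absolute_error,
    mean_squared_error,
    precision_score,
    recall_score,
)


def classification_metrics(y_true, y_pred) -> dict:
    """Accuracy, precision, recall, F1, and confusion matrix for binary labels."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }


def regression_metrics(y_true, y_pred) -> dict:
    """MSE, RMSE (square root of MSE), and MAE."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "mae": mean_absolute_error(y_true, y_pred),
    }


clf = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 0, 1])
reg = regression_metrics([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
print(clf)
print(reg)
```

For multi-class problems, `precision_score`, `recall_score`, and `f1_score` need an explicit `average` argument (for example `"macro"`).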
🔹 7. Export Manager
Export cleaned dataset
Export ML model
Export EDA report
Power BI pipeline integration
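A minimal sketch of the export step, assuming CSV for the dataset, joblib for the model, and an HTML table for the report. The `export_all` helper and the file names are hypothetical, and the report here is only `DataFrame.describe()` output rather than a full EDA report. The README does not describe how the Power BI pipeline works, so it is not modeled; the exported CSV is simply a format that Power BI Desktop can import.

```python
import tempfile
from pathlib import Path

import joblib
import pandas as pd
from sklearn.linear_model import LinearRegression


def export_all(df: pd.DataFrame, model, out_dir) -> dict:
    """Write the cleaned data, the fitted model, and a summary report to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {
        "data": out / "cleaned.csv",        # hypothetical file names
        "model": out / "model.joblib",
        "report": out / "eda_report.html",
    }
    df.to_csv(paths["data"], index=False)
    joblib.dump(model, paths["model"])
    paths["report"].write_text(df.describe().to_html(), encoding="utf-8")
    return paths


frame = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [2.0, 4.0, 6.0]})
model = LinearRegression().fit(frame[["x"]], frame["y"])

with tempfile.TemporaryDirectory() as tmp:
    paths = export_all(frame, model, tmp)
    # Reload the artifacts to confirm the round trip works.
    restored = joblib.load(paths["model"])
    round_trip = pd.read_csv(paths["data"])
    report_html = paths["report"].read_text(encoding="utf-8")
    prediction = float(restored.predict(pd.DataFrame({"x": [4.0]}))[0])

print(prediction)
```

Models saved with joblib should only be loaded from trusted sources and with a compatible scikit-learn version.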