Itachi4500/TY-Project


Data Analysis Assistant is an interactive, modular data-science tool built with Streamlit. It lets users upload datasets, clean data, explore insights, create visualizations, engineer features, train ML models, and export reports, all inside a single dashboard.

This project is designed to give a Power BI–style experience within Python, focusing on:

Modular utilities for maintainability

Interactive dashboards

Automated EDA + visualizations

ML model building & evaluation

Data export options

Real-time insights

Whether you're a data science beginner or building production-grade applications, this tool provides a smooth, guided workflow from raw data → insights → ML predictions.

✨ Features

🔹 1. Dataset Upload & Overview

Supports CSV, Excel, and other tabular formats

Displays dataset preview, shape, datatypes

Column-level statistics
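
As a rough illustration of the upload and overview step, the sketch below reads a tabular file with pandas and collects the shape, datatypes, and column-level statistics. The helper names `load_table` and `overview` are hypothetical and are not taken from this repository; in the Streamlit app the buffer would presumably come from `st.file_uploader`.

```python
import io

import pandas as pd


def load_table(buffer, filename: str) -> pd.DataFrame:
    """Read a CSV or Excel upload into a DataFrame (hypothetical helper)."""
    name = filename.lower()
    if name.endswith(".csv"):
        return pd.read_csv(buffer)
    if name.endswith((".xls", ".xlsx")):
        # Reading Excel needs an engine such as openpyxl to be installed.
        return pd.read_excel(buffer)
    raise ValueError(f"Unsupported file type: {filename}")


def overview(df: pd.DataFrame) -> dict:
    """Collect the preview information described above (hypothetical helper)."""
    return {
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isna().sum().to_dict(),
        "stats": df.describe(include="all"),
    }


# Small in-memory sample standing in for an uploaded file.
sample_csv = io.StringIO("age,city\n31,Pune\n45,Mumbai\n,Pune\n")
df = load_table(sample_csv, "sample.csv")
summary = overview(df)
print(summary["shape"])
print(summary["missing"])
```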

🔹 2. Data Cleaning Module

Handle missing values

Outlier detection

Duplicate handling

Label encoding & standard scaling
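
The cleaning steps listed above could look roughly like the following. This is a minimal sketch under stated assumptions: median/mode imputation and the 1.5 × IQR outlier rule are common defaults chosen for illustration, not necessarily what this project uses, and the function names are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler


def clean(df, numeric_cols, categorical_cols):
    """Drop duplicate rows, then impute missing values (hypothetical helper)."""
    out = df.drop_duplicates().copy()
    for col in numeric_cols:
        out[col] = out[col].fillna(out[col].median())
    for col in categorical_cols:
        out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


def iqr_outlier_mask(series, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


raw = pd.DataFrame({
    "income": [40.0, 42.0, None, 41.0, 43.0, 400.0, 40.0],
    "segment": ["a", "b", "a", None, "b", "a", "a"],
})

cleaned = clean(raw, numeric_cols=["income"], categorical_cols=["segment"])
outliers = iqr_outlier_mask(cleaned["income"])

# Label-encode the categorical column and standardize the numeric one.
cleaned["segment_code"] = LabelEncoder().fit_transform(cleaned["segment"])
cleaned["income_scaled"] = (
    StandardScaler().fit_transform(cleaned[["income"]]).ravel()
)
print(cleaned)
print("outlier rows:", int(outliers.sum()))
```

Flagging outliers with a boolean mask rather than dropping them leaves the decision (remove, cap, or keep) to the user.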

🔹 3. Exploratory Data Analysis (EDA)

Automated profiling

Correlation heatmaps

Distribution plots

Summary insights
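
One way to turn a correlation matrix into a "summary insight" is to list the column pairs whose absolute correlation exceeds a threshold. The sketch below assumes Pearson correlation and a hypothetical helper `strongest_correlations`; the project's actual profiling logic is not shown in this README.

```python
import pandas as pd


def strongest_correlations(df: pd.DataFrame, threshold: float = 0.8) -> pd.DataFrame:
    """Return numeric column pairs with |Pearson r| >= threshold (hypothetical helper)."""
    corr = df.select_dtypes(include="number").corr()
    cols = list(corr.columns)
    rows = []
    # Visit each unordered pair once (upper triangle of the matrix).
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            r = corr.loc[a, b]
            if abs(r) >= threshold:
                rows.append({"left": a, "right": b, "r": float(r)})
    rows.sort(key=lambda row: abs(row["r"]), reverse=True)
    return pd.DataFrame(rows, columns=["left", "right", "r"])


data = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2.0, 4.0, 6.0, 8.0, 10.5],
    "z": [5, 3, 4, 2, 1],
    "label": ["a", "b", "a", "b", "a"],
})

strong = strongest_correlations(data, threshold=0.95)
skewness = data.select_dtypes(include="number").skew()
print(strong)
print(skewness)
```

The same `corr` matrix is what a correlation heatmap would display; the pair list is simply a textual view of its strongest cells.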

🔹 4. Visualization Dashboard

Multiple chart types (bar, line, scatter, heatmaps, boxplots)

Customizable parameters

High-quality Matplotlib/Seaborn charts
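
A chart selector of this kind can be sketched as a single function that dispatches on the chart type. The function name `draw_chart` and its parameters are assumptions made for illustration; heatmaps are omitted to keep the example short. In a Streamlit page, the returned figure could be displayed with `st.pyplot(fig)`.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def draw_chart(df, kind, x, y=None, title=""):
    """Draw one of a few chart types on a new figure (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(6, 4))
    if kind == "bar":
        # Mean of y per category of x.
        df.groupby(x)[y].mean().plot.bar(ax=ax)
    elif kind == "line":
        ax.plot(df[x], df[y])
    elif kind == "scatter":
        ax.scatter(df[x], df[y])
    elif kind == "box":
        ax.boxplot(df[x].dropna())
    else:
        raise ValueError(f"Unsupported chart type: {kind}")
    ax.set_title(title)
    ax.set_xlabel(x)
    if y is not None:
        ax.set_ylabel(y)
    fig.tight_layout()
    return fig


sales = pd.DataFrame({
    "month": [1, 2, 3, 4],
    "revenue": [10.0, 12.5, 9.0, 15.0],
})
fig = draw_chart(sales, "scatter", x="month", y="revenue", title="Revenue by month")
```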

🔹 5. Feature Engineering

Train-test split

Class weighting

Feature selection/encoding
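
For the train-test split and class weighting, a minimal sketch with scikit-learn follows. It assumes a stratified 80/20 split and "balanced" weights, which are illustrative choices rather than settings confirmed by this README.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight

# Toy imbalanced dataset: 15 samples of class 0, 5 samples of class 1.
X = np.arange(40).reshape(20, 2)
y = np.array([0] * 15 + [1] * 5)

# Stratifying keeps the class ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# "balanced" weights are n_samples / (n_classes * class_count).
classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weight = dict(zip(classes.tolist(), weights.tolist()))
print(class_weight)
```

The resulting dictionary can be passed to estimators that accept a `class_weight` argument, such as `LogisticRegression` or `RandomForestClassifier`.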

🔹 6. Machine Learning Module

Supports classification & regression models

Classification metrics: accuracy, F1, recall, precision

Regression metrics: MSE, RMSE, MAE

Confusion matrix & classification reports
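
The evaluation metrics listed above can all be computed with `sklearn.metrics`. The sketch below uses two hypothetical helpers and tiny hand-made predictions, and it assumes binary classification for the precision, recall, and F1 scores.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    mean_absolute_error,
    mean_squared_error,
    precision_score,
    recall_score,
)


def classification_metrics(y_true, y_pred):
    """Binary classification scores plus the confusion matrix (hypothetical helper)."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }


def regression_metrics(y_true, y_pred):
    """MSE, RMSE, and MAE for a regression model (hypothetical helper)."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "mae": mean_absolute_error(y_true, y_pred),
    }


clf_scores = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 0, 1])
reg_scores = regression_metrics([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
print(clf_scores)
print(reg_scores)
```

For multiclass targets, `precision_score`, `recall_score`, and `f1_score` need an explicit `average` argument such as `"macro"` or `"weighted"`.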

🔹 7. Export Manager

Export cleaned dataset

Export ML model

Export EDA report

Power BI pipeline integration
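
A minimal sketch of the export step, assuming the cleaned dataset is written as CSV (a format Power BI can import) and the model is serialized with `pickle`. The helper name `export_artifacts` and the file names are hypothetical; the README does not describe how the Power BI integration is actually implemented.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.linear_model import LogisticRegression


def export_artifacts(df: pd.DataFrame, model, out_dir):
    """Write the cleaned dataset and the fitted model to disk (hypothetical helper)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    csv_path = out / "cleaned_dataset.csv"
    model_path = out / "model.pkl"
    df.to_csv(csv_path, index=False)
    with open(model_path, "wb") as fh:
        pickle.dump(model, fh)
    return csv_path, model_path


frame = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0], "y": [0, 0, 1, 1]})
model = LogisticRegression().fit(frame[["x"]], frame["y"])

with tempfile.TemporaryDirectory() as tmp:
    csv_path, model_path = export_artifacts(frame, model, tmp)
    # Round-trip check: reload both artifacts and compare with the originals.
    reloaded_frame = pd.read_csv(csv_path)
    with open(model_path, "rb") as fh:
        reloaded_model = pickle.load(fh)
    same_predictions = (
        reloaded_model.predict(frame[["x"]]) == model.predict(frame[["x"]])
    ).all()
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.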