A robust Python tool for comprehensive dataset analysis and machine learning model evaluation. This project automates the process of data preprocessing, exploratory data analysis (EDA), and predictive modeling, with a focus on handling common data inconsistencies.

Takk8IS/DatasetAnalysisEDA

Dataset Analysis EDA 📊

Dataset Analysis EDA is a Python-based tool for exploratory data analysis (EDA) and machine learning model evaluation. It reads CSV and Excel datasets, preprocesses them, computes statistical summaries and correlations, evaluates a Random Forest classifier with cross-validation, and plots the results.

🌟 Key Features

  • 📄 Multi-format Data Processing: Handle various file formats including CSV and Excel.
  • 🧹 Automated Data Preprocessing: Includes grammar correction, handling of missing values, and feature encoding.
  • 📊 Comprehensive EDA: Generates statistical summaries, correlation analyses, and various visualizations.
  • 🤖 Machine Learning Model Evaluation: Implements Random Forest classification with cross-validation.
  • 📈 Feature Importance Analysis: Provides insights into the most influential features in the dataset.
  • 📉 Advanced Visualizations: Includes histograms, heatmaps, confusion matrices, and feature importance plots.
  • 🛠️ Robust Error Handling: Comprehensive error management to ensure smooth operation with various datasets.
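The preprocessing, cross-validation, and feature-importance steps listed above can be sketched roughly as follows. This is a minimal illustration using pandas and scikit-learn, not the code in DatasetAnalysis.py: the function names `preprocess` and `evaluate_random_forest`, the toy columns, and the median/mode imputation strategy are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def preprocess(df, target):
    """Fill missing values and one-hot encode categorical features.

    Hypothetical helper: median/mode imputation is an assumed strategy.
    """
    X = df.drop(columns=[target]).copy()
    y = df[target]
    for col in X.columns:
        if pd.api.types.is_numeric_dtype(X[col]):
            X[col] = X[col].fillna(X[col].median())
        else:
            X[col] = X[col].fillna(X[col].mode().iloc[0])
    # One-hot encode the remaining non-numeric columns as floats.
    X = pd.get_dummies(X, dtype=float)
    return X, y


def evaluate_random_forest(X, y, folds=5, seed=0):
    """Cross-validate a Random Forest, then fit it to rank feature importances."""
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    scores = cross_val_score(model, X, y, cv=folds)
    model.fit(X, y)
    importances = pd.Series(model.feature_importances_, index=X.columns)
    return scores, importances.sort_values(ascending=False)


# Toy dataset (invented for this sketch) with some missing values.
rng = np.random.default_rng(0)
n = 100
toy = pd.DataFrame({
    "age": rng.integers(18, 70, n).astype(float),
    "city": rng.choice(["A", "B", "C"], n),
})
toy["label"] = (toy["age"] > 40).astype(int)
toy.loc[::10, "age"] = np.nan
toy.loc[5::20, "city"] = None

X, y = preprocess(toy, "label")
scores, importances = evaluate_random_forest(X, y)
print("Mean CV accuracy:", scores.mean())
print(importances)
```

Fitting the model once more on the full data after cross-validation is one common way to obtain a single importance ranking; the actual script may do this differently.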

📦 Project Structure

├── AUTHORS.md
├── DatasetAnalysis.py
├── FUNDING.yml
├── INFO.md
├── LICENSE.md
├── PRIVACY.md
├── PlanilhaModelagem.csv
├── PlanilhaModelagem.xlsx
├── README.md
├── images
│   ├── screenshot-01.png
│   ├── screenshot-02.png
│   ├── screenshot-03.png
│   ├── screenshot-04.png
│   ├── screenshot-05.png
│   ├── screenshot-06.png
│   ├── screenshot-07.png
│   ├── screenshot-08.png
│   ├── screenshot-09.png
│   ├── screenshot-10.png
│   ├── screenshot-11.png
│   ├── screenshot-12.png
│   ├── screenshot-13.png
│   └── screenshot-14.png
└── requirements.txt

🏃‍♂️ How to Use

  1. Clone the Repository:

    git clone https://github.com/Takk8IS/DatasetAnalysisEDA.git
    cd DatasetAnalysisEDA
  2. Install Dependencies:

    pip install -r requirements.txt
  3. Run the Analysis:

    python DatasetAnalysis.py PlanilhaModelagem.xlsx
  4. Review the Results:

    • The script generates several plots and prints the analysis results to the console.
    • Review the generated visualizations for insights into your dataset.
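For readers who want to see how a file passed on the command line might be loaded and summarized, here is a hedged sketch. The helper names `load_dataset` and `summarize` are hypothetical and do not come from DatasetAnalysis.py; the sketch only assumes that the input is a CSV or Excel file, as stated in the feature list.

```python
import sys
from pathlib import Path

import pandas as pd


def load_dataset(path):
    """Read a CSV or Excel file into a DataFrame, chosen by file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".xlsx", ".xls"}:
        # Reading Excel requires an engine such as openpyxl to be installed.
        return pd.read_excel(path)
    raise ValueError(f"Unsupported file format: {suffix!r}")


def summarize(df):
    """Return descriptive statistics and the correlation matrix of numeric columns."""
    stats = df.describe(include="all")
    corr = df.corr(numeric_only=True)
    return stats, corr


# Only runs when a single file path is given, e.g.:
#   python sketch.py PlanilhaModelagem.csv
if __name__ == "__main__" and len(sys.argv) == 2:
    stats, corr = summarize(load_dataset(sys.argv[1]))
    print(stats)
    print(corr)
```

Dispatching on the extension keeps the loader small and makes unsupported formats fail early with a clear message.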

Contributing

We welcome contributions from the community! If you'd like to contribute, please:

  1. Fork the repository.
  2. Create your feature branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add some AmazingFeature').
  4. Push to the branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

Donations

If this project has been helpful, consider making a donation:

USDT (TRC-20): TGpiWetnYK2VQpxNGPR27D9vfM6Mei5vNA

Your support helps us continue to develop innovative data analysis tools.

License

This project is licensed under the CC-BY-4.0 License. See the LICENSE.md file for more details.

About Takk™ Innovate Studio

Leading the Digital Revolution as the Pioneering 100% Artificial Intelligence Team.
