Machine Learning Process

The machine learning process involves the following steps:

  • 1- Data Preparation: Collect, clean, and preprocess data.
  • 2- Data Visualization and Analysis: Visualize and analyze data to identify patterns and relationships.
  • 3- Feature Engineering: Select and transform relevant variables in the data.
  • 4- Model Selection: Choose a model family suited to the problem and the data.
  • 5- Model Training: Fit the model to the training data by adjusting its parameters to minimize error.
  • 6- Hyperparameter Tuning: Search for the hyperparameter values that give the best validation performance.
  • 7- Model Evaluation: Measure accuracy, precision, recall, and other performance metrics on held-out data.
  • 8- Model Deployment: Integrate the model into an application and set up a pipeline to feed new data.
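
The steps above can be sketched end to end in a few lines of scikit-learn. This is only an illustration, not the notebooks' code: the data is synthetic (generated with `make_classification`), and the choice of `StandardScaler` plus `LogisticRegression` is an assumption made for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Data preparation: synthetic data stands in for a collected, cleaned dataset.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

# Hold out a test set so evaluation uses data the model has never seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Feature engineering + model selection: scale features, then a linear classifier.
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Model training: fit the whole pipeline on the training split only.
model.fit(X_train, y_train)

# Model evaluation: accuracy on the held-out split.
test_accuracy = model.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```

Wrapping the scaler and the classifier in one `Pipeline` keeps the scaling statistics from leaking out of the training split, and the fitted pipeline is a single object that can be saved for deployment.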

Machine Learning Tutorial

This tutorial covers the basics of Machine Learning using Python.

The repository includes Python notebooks, reference guides, and cheatsheets for the entire Machine Learning process:

  • 1- Data preprocessing and analysis: clean and transform data into a format suitable for analysis using NumPy and Pandas.
  • 2- Data visualization: understand and explore data visually using Matplotlib and Seaborn.
  • 3- Machine learning: explore various algorithms in Scikit-learn such as regression, classification, and clustering.
  • 4- Feature engineering: feature encoding, feature scaling, feature selection, etc.
  • 5- Model selection: comparison of ML algorithms, how to choose an ML algorithm, etc.
  • 6- Hyperparameter tuning: Grid Search, Random Search, and Bayesian Optimization.
  • 7- Model evaluation: validation methods, evaluation metrics, etc.
  • 8- Model explainability: feature importance, interpretable models, etc.

The repository also includes two Python notebooks with popular introductory examples to get started with Machine Learning:

  • Classification - Titanic Survival Prediction: Predict whether a passenger on the Titanic ship survived or not based on various features such as their age, gender, ticket class, and cabin location (notebook).
  • Regression - Boston House Price Prediction: Predict the median value of houses in Boston neighborhoods based on various features such as crime rate, number of rooms, proximity to employment centers, and accessibility to highways (notebook).

The final sections of the tutorial provide resources and links for practicing and advancing in Machine Learning:

  • The most popular ML dataset platforms.
  • The most popular ML competition platforms.
  • A guide to tackling ML competitions (PDF).

Requirements

Tools:

  • Python 3
  • Jupyter Notebook
  • Google Colab

Concepts:

Structure of the tutorial

  • 1-   Machine learning basic concepts
  • 2-   Read input data in Python
  • 3-   Data preprocessing and analysis: NumPy and Pandas
  • 4-   Data visualization: Matplotlib and Seaborn
  • 5-   Machine learning: Scikit-learn
  • 6-   Feature engineering
  • 7-   Model selection and parameter tuning
  • 8-   Model evaluation and explainability
  • 9-   Practice: Machine learning datasets
  • 10- Practice: Machine learning competitions

Content of the tutorial

1- Machine learning basic concepts

  • Presentation on Machine learning basic concepts (PDF)

2- Read input data in Python

  • Tutorial on reading various data sources into a DataFrame (notebook)
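
As a rough idea of what reading different sources into a DataFrame looks like, here is a minimal sketch using pandas. The CSV and JSON contents are made-up in-memory strings (so the example runs without any files); with real data you would pass a file path or URL instead.

```python
import io

import pandas as pd

# Hypothetical sample data; in practice these would be files on disk.
csv_text = "name,age,fare\nAlice,29,71.3\nBob,,8.05\n"
json_text = '[{"name": "Carol", "age": 41, "fare": 13.0}]'

# read_csv and read_json both accept file paths or file-like objects.
df_csv = pd.read_csv(io.StringIO(csv_text))
df_json = pd.read_json(io.StringIO(json_text))

# Stack the two sources into one DataFrame with a fresh index.
df = pd.concat([df_csv, df_json], ignore_index=True)

print(df)
print(df.dtypes)
```

The empty `age` field for Bob is read as a missing value (`NaN`), which is why that column ends up with a floating-point type.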

3- Data preprocessing and analysis: NumPy and Pandas

  • NumPy cheatsheet (PDF)
  • Pandas cheatsheet (PDF)
  • NumPy and Pandas tutorial (notebook)
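
A small sketch of typical preprocessing with NumPy and Pandas follows. The column names and values are invented for illustration (loosely modeled on Titanic-style features) and are not taken from the notebooks.

```python
import numpy as np
import pandas as pd

# Hypothetical passenger records with one missing age.
df = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, 58.0],
    "fare": [7.25, 71.28, 8.05, 26.55],
    "sex": ["male", "female", "female", "male"],
})

# Clean: replace the missing age with the median of the observed ages.
df["age"] = df["age"].fillna(df["age"].median())

# Transform: log(1 + fare) compresses the long right tail of the fares.
df["log_fare"] = np.log1p(df["fare"])

# Analyze: average fare per group.
mean_fare_by_sex = df.groupby("sex")["fare"].mean()

print(df)
print(mean_fare_by_sex)
```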

4- Data visualization: Matplotlib and Seaborn

  • Chart chooser (PDF)
  • Matplotlib cheatsheet (PDF)
  • Matplotlib tutorial (WEB)
  • Seaborn tutorial (WEB)

5- Machine learning: Scikit-learn

  • Machine learning map (PDF)
  • Scikit-learn cheatsheet (PDF)
  • Scikit-learn tutorial (notebook)
  • Classification: Titanic Survival Prediction (notebook)
  • Regression: Boston House Price Prediction (notebook)

6- Feature engineering

  • Feature engineering cheatsheet (PDF)
  • Feature engineering tutorial (notebook)
  • Feature selection methods (IMG)
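
To make feature encoding and feature scaling concrete, here is a minimal sketch with scikit-learn's `ColumnTransformer`. The tiny DataFrame and its column names are hypothetical, and feature selection is left out to keep the example short.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical data: one categorical column and two numeric columns.
df = pd.DataFrame({
    "ticket_class": ["first", "third", "second", "third"],
    "age": [38.0, 22.0, 27.0, 54.0],
    "fare": [71.28, 7.25, 13.0, 8.05],
})

preprocess = ColumnTransformer(
    [
        # Encoding: one 0/1 column per category.
        ("encode", OneHotEncoder(handle_unknown="ignore"), ["ticket_class"]),
        # Scaling: zero mean and unit variance per numeric column.
        ("scale", StandardScaler(), ["age", "fare"]),
    ],
    sparse_threshold=0,  # always return a dense NumPy array
)

features = preprocess.fit_transform(df)

# 3 one-hot columns followed by 2 scaled columns.
print(features.shape)
print(features.round(2))
```

Applying different transformers to different columns in one object makes it easy to reuse exactly the same preprocessing on new data with `preprocess.transform(...)`.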

7- Model selection and parameter tuning

  • Comparison of ML algorithms 1 (PDF)
  • Comparison of ML algorithms 2 (IMG)
  • How to choose an ML algorithm (IMG)
  • Hyperparameter tuning (WEB)
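
As a hedged illustration of hyperparameter tuning, the sketch below compares grid search and random search in scikit-learn on synthetic data. The model (`RandomForestClassifier`) and the candidate values are arbitrary choices for the example. Bayesian optimization is omitted because it requires a third-party library.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Synthetic data standing in for a real training set.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# Candidate values: 2 x 3 = 6 combinations in total.
param_grid = {"n_estimators": [10, 30], "max_depth": [2, 4, None]}

# Grid search: evaluates every combination with 3-fold cross-validation.
grid = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
grid.fit(X, y)

# Random search: evaluates only n_iter randomly sampled combinations.
rand = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    n_iter=3,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

print("grid search best:  ", grid.best_params_, round(grid.best_score_, 3))
print("random search best:", rand.best_params_, round(rand.best_score_, 3))
```

Grid search is exhaustive but its cost grows multiplicatively with each added hyperparameter, while random search fixes the budget up front through `n_iter`.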

8- Model evaluation and explainability

  • Evaluation metrics cheatsheet (PDF)
  • Evaluation metrics in Python (WEB)
  • Model explainability cheatsheet (PDF)
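
For a quick feel of the common classification metrics, here is a small worked example with `sklearn.metrics`. The label vectors are made up so that the counts are easy to verify by hand: 3 true positives, 3 true negatives, 1 false positive, and 1 false negative.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical ground truth and predictions for 8 samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / total = 6 / 8
precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 3 / 4
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 3 / 4
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)

print(accuracy, precision, recall, f1)
print(cm)
```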

9- Practice: Machine learning datasets

10- Practice: Machine learning competitions