hmm29/machine-learning-projects

Machine Learning Projects

This repository contains a collection of small machine learning projects covering data preprocessing, visualization, and feature selection techniques.

Project Structure

  • preprocessing.ipynb: Data preprocessing techniques, including:
    • Data loading and exploration
    • Min-Max scaling with sklearn
    • Standardization (zero mean, unit standard deviation)
  • plots.ipynb: Data visualization techniques, including:
    • Basic data shape and statistics
    • Skewness analysis
    • Histograms for feature distributions
  • feature_selection.ipynb: Feature selection and extraction techniques, including:
    • Univariate Feature Selection
    • Recursive Feature Elimination (RFE)
    • Principal Component Analysis (PCA)
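
As a rough illustration of the scaling and feature selection techniques listed above, here is a minimal sketch using scikit-learn. It is not taken from the notebooks: the synthetic data from `make_classification`, the `f_classif` scoring function, the `LogisticRegression` estimator inside RFE, and the values of `k`, `n_features_to_select`, and `n_components` are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in data (assumption): 200 rows, 8 features, binary target.
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=4, n_redundant=2, random_state=0
)

# Min-Max scaling: rescale every feature to the [0, 1] range.
X_minmax = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)

# Standardization: shift and scale every feature to mean 0 and std 1.
X_std = StandardScaler().fit_transform(X)

# Univariate selection: keep the 4 features with the highest ANOVA F-score.
selector = SelectKBest(score_func=f_classif, k=4)
X_kbest = selector.fit_transform(X, y)

# RFE: repeatedly fit the model and drop the weakest feature until 3 remain.
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=3)
rfe.fit(X_std, y)

# PCA: project the standardized features onto 3 principal components.
pca = PCA(n_components=3)
X_pca = pca.fit_transform(X_std)

print("Min-Max range:", X_minmax.min(), "to", X_minmax.max())
print("SelectKBest kept columns:", np.flatnonzero(selector.get_support()))
print("RFE kept columns:", np.flatnonzero(rfe.support_))
print("PCA explained variance ratio:", pca.explained_variance_ratio_)
```

Note that PCA builds new features as combinations of the originals, whereas SelectKBest and RFE keep a subset of the original columns.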

Dataset

The project uses the Pima Indians Diabetes Dataset, which records eight health measurements per patient along with a binary diabetes outcome. The columns are:

  • 'preg': Number of pregnancies
  • 'plas': Plasma glucose concentration (2-hour oral glucose tolerance test)
  • 'pres': Diastolic blood pressure (mm Hg)
  • 'skin': Triceps skin fold thickness (mm)
  • 'test': 2-hour serum insulin (mu U/ml)
  • 'mass': Body mass index (kg/m²)
  • 'pedi': Diabetes pedigree function
  • 'age': Age (years)
  • 'class': Binary outcome (1 = diabetes, 0 = no diabetes)
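
The column names above can be used to label the data when exploring it with pandas. The sketch below is illustrative only: the CSV filename in the comment is hypothetical, the random placeholder values are not real patient data, and `plot_histograms` is a helper name invented for this example.

```python
import numpy as np
import pandas as pd

NAMES = ["preg", "plas", "pres", "skin", "test", "mass", "pedi", "age", "class"]

# With a local, headerless CSV copy of the dataset (hypothetical filename):
#   df = pd.read_csv("pima-indians-diabetes.csv", names=NAMES)

# Placeholder data (assumption) so the sketch runs without the file.
rng = np.random.default_rng(0)
n_rows = 100
df = pd.DataFrame(rng.random((n_rows, 8)), columns=NAMES[:-1])
df["class"] = rng.integers(0, 2, size=n_rows)

# Basic shape and summary statistics.
print(df.shape)
print(df.describe())

# Class balance: number of rows per outcome.
class_counts = df.groupby("class").size()
print(class_counts)

# Skewness per column: values near 0 suggest a roughly symmetric distribution.
skew = df.skew()
print(skew)


def plot_histograms(frame):
    """Draw one histogram per column (requires matplotlib)."""
    import matplotlib.pyplot as plt

    frame.hist(figsize=(10, 8))
    plt.tight_layout()
    plt.show()
```

Calling `plot_histograms(df)` in a notebook cell would render the feature distributions.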

Getting Started

Prerequisites

To run these notebooks, you'll need:

  • Python 3.x
  • Jupyter Notebook
  • Required packages: pandas, numpy, scikit-learn (imported as sklearn), matplotlib

Running the Notebooks

  1. Clone this repository
  2. Install the required packages:
    pip install pandas numpy scikit-learn matplotlib jupyter
    
  3. Start Jupyter Notebook:
    jupyter notebook
    
  4. Open and run the notebooks in your browser
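
Before opening the notebooks, it can help to confirm that the packages from step 2 are installed. The following is a small optional check, not part of the repository; `REQUIRED` and `installed_versions` are names made up for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as passed to pip in step 2.
REQUIRED = ["pandas", "numpy", "scikit-learn", "matplotlib", "jupyter"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```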

License

This project is open source and available under the MIT License.

Contact

Harrison Miller - GitHub Profile

About

Tiny ML algorithm experiments with Sci-Py
