The goal of this project was to determine a small set of measurements that are highly predictive of a penguin's species.
The machine learning models were trained and evaluated on the Palmer Penguins data set, which was collected by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER, a member of the Long Term Ecological Research Network. The CSV data contains measurements for three penguin species: Adelie, Chinstrap, and Gentoo.
- Exploratory Data Analysis
- Modeling
  - Logistic regression and cross-validation were used for feature selection
  - Model 1: Multinomial Logistic Regression
  - Model 2: Decision Tree Classifier
  - Model 3: Support Vector Machine
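
The modeling steps above can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the project's actual code: it uses synthetic stand-in data from `make_classification` rather than the penguin CSV, the `feature_*` names are hypothetical, and the exhaustive subset search, the subset size of two, and the default hyperparameters are all assumptions.

```python
from itertools import combinations

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the penguin measurements): 3 classes, 5 features.
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names


def cv_accuracy(model, columns, folds=5):
    """Mean cross-validated accuracy of `model` using only `columns` of X."""
    return cross_val_score(model, X[:, list(columns)], y, cv=folds).mean()


def select_features(n_features=2):
    """Score every subset of size `n_features` with logistic regression
    and return the column indices of the best-scoring subset."""
    scorer = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    subsets = combinations(range(X.shape[1]), n_features)
    return max(subsets, key=lambda cols: cv_accuracy(scorer, cols))


# Feature selection: logistic regression + cross-validation.
best_cols = select_features(n_features=2)

# The three classifiers from the outline, compared on the selected features.
# LogisticRegression handles the three-class target as a multinomial model
# with its default solver.
models = {
    "Multinomial Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree Classifier": DecisionTreeClassifier(max_depth=3, random_state=0),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
}
scores = {name: cv_accuracy(model, best_cols) for name, model in models.items()}

print("Selected:", [feature_names[i] for i in best_cols])
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Because the same cross-validation folds are used here both to pick the features and to score the models, the reported accuracies are likely somewhat optimistic; holding out a separate test split before feature selection would give a fairer estimate.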