This notebook uses machine learning to predict whether a patient has heart disease. Ten models, including an ensemble, were built and compared. The best model achieved an F1 score of 0.91. The notebook is hosted on Kaggle: https://www.kaggle.com/code/jarredpriester/heart-disease-predictions-using-a-ml-ensemble-in-r
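For readers unfamiliar with the metric, here is a minimal base R sketch of how an F1 score is computed from true and predicted labels. The function name `f1_score` and the label vectors are hypothetical and used only for illustration; they are not the notebook's code or results.

```r
# F1 is the harmonic mean of precision and recall for the positive class.
f1_score <- function(actual, predicted, positive = 1) {
  tp <- sum(predicted == positive & actual == positive)  # true positives
  fp <- sum(predicted == positive & actual != positive)  # false positives
  fn <- sum(predicted != positive & actual == positive)  # false negatives
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  2 * precision * recall / (precision + recall)
}

# Made-up labels: 1 = heart disease, 0 = no heart disease.
actual    <- c(1, 1, 1, 0, 0, 1)
predicted <- c(1, 1, 0, 0, 1, 1)
f1_score(actual, predicted)
```

Because F1 balances precision and recall, it penalizes a model that misses sick patients as well as one that raises too many false alarms.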
I built this project for three reasons. First, it is the final paper for my Harvard Data Science Professional Certificate. Second, practice makes perfect! Third, I wanted to learn more about heart disease and how to prevent it, if possible.
This project let me bring together everything I learned in my Harvard classes in a single data science project. I gained more experience with R and the caret library by training 10 models. I learned that the type of chest pain, the number of major vessels colored by fluoroscopy, and the type of defect were the most important features for predicting heart disease.
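One common way to combine several classifiers into an ensemble is a majority vote. The base R sketch below assumes that approach purely for illustration; the notebook may combine its models differently. The function `majority_vote`, the model names, and the predictions are all hypothetical.

```r
# Majority vote: each row is a patient, each column one model's 0/1 prediction.
# A patient is labeled 1 when more than half of the models predict 1.
# With an even number of models, a tie falls back to 0.
majority_vote <- function(pred_matrix) {
  as.integer(rowMeans(pred_matrix) > 0.5)
}

# Made-up predictions from three hypothetical models for four patients.
preds <- cbind(
  glm = c(1, 0, 1, 0),
  knn = c(1, 1, 0, 0),
  rf  = c(0, 1, 1, 0)
)
majority_vote(preds)
```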
The analysis uses the heart disease dataset from the University of California, Irvine (UCI) Machine Learning Repository. The dataset consists of 14 attributes (including the target) and 303 observations.
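A minimal sketch of reading a file in this layout with base R follows. It assumes the processed Cleveland file from the UCI repository, which has no header row and marks missing values with `?`; the notebook may load its data differently. The column names follow the UCI documentation. The helper `load_heart`, the derived `disease` column, and the two sample rows are illustrative only.

```r
# The 14 attribute names from the UCI documentation; `num` is the target.
heart_cols <- c("age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
                "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num")

# Read a headerless CSV in the UCI layout and add a binary target.
load_heart <- function(path) {
  heart <- read.csv(path, header = FALSE, col.names = heart_cols,
                    na.strings = "?")
  # Assumption: any value of `num` above 0 is treated as heart disease present.
  heart$disease <- as.integer(heart$num > 0)
  heart
}

# Two sample rows in the same layout, written to a temporary file
# so that the sketch runs without a download.
sample_file <- tempfile(fileext = ".data")
writeLines(c("63,1,1,145,233,1,2,150,0,2.3,3,0,6,0",
             "67,1,4,160,286,0,2,108,1,1.5,2,?,3,2"), sample_file)
heart <- load_heart(sample_file)
str(heart)
```

To use the real data, pass the path of the downloaded file to `load_heart()` instead of `sample_file`.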
Harvard Final Paper
Heart_Disease_Prediction_Project.R - R script
Heart_Disease_Prediction_Project.pdf - PDF
Kaggle R Notebook
Heart_Disease_Kaggle.Rmd - R markdown
Heart_Disease_Kaggle.pdf - PDF