Heart Disease Prediction and Analysis

Overview

This repository contains a heart disease prediction project. The data comes from heart patients and includes health metrics such as age, blood pressure, and heart rate. The objective is a predictive model that accurately identifies individuals at risk of heart disease, with an emphasis on high recall so that as few heart disease cases as possible are missed.

Problem

This project works with a dataset of health metrics from heart patients, including age, blood pressure, and heart rate. The goal is to develop a predictive model that accurately identifies individuals with heart disease. Because missing a positive diagnosis has grave implications, the primary emphasis is on identifying as many actual patients as possible, which makes recall for the positive class the key metric.
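To make the metric concrete: recall for the positive class is TP / (TP + FN), the share of actual heart disease cases that the model flags. The sketch below computes it in plain Python. The function name `positive_recall` and the labels are illustrative assumptions and are not taken from the notebook.

```python
def positive_recall(y_true, y_pred, positive=1):
    """Recall for the positive class: TP / (TP + FN)."""
    # True positives: actual positives that the model also predicted positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    # False negatives: actual positives that the model missed.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive cases")
    return tp / (tp + fn)


# Made-up labels for illustration only (1 = disease, 0 = no disease).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]

# 3 of the 4 actual positives are caught; false positives do not affect recall.
print(positive_recall(y_true, y_pred))  # 0.75
```

Note that the two false positives leave recall unchanged. They lower precision instead, which is why precision and F1-score are still worth reporting alongside recall.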

Objectives

The objectives of the project are as follows:

Data Understanding: Familiarize ourselves with the dataset and its features.
Exploratory Data Analysis (EDA): Uncover patterns, trends, and relationships between variables.
- Univariate Analysis
- Bivariate Analysis
Data Preprocessing: Prepare the data for future machine learning tasks.
- Remove irrelevant features
- Address missing values
- Treat outliers
- Encode categorical variables
- Transform skewed features to achieve normal-like distributions
Model Building: Develop and refine the prediction models.
- Establish pipelines for models that require scaling
- Implement and tune classification models including KNN, SVM, Decision Tree, and Random Forest
- Emphasize achieving high recall for class 1, ensuring comprehensive identification of heart patients
Evaluate and Compare Model Performance: Use precision, recall, and F1-score to compare the models.
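
The model-building and evaluation objectives can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: it uses synthetic data from `make_classification` in place of `heart.csv`, tunes only KNN, and the hyperparameter grid is an arbitrary example. The scaler sits inside the pipeline so that it is refit on each cross-validation training fold, and `scoring="recall"` makes the search select hyperparameters by recall for class 1.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 13 features, one per predictor in the dataset.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Stratify so both splits keep the same class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# KNN is distance-based, so scaling is part of the pipeline.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Example grid (assumption); parameter names are prefixed with the step name.
param_grid = {
    "knn__n_neighbors": [3, 5, 7, 9],
    "knn__weights": ["uniform", "distance"],
}

# scoring="recall" optimizes recall for the positive class (label 1).
search = GridSearchCV(pipeline, param_grid, scoring="recall", cv=5)
search.fit(X_train, y_train)

# Evaluate the refit best model on the held-out test set.
y_pred = search.predict(X_test)
test_recall = recall_score(y_test, y_pred, pos_label=1)

print("Best parameters:", search.best_params_)
print("Test recall (class 1):", test_recall)
print(classification_report(y_test, y_pred))
```

The same pattern applies to SVM by swapping the final pipeline step. Decision Tree and Random Forest do not depend on feature scale, so they can be tuned without the scaler.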

Dataset

The dataset comprises various metrics related to heart health. The features of the dataset are described in the table below:

Variable Name	Description
age	Age of the patient in years
sex	Sex of the patient (1 = male, 0 = female)
cp	Chest pain type (0 = typical angina, 1 = atypical angina, 2 = non-anginal pain, 3 = asymptomatic)
trestbps	Resting blood pressure in mm Hg
chol	Serum cholesterol in mg/dl
fbs	Whether fasting blood sugar exceeds 120 mg/dl (1 = true, 0 = false)
restecg	Resting electrocardiographic results (0 = normal, 1 = ST-T wave abnormality, 2 = probable or definite left ventricular hypertrophy)
thalach	Maximum heart rate achieved during a stress test
exang	Exercise-induced angina (1 = yes, 0 = no)
oldpeak	ST depression induced by exercise relative to rest
slope	Slope of the peak exercise ST segment (0 = upsloping, 1 = flat, 2 = downsloping)
ca	Number of major vessels (0-4) colored by fluoroscopy
thal	Thallium stress test result (0 = normal, 1 = fixed defect, 2 = reversible defect, 3 = not described)
target	Heart disease status (0 = no disease, 1 = presence of disease)

You can find the dataset here.
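
The codes listed in the table can be checked against the data before modeling. The sketch below uses only the standard library and assumes that the CSV has a header row whose column names match the variable names in the table. The helper `find_invalid_codes` and the two sample rows are hypothetical and made up for illustration; they are not rows from `heart.csv`.

```python
import csv
import io

# Allowed codes for the categorical columns, as listed in the feature table.
ALLOWED = {
    "sex": {0, 1},
    "cp": {0, 1, 2, 3},
    "fbs": {0, 1},
    "restecg": {0, 1, 2},
    "exang": {0, 1},
    "slope": {0, 1, 2},
    "ca": {0, 1, 2, 3, 4},
    "thal": {0, 1, 2, 3},
    "target": {0, 1},
}


def find_invalid_codes(csv_file):
    """Return (row_number, column, value) for every code outside ALLOWED."""
    problems = []
    # Row numbers count data rows, starting at 1 after the header.
    for row_number, row in enumerate(csv.DictReader(csv_file), start=1):
        for column, allowed in ALLOWED.items():
            value = int(row[column])
            if value not in allowed:
                problems.append((row_number, column, value))
    return problems


# Two made-up rows; the second has an invalid chest pain code (cp = 7).
sample = io.StringIO(
    "age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target\n"
    "54,1,2,130,240,0,1,150,0,1.2,1,0,2,1\n"
    "61,0,7,140,260,1,0,120,1,2.0,2,1,3,0\n"
)
print(find_invalid_codes(sample))  # [(2, 'cp', 7)]
```

To run the check on the real file, pass an open file handle instead, for example `with open("heart.csv", newline="") as f: print(find_invalid_codes(f))`.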

File Descriptions

Heart Disease Prediction.ipynb: Jupyter notebook containing all the data exploration, visualization, modeling, and evaluation code.
heart.csv: CSV file containing the heart disease data.
README.md: This file, providing an overview of the project.

How to Run

Clone this repository.
Open the Heart Disease Prediction.ipynb notebook in Jupyter.
Run all cells in the notebook.

Additional Resources

For those interested in exploring this notebook in a Kaggle environment, you can access it here.

FarzadNekouee/Heart_Disease_Prediction
