Heart-Disease-Prediction-Using-PCA 🫀

Description:

Utilizing Principal Component Analysis (PCA) for insightful feature reduction and predictive modeling, this GitHub repository offers a comprehensive approach to forecasting heart disease risks. Explore detailed data analysis, PCA implementation, and machine learning algorithms to predict and understand factors contributing to heart health.

Medical Phenomenon 🩺

Cardiovascular diseases (CVDs) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year. CVDs are a group of disorders of the heart and blood vessels and include coronary heart disease, cerebrovascular disease, rheumatic heart disease and other conditions. Four out of 5CVD deaths are due to heart attacks and strokes, and one third of these deaths occur prematurely in people under 70 years of age. Most cardiovascular diseases can be prevented by addressing behavioural risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity and harmful use of alcohol using population-wide strategies.

Individuals at risk of CVD may demonstrate raised blood pressure, glucose, and lipids as well as overweight and obesity. These can all be easily measured in primary care facilities. Identifying those at highest risk of CVDs and ensuring they receive appropriate treatment can prevent premature deaths. Access to essential noncommunicable disease medicines and basic health technologies in all primary health care facilities is essential to ensure that those in need receive treatment and counselling.

Dataset

The dataset contains medical records of 304 patients who had heart failure, collected during their follow-up period, where each patient profile has 12 clinical features. https://github.com/PraveenHurakadli/Heart-Disease-Prediction-Using-PCA/blob/main/heart.csv

Attributes Information:

Attribute	Description
Age	Age of a patient [years]
Sex	Gender of the patient [M: Male, F: Female]
ChestPain	Chest pain type [TA: Typical Angina, ATA: Atypical Angina, NAP: Non-Anginal Pain, ASY: Asymptomatic]
RestingBP	Blood pressure in Hg (Normal blood pressure - 120/80 Hg)
Cholesterol	Serum cholestrol level in blood (Normal cholesterol level below for adults 200mg/dL)
FastingBS	Fasting Blood Sugar (Normal less than 100mg/dL for non diabetes for diabetes 100-125mg/dL)
RestingECG	Resting electrocardiogram results [Normal: Normal, ST: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV), LVH: showing probable or definite left ventricular hypertrophy by Estes' criteria]
MaxHR	Maximum heart rate achieved [Numeric value between 60 and 202]
ExerciseAngina	Exercise-induced angina [Y: Yes, N: No]
Oldpeak	oldpeak = ST [Numeric value measured in depression]
ST_Slope	The slope of the peak exercise ST segment [Up: upsloping, Flat: flat, Down: downsloping]
HeartDisease	output class [1: heart disease, 0: Normal]

Procedural Overview

Here is the overview of the procedural steps to perform Principal Component Analysis (PCA) using machine learning algorithms for a heart disease prediction project.

Steps:

1. Data Collection and Preprocessing:

Gather a dataset containing relevant features related to heart health (e.g., age, blood pressure, cholesterol levels, etc.). Handle missing values, encode categorical variables, and normalize/standardize the data.

2. Exploratory Data Analysis (EDA):

Perform descriptive statistics, visualizations, and correlation analysis to understand the dataset. Assess feature importance and relationships to gain insights.

3. Feature Selection and PCA:

Identify features relevant for predicting heart disease. Apply PCA to reduce the dimensionality of the dataset while retaining important information. Determine the number of principal components to keep (using variance explained or scree plot).

4. Split Data for Training and Testing:

Divide the dataset into training and testing sets (e.g., 70-30 or 80-20 split).

5. Model Selection and Training:

Choose appropriate machine learning algorithms (e.g., Logistic Regression, Random Forest, SVM) for classification. Fit the model on the training data.

6. Model Evaluation:

Evaluate the model's performance using the testing data (accuracy, precision, recall, F1-score, ROC curve, etc.). Use cross-validation to assess model robustness.

7. Tuning and Optimization:

Fine-tune hyperparameters of the models to improve performance (e.g., GridSearchCV or RandomizedSearchCV).

8. Prediction and Interpretation:

Make predictions on new/unseen data using the trained model. Interpret the results and assess the factors contributing to heart disease prediction.

Performance Results

KNN model gives an accuracy of : 87%

Random forest gives an accuracy of : 86%

Suport Vector Classifier gives an accuracy of : 86%

Gradient Boosting Classifier gives an accuracy of: 82%

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
Heart Disease Predition.ipynb		Heart Disease Predition.ipynb
PCA.pdf		PCA.pdf
Problem Statement.ipynb		Problem Statement.ipynb
README.md		README.md
heart.csv		heart.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Heart-Disease-Prediction-Using-PCA 🫀

Medical Phenomenon 🩺

Dataset

Attributes Information:

Procedural Overview

Steps:

1. Data Collection and Preprocessing:

2. Exploratory Data Analysis (EDA):

3. Feature Selection and PCA:

4. Split Data for Training and Testing:

5. Model Selection and Training:

6. Model Evaluation:

7. Tuning and Optimization:

8. Prediction and Interpretation:

Performance Results

About

Releases

Packages

Languages

PraveenHurakadli/Heart-Disease-Prediction-Using-PCA

Folders and files

Latest commit

History

Repository files navigation

Heart-Disease-Prediction-Using-PCA 🫀

Medical Phenomenon 🩺

Dataset

Attributes Information:

Procedural Overview

Steps:

1. Data Collection and Preprocessing:

2. Exploratory Data Analysis (EDA):

3. Feature Selection and PCA:

4. Split Data for Training and Testing:

5. Model Selection and Training:

6. Model Evaluation:

7. Tuning and Optimization:

8. Prediction and Interpretation:

Performance Results

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages