
We'll make a robust classification model which will classify whether or not a person will have cardiovascular risk within 10 years.


SirajShaikh96/Cardiovascular-Risk-Prediction


Cardiovascular-Risk-Prediction

heart

📋 Project Summary

Cardiovascular risk is increasing year by year. From the given data, we have to predict whether or not a person will have cardiovascular risk within 10 years.

We will do some EDA to visualize the dataset and find insights: check for missing values, examine the distribution of the features, and check whether the target feature is balanced. We will then cap outliers and balance the target feature with SMOTE.
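The outlier-capping step can be sketched as follows. This is a minimal, dependency-free illustration that assumes the common IQR rule (clip to Q1 - 1.5*IQR and Q3 + 1.5*IQR); the README does not say which capping rule the project uses, and the function name `cap_outliers_iqr` and the sample readings are made up for the example.

```python
import statistics

def cap_outliers_iqr(values, k=1.5):
    """Clip each value to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    # "inclusive" treats the data as the full population of interest.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]

# Made-up readings with one extreme value (not taken from the project's dataset).
readings = [70, 75, 80, 85, 90, 95, 100, 400]
capped = cap_outliers_iqr(readings)
print(capped)
```

Capping keeps every row in the dataset and only pulls extreme values back to the fence, which matters when the positive class is already scarce.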

After EDA, feature engineering, and feature selection, we'll split the dataset into training and testing sets. Then we'll train multiple classification models, tune their hyperparameters, and evaluate each model on classification metrics.
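As a rough sketch of the splitting step, the snippet below divides row indices into training and testing sets while keeping both classes represented in the test set. Stratifying by class, the 20% test fraction, and the helper name `stratified_split` are assumptions made for illustration; the README only says the data is split into training and testing sets.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) with each class present in the test set."""
    rng = random.Random(seed)
    # Group row indices by their class label.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train_indices, test_indices = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take at least one row per class so rare classes are never left out.
        n_test = max(1, round(len(indices) * test_fraction))
        test_indices.extend(indices[:n_test])
        train_indices.extend(indices[n_test:])
    return sorted(train_indices), sorted(test_indices)

# Made-up imbalanced labels: eight negatives (0) and two positives (1).
labels = [0] * 8 + [1] * 2
train_idx, test_idx = stratified_split(labels)
print(train_idx, test_idx)
```

With an imbalanced target, a purely random split can leave very few positive cases in the test set, which makes the class-1 metrics unreliable.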

Finally, we'll choose the evaluation metric and pick the best model on the basis of that metric.

📋 Problem Statement

Cardiovascular risk is increasing, so the number of patients will also increase, but the number of doctors available is not enough to scrutinize every report. This is a huge problem.

📋 Business Objective

We'll make a robust classification model which will classify whether or not a person will have cardiovascular risk within 10 years.

Users just have to fill in the required information, from which the model will predict whether or not a person will have cardiovascular risk within 10 years. This will increase the speed of identifying at-risk people and reduce the cost, which helps with the shortage of doctors and identifies at-risk people at the right time and in a cost-effective way.

📋 Visualization

1) Distribution of the target feature's values

target

2) Smokers & Non-Smokers in Male & Female

sns

3) Number of cigarettes per day

nc

4) Distribution of 10-year risk among smokers and non-smokers

10y

5) Relation between age and cardiovascular risk

agerisk

6) Distribution of glucose level

disf

After log transformation:

afterlog

Every feature is right-skewed except age, so we apply a log transformation to every feature except age.
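The effect of the log transformation on a right-skewed feature can be sketched like this. The use of `math.log1p` (that is, log(1 + x), which stays defined when a value is 0, such as zero cigarettes per day) is an assumption, as are the helper names and the sample values; the README only states that a log transformation is applied.

```python
import math

def skewness(values):
    """Population skewness: third central moment divided by variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def log_transform(values):
    """Apply log(1 + x) to each value; requires x > -1."""
    return [math.log1p(v) for v in values]

# Made-up right-skewed values (not taken from the project's dataset).
values = [0, 1, 2, 3, 5, 8, 20, 60]
transformed = log_transform(values)
print(round(skewness(values), 2), round(skewness(transformed), 2))
```

The transformation compresses the long right tail, so the skewness of the transformed values is closer to zero than that of the raw values.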

7) Correlation Chart

cor

📋 Model Performance

People who will have cardiovascular risk within 10 years are denoted by 1. In this project, detecting class 1 is very important, so we'll focus only on the metrics for class 1.

mp

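In this project we take recall as the final metric, because when classifying whether or not a person will have cardiovascular risk within 10 years, missing an at-risk person (a false negative) is the most costly error, and recall measures the share of at-risk people the model actually catches.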
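For reference, recall for class 1 is TP / (TP + FN): of all the people who really are at risk, the fraction the model flags. The sketch below computes it by hand; the function name `recall_for_positive` and the label lists are illustrative, not results from this project.

```python
def recall_for_positive(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    actual_pos = true_pos + false_neg
    # With no actual positives, recall is undefined; return 0.0 by convention here.
    return true_pos / actual_pos if actual_pos else 0.0

# Made-up labels: four people are truly at risk, and the model finds three of them.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
recall = recall_for_positive(y_true, y_pred)
print(recall)  # 0.75
```

Note that the false positive in the example (a healthy person flagged as at risk) does not lower recall; it would lower precision instead.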

KNN is the final model we've selected for this project because its recall value is the highest.
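To show how KNN makes a prediction, here is a minimal sketch: it finds the k training points closest to a query and takes a majority vote of their labels. This is not the project's tuned model; the function name `knn_predict`, the choice of k = 3, Euclidean distance, and the two-feature toy points are all assumptions for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Made-up (age, systolic blood pressure) points; 1 means at risk.
train_X = [(40, 120), (42, 125), (45, 130), (60, 150), (65, 160), (70, 165)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (63, 155)))  # 1
print(knn_predict(train_X, train_y, (41, 122)))  # 0
```

Because KNN relies on distances, features on very different scales should be put on a comparable scale before fitting, otherwise the largest-valued feature dominates the vote.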

📋 Conclusion

In this project we found that age is the biggest factor in developing cardiovascular risk, and all other factors may increase the cardiovascular risk at a given age. In this dataset, there are more men who smoke than men who do not, and fewer women who smoke than women who do not. Overall, according to the data, smoking is more common among men.
