Cardiovascular risk is increasing year by year. From the given data, we have to predict whether or not a person will be at cardiovascular risk within 10 years.
We will do some EDA to visualize the dataset and draw insights from it: check for missing values, look at the distribution of the features, and check whether the target feature is balanced. We will then cap outliers and balance the target feature with SMOTE.
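The outlier capping and balance check can be sketched as below. This is only a minimal illustration: the write-up does not say which capping rule was used, so the 1.5 x IQR rule, the helper names `cap_outliers_iqr` and `class_share`, and the toy values are all assumptions rather than the project's actual code or data.

```python
import numpy as np

def cap_outliers_iqr(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] (an assumed capping rule)."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return np.clip(values, q1 - k * iqr, q3 + k * iqr)

def class_share(labels):
    """Return the fraction of samples that fall in each class."""
    classes, counts = np.unique(labels, return_counts=True)
    return {int(c): float(n) / len(labels) for c, n in zip(classes, counts)}

# Hypothetical toy values, not taken from the project dataset.
cholesterol = [180, 195, 210, 220, 230, 245, 600]
capped = cap_outliers_iqr(cholesterol)  # the extreme value 600 is pulled down to the upper fence

target = [0, 0, 0, 0, 0, 1, 0]
shares = class_share(target)  # a small share for class 1 signals an imbalanced target
```

Capping keeps every row in the dataset and only limits how far an extreme value can pull a model, which is why it is often preferred over dropping rows when the dataset is small.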
After EDA, feature engineering, and feature selection, we will split the dataset into training and testing sets. Then we will train multiple classification models, tune their hyperparameters, and evaluate each model on classification metrics.
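A rough sketch of the split, training, and tuning steps with scikit-learn follows. The synthetic data from `make_classification`, the two candidate models, the parameter grids, and the 80/20 stratified split are all assumptions chosen for illustration. They are not the project's real dataset or settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (roughly 15% positives).
X, y = make_classification(
    n_samples=600, n_features=8, weights=[0.85, 0.15], random_state=0
)

# Stratify so both sets keep the same share of class 1.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical candidates and grids; scaling sits inside the pipeline so it
# is fitted on the training folds only.
candidates = {
    "logreg": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.1, 1.0, 10.0]},
    ),
    "knn": (
        make_pipeline(StandardScaler(), KNeighborsClassifier()),
        {"kneighborsclassifier__n_neighbors": [3, 5, 11]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # Tune each model with 5-fold cross-validation, scoring on recall.
    search = GridSearchCV(model, grid, scoring="recall", cv=5)
    search.fit(X_train, y_train)
    y_pred = search.predict(X_test)
    results[name] = recall_score(y_test, y_pred, pos_label=1)

# Model with the highest test recall on class 1.
best_name = max(results, key=results.get)
```

Keeping the test set out of the tuning loop means the reported recall reflects data that none of the models saw during the search.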
Finally, we will settle on one metric and pick the best model on the basis of that metric.
As cardiovascular risk increases, the number of patients will also increase, but the number of doctors available is not enough to scrutinize every report. This is a huge problem.
We will build a robust classification model that classifies whether or not a person will be at cardiovascular risk within 10 years.
A person just has to fill in the required information, and the model will predict whether or not that person will be at cardiovascular risk within 10 years. This will speed up identifying at-risk people and reduce the cost, which helps with the shortage of doctors and with identifying such people at the right time in a cost-effective way.
Every feature is right-skewed except age, so we apply a log transformation to every feature except age.
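The log transformation can be sketched as follows. The column names `glucose` and `sysBP` and the randomly generated values are hypothetical stand-ins, and `np.log1p` (the log of 1 + x) is an assumed choice that stays defined when a feature contains zeros. The project may have used a plain logarithm instead.

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed features plus an age column that is left as-is.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(30, 70, size=500),
    "glucose": rng.lognormal(mean=4.4, sigma=0.4, size=500),
    "sysBP": rng.lognormal(mean=4.9, sigma=0.3, size=500),
})

# Transform every feature except age.
skewed_cols = [c for c in df.columns if c != "age"]

skew_before = df[skewed_cols].skew()
df[skewed_cols] = np.log1p(df[skewed_cols])
skew_after = df[skewed_cols].skew()  # closer to 0 means more symmetric
```

Comparing `skew_before` with `skew_after` is a quick way to confirm that the transformation actually made each feature more symmetric.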
A person who will be at cardiovascular risk within 10 years is denoted by 1. In this project, detecting class 1 is very important, so we will focus only on the metrics for class 1.
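In this project we take recall as the final metric. When classifying whether or not a person will be at cardiovascular risk within 10 years, missing an at-risk person (a false negative) is the costly mistake, and recall measures how many of the at-risk people the model actually catches.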
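To make the metric concrete, recall for class 1 is TP / (TP + FN): the true positives divided by all actual positives. The sketch below computes it by hand on made-up labels. The helper name `recall_for_positive` and the label lists are illustrative only.

```python
def recall_for_positive(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        return 0.0  # no actual positives, so recall is treated as 0 here
    return tp / (tp + fn)

# Made-up labels: 4 at-risk people (1) and 6 not at risk (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

recall = recall_for_positive(y_true, y_pred)  # 3 of the 4 at-risk people caught: 0.75
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)  # 8 of 10 correct: 0.8
```

In this toy case accuracy is higher than recall, which shows how accuracy can look fine on an imbalanced target while at-risk people are still being missed.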
KNN is the final model we selected for this project because its recall is the highest among the models we trained.
In this project we found that age is the biggest factor in developing cardiovascular risk, and all other factors may increase the cardiovascular risk at that age. In this dataset, there are more men who smoke than men who do not, and fewer women who smoke than women who do not. Overall, men smoke more than women.