A PyTorch model that predicts the risk of heart disease from a combination of symptoms, lifestyle factors, and medical history, using a dataset of 70,000+ samples. The model achieves approximately 99.27% accuracy on test data.
This project uses PyTorch to build a neural network classifier for heart disease risk prediction. The model analyzes several medical predictor variables to determine if a patient is at risk of heart disease.
The dataset is read from the heart_disease_risk_dataset_earlymed.csv file.
It contains 18 medical predictors of heart disease:
- Chest Pain: Presence of chest pain (Yes/No)
- Shortness of Breath: Difficulty breathing (Yes/No)
- Fatigue: Feeling of tiredness (Yes/No)
- Palpitations: Irregular heartbeat sensations (Yes/No)
- Dizziness: Feeling lightheaded (Yes/No)
- Swelling: Edema in extremities (Yes/No)
- Pain Arms Jaw Back: Pain radiating to arms/jaw/back (Yes/No)
- Cold Sweats Nausea: Presence of cold sweats or nausea (Yes/No)
- High BP: High blood pressure diagnosis (Yes/No)
- High Cholesterol: High cholesterol diagnosis (Yes/No)
- Diabetes: Presence of diabetes (Yes/No)
- Smoking: Current smoking status (Yes/No)
- Obesity: Obesity status (Yes/No)
- Sedentary Lifestyle: Physical inactivity (Yes/No)
- Family History: Family history of heart disease (Yes/No)
- Chronic Stress: Ongoing stress condition (Yes/No)
- Gender: Patient's gender (Male/Female)
- Age: Age of patient in years
Output variable:
- Risk: Risk of Heart Disease (low/high)
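As a rough illustration of how categorical values like these could be turned into numeric inputs, here is a minimal sketch using pandas. The column names (`Chest_Pain`, `Smoking`, and so on), the helper `encode`, and the toy rows are all hypothetical; the actual CSV may use different names or may already store the values as numbers.

```python
import pandas as pd

# Toy rows invented for illustration only. The real CSV's column names
# and value encodings are assumptions and may differ.
raw = pd.DataFrame({
    "Chest_Pain": ["Yes", "No", "No"],
    "Smoking": ["No", "Yes", "No"],
    "Gender": ["Male", "Female", "Female"],
    "Age": [63, 41, 55],
    "Risk": ["high", "low", "low"],
})


def encode(df):
    """Map Yes/No, Male/Female and low/high to integers (hypothetical helper)."""
    out = df.copy()
    yes_no = {"Yes": 1, "No": 0}
    for col in ("Chest_Pain", "Smoking"):
        out[col] = out[col].map(yes_no)
    out["Gender"] = out["Gender"].map({"Male": 1, "Female": 0})
    out["Risk"] = out["Risk"].map({"low": 0, "high": 1})
    return out


encoded = encode(raw)
X = encoded.drop(columns="Risk")  # predictor columns
y = encoded["Risk"]               # target column (0 = low, 1 = high)
```

With all 18 predictors encoded this way, `X` would have 18 numeric columns, matching the model's input size.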
- Python 3.8+
- PyTorch
- pandas
- scikit-learn
- matplotlib
- Clone the repository
- Create a virtual environment
py -m venv .venv
and activate it:
.venv/Scripts/activate
- Install dependencies:
pip install -r requirements.txt
- Run the model:
python main.py
- Input layer: 18 features
- Hidden layer 1: 64 neurons with ReLU activation
- Hidden layer 2: 28 neurons with ReLU activation
- Output layer: 2 neurons, one per class (binary classification: low/high risk)
- Optimization: Adam optimizer with learning rate 0.005
- Loss function: Cross Entropy Loss
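The architecture listed above could be written in PyTorch roughly as follows. This is a sketch based only on the layer sizes and settings in this README, not the code in main.py; the class name `HeartDiseaseClassifier` and its argument names are hypothetical.

```python
import torch
import torch.nn as nn


class HeartDiseaseClassifier(nn.Module):  # hypothetical name
    def __init__(self, in_features=18, h1=64, h2=28, out_features=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, h1),   # input -> hidden layer 1
            nn.ReLU(),
            nn.Linear(h1, h2),            # hidden layer 1 -> hidden layer 2
            nn.ReLU(),
            nn.Linear(h2, out_features),  # hidden layer 2 -> class logits
        )

    def forward(self, x):
        # Returns raw logits; CrossEntropyLoss applies log-softmax itself.
        return self.net(x)


model = HeartDiseaseClassifier()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)

# Forward pass on a random batch of 4 patients with 18 features each.
batch = torch.randn(4, 18)
logits = model(batch)
```

The output layer has no softmax because `nn.CrossEntropyLoss` expects unnormalized logits.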
The model is trained for 1000 epochs. Training progress is visualized in a loss plot that is generated automatically and saved as 'loss_plot.png'. The model achieves approximately 99.27% accuracy on the test set, and the results are reproducible because a fixed random seed (392) is used.
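A training run of this kind might look like the sketch below. It uses synthetic data in place of the real dataset and full-batch updates, both of which are assumptions; how main.py batches data and seeds other libraries is not stated here.

```python
import matplotlib
matplotlib.use("Agg")  # render to file without needing a display
import matplotlib.pyplot as plt
import torch
import torch.nn as nn

torch.manual_seed(392)  # fixed seed so weight init and data are repeatable

# Synthetic stand-in for the real training split (assumption).
X_train = torch.randn(200, 18)
y_train = (X_train.sum(dim=1) > 0).long()

model = nn.Sequential(
    nn.Linear(18, 64), nn.ReLU(),
    nn.Linear(64, 28), nn.ReLU(),
    nn.Linear(28, 2),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)

losses = []
for epoch in range(1000):
    optimizer.zero_grad()                      # clear old gradients
    loss = criterion(model(X_train), y_train)  # full-batch forward pass
    loss.backward()                            # backpropagate
    optimizer.step()                           # update weights
    losses.append(loss.item())

# Save the loss curve under the file name mentioned in this README.
plt.plot(losses)
plt.xlabel("Epoch")
plt.ylabel("Cross entropy loss")
plt.savefig("loss_plot.png")
plt.close()
```

Seeding only PyTorch is enough for this sketch; a pipeline that also shuffles or splits data with NumPy or scikit-learn would need those seeded as well.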
The trained model is saved to 'heart_disease_classifier_model.pth' for later use.
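Saving and reloading could be done as sketched here. This assumes the file holds a `state_dict` (the weights only) rather than a pickled model object, which this README does not specify; `build_model` is a hypothetical helper that recreates the architecture.

```python
import torch
import torch.nn as nn


def build_model():
    """Recreate the 18-64-28-2 network (hypothetical helper)."""
    return nn.Sequential(
        nn.Linear(18, 64), nn.ReLU(),
        nn.Linear(64, 28), nn.ReLU(),
        nn.Linear(28, 2),
    )


model = build_model()

# Save only the learned parameters (assumed format of the .pth file).
torch.save(model.state_dict(), "heart_disease_classifier_model.pth")

# Later: rebuild the same architecture and load the weights into it.
restored = build_model()
restored.load_state_dict(torch.load("heart_disease_classifier_model.pth"))
restored.eval()  # switch to inference mode

with torch.no_grad():
    sample = torch.zeros(1, 18)  # placeholder patient, all features 0
    predicted_class = restored(sample).argmax(dim=1).item()
```

If the file was instead written with `torch.save(model, ...)`, loading works differently and the class definition must be importable at load time.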