The aim of this project is to create a deep learning model to predict whether a patient has heart disease or not.
The model is trained on the Heart Disease Dataset obtained from Kaggle.
This project was created using Spyder and Visual Studio Code as the IDEs for the Python scripts and Jupyter notebooks, respectively. The packages used in this project are Pandas, Scikit-learn, TensorFlow Keras and Matplotlib.
The data is first loaded and preprocessed, then separated into features and labels. After that, the data is split into train and test sets with a ratio of 80:20.
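The feature/label separation and the 80:20 split could look roughly like the sketch below. It uses a small synthetic DataFrame in place of the Kaggle CSV so that it runs on its own; the column names (including the label column `target`), the random seed, and the stratified split are assumptions for illustration, not details taken from the project code.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Kaggle CSV (assumption: the real data would be
# loaded with pd.read_csv and has a binary label column, here called "target").
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(29, 78, size=200),
    "chol": rng.integers(120, 400, size=200),
    "thalach": rng.integers(70, 200, size=200),
    "target": rng.integers(0, 2, size=200),
})

# Separate the features from the labels.
features = df.drop(columns="target")
labels = df["target"]

# 80:20 train/test split. Stratifying on the labels keeps the class balance
# similar in both sets (an assumption; the project may not stratify).
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=42, stratify=labels
)

print(X_train.shape, X_test.shape)
```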
A feedforward neural network suited to a classification problem is constructed. The structure of the model is fairly simple. The figure below shows the structure of the model:
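As a rough illustration of what a simple feedforward classifier can look like in Keras, here is a minimal sketch. The helper `build_model`, the layer sizes, the dropout rate, the single sigmoid output, and the 13 input features are all assumptions made for this example; the project's actual architecture is the one shown in the figure.

```python
from tensorflow import keras


def build_model(n_features, hidden_units=(64, 32), dropout_rate=0.3):
    """Build a small feedforward binary classifier (hypothetical layout)."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for units in hidden_units:
        # Each hidden block is a dense layer followed by dropout.
        model.add(keras.layers.Dense(units, activation="relu"))
        model.add(keras.layers.Dropout(dropout_rate))
    # One sigmoid unit outputs the probability of heart disease.
    model.add(keras.layers.Dense(1, activation="sigmoid"))
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Assumption: 13 input features.
model = build_model(n_features=13)
model.summary()
```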
The model is trained with a batch size of 16 for up to 50 epochs. Early stopping and dropout are applied during training to reduce overfitting. Training stops at epoch 40, with a training accuracy of 96% and a validation accuracy of 94%. The results of the training process are shown in the Matplotlib graph below:
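A training loop with these settings might be sketched as follows. The batch size of 16 and the 50-epoch limit come from the description above, and the log directory matches the one used by the TensorBoard command in this document. Everything else is an assumption for illustration: the synthetic data, the tiny model, the early-stopping patience of 5, monitoring `val_loss`, and using `validation_split` instead of a separate validation set.

```python
import os

import numpy as np
from tensorflow import keras

# Synthetic data standing in for the preprocessed training set (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 13)).astype("float32")
y = (X[:, 0] + X[:, 1] > 0).astype("float32")

# Small illustrative model with dropout (not the project's architecture).
model = keras.Sequential([
    keras.Input(shape=(13,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

log_dir = os.path.join("images", "tb_logs", "heart_disease")
callbacks = [
    # Stop once validation loss has not improved for 5 epochs (assumed patience).
    keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=5, restore_best_weights=True
    ),
    # Write logs that TensorBoard can read from log_dir.
    keras.callbacks.TensorBoard(log_dir=log_dir),
]

history = model.fit(
    X, y,
    validation_split=0.2,
    batch_size=16,
    epochs=50,
    callbacks=callbacks,
    verbose=0,
)

epochs_run = len(history.history["loss"])
print(f"Training stopped after {epochs_run} epochs")
```

The `history.history` dictionary holds the per-epoch loss and accuracy values, which is what a Matplotlib training curve would typically be plotted from.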
The model also records TensorBoard logs, which can be used to observe whether it overfits or underfits.
To open an embedded TensorBoard viewer inside a notebook, load the extension with %load_ext tensorboard, then copy the following into a code cell:
%tensorboard --logdir images/tb_logs/heart_disease
The results recorded in the TensorBoard logs are shown in the images below:
Upon evaluating the model with the test data, the model obtains the test results shown in the figure below:
Since both the train and test results have an accuracy above 90%, we can say that the model is accurate.
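To make the test accuracy concrete, the sketch below shows one way predicted probabilities from a sigmoid output can be turned into class predictions and scored with Scikit-learn. The label and probability values are made up purely for illustration and are not results from this project; the 0.5 threshold is also an assumption.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true labels and predicted probabilities for eight test patients.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.91, 0.12, 0.78, 0.40, 0.08, 0.65, 0.88, 0.30])

# Threshold the probabilities at 0.5 to get class predictions.
y_pred = (y_prob >= 0.5).astype(int)

accuracy = accuracy_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted

print(f"accuracy = {accuracy:.2f}")  # 6 of 8 correct, so 0.75
print(cm)
```

The confusion matrix is worth checking alongside accuracy, since it separates missed cases of heart disease (false negatives) from false alarms (false positives).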