Enhancing Clinical Guidelines through the Integration of SKLearn Decision Trees for the Diagnosis of Diabetes Mellitus Type 2 🩺

Project Artificial Intelligence in Health @ Vrije Universiteit Amsterdam 🎓

Joint work by Kirandeep Gill and Mark Bartos for the Year 2 course Project Artificial Intelligence in Health.

About the Research Paper 📑

Abstract This research explores using decision tree learning and clinical guidelines to optimize the diagnosis of Diabetes Mellitus Type 2 (DM-2) in adults. Two datasets from India and Germany were used, focusing on lifestyle and clinical measurements. SKLearn's Decision Tree Classifier with K-Fold Cross Validation was employed.

Results Decision Tree 1 identified regular medicine and high blood pressure as significant factors for DM-2, with a slight contribution from family history. Decision Tree 2 found glucose levels and BMI most influential. The clinical guidelines did not cover sleep and prediabetes, both of which appeared in the analysis.

Conclusion Decision trees and clinical guidelines aligned on BMI, glucose, insulin, pregnancies, and blood pressure. However, they differed on sleep, prediabetes, and overlooked symptoms like thirst and weight loss.

About the Code-Base and the Decision Trees 🧑‍💻

📝 The code-base and research paper are made available for grading purposes.

This repository uses the SKLearn library to train a Decision Tree Classifier for the diagnosis of Diabetes Mellitus Type 2 (DM-2). Here's a brief overview of the steps:

Import the necessary libraries (Pandas, NumPy, SKLearn).
Load the datasets from 'dataset1_imp.csv' and 'dataset2_imp.csv' using Pandas.
Separate numerical and categorical columns for further processing.
Apply One-Hot Encoding to the categorical columns.
Combine the numerical and encoded categorical columns to create the processed DataFrame.
Prepare for K-Fold Cross Validation with 5 and 10 folds.
Train the Decision Tree Classifier with the specified maximum depth and criterion.
Evaluate the model using accuracy as the metric for each fold.
Visualize the decision tree using Graphviz.
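
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks: it runs on a small synthetic DataFrame instead of the CSV files, the column names (`BMI`, `Glucose`, `FamilyHistory`, `Diabetic`) are hypothetical, and `max_depth=3` with `criterion="gini"` are placeholder values, since the actual settings are described in the research paper rather than here.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier, export_graphviz


def preprocess(df, target):
    """Split features into numerical and categorical columns, one-hot encode
    the categorical ones, and combine both into one processed DataFrame."""
    features = df.drop(columns=[target])
    numerical = features.select_dtypes(include="number")
    categorical = features.select_dtypes(exclude="number")
    encoded = pd.get_dummies(
        categorical, columns=list(categorical.columns), dtype=float
    )
    X = pd.concat([numerical, encoded], axis=1)
    return X, df[target]


def cross_validate(X, y, n_splits, max_depth=3, criterion="gini", seed=0):
    """Return the accuracy of a decision tree on each fold of K-fold CV."""
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in kfold.split(X):
        model = DecisionTreeClassifier(
            max_depth=max_depth, criterion=criterion, random_state=seed
        )
        model.fit(X.iloc[train_idx], y.iloc[train_idx])
        # score() reports mean accuracy on the held-out fold
        scores.append(model.score(X.iloc[test_idx], y.iloc[test_idx]))
    return scores


# Synthetic stand-in for pd.read_csv("dataset1_imp.csv"); columns are hypothetical.
rng = np.random.default_rng(0)
n = 100
demo = pd.DataFrame({
    "BMI": rng.normal(27, 4, n),
    "Glucose": rng.normal(110, 25, n),
    "FamilyHistory": rng.choice(["yes", "no"], n),
})
demo["Diabetic"] = (demo["Glucose"] > 120).astype(int)

X, y = preprocess(demo, target="Diabetic")
for k in (5, 10):
    fold_scores = cross_validate(X, y, n_splits=k)
    print(f"{k}-fold mean accuracy: {np.mean(fold_scores):.3f}")

# Fit one tree on all rows and export it as Graphviz DOT source.
final_tree = DecisionTreeClassifier(max_depth=3, criterion="gini", random_state=0)
final_tree.fit(X, y)
dot_source = export_graphviz(
    final_tree,
    out_file=None,  # return the DOT text instead of writing a file
    feature_names=list(X.columns),
    class_names=["no DM-2", "DM-2"],
    filled=True,
)
```

The resulting `dot_source` string can be rendered with the Graphviz tooling (for example through the `graphviz` Python package, if installed) to obtain a tree image.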

💡 A more detailed description of the modelling choices is available in the Research Paper.

Visualized Results 📊

Decision Tree 1, based on Dataset 1, which focuses more on lifestyle choices

Decision Tree 2, based on Dataset 2, which focuses more on clinical measurements

Acknowledgments

We would like to express our gratitude to the following sources for providing the datasets used in this project:

Neha Prerna Tigga and Dr. Shruti Garg of the Department of Computer Science and Engineering, BIT Mesra, Ranchi-835215, for making their dataset available for research and non-commercial purposes. For more information and proper citation of this dataset, please refer to the following publication: Tigga, N. P., & Garg, S. (2020). Prediction of Type 2 Diabetes using Machine Learning Classification Methods. Procedia Computer Science, 167, 706-716. DOI: https://doi.org/10.1016/j.procs.2020.03.336.
Daanouni, O., Cherradi, B., & Tmiri, A. (2019, October). Predicting diabetes diseases using mixed data and supervised machine learning algorithms. In Proceedings of the 4th International Conference on Smart City Applications (pp. 1-6).

We acknowledge the hard work and valuable contributions of the authors in collecting and preparing these datasets, which significantly contributed to the success of this project.
