Shamir-Havas/Drug-Prediction-Decision-Tree

💊 Drug Prediction – Decision Tree

A clear and interpretable baseline for predicting drug categories using patient features and a Decision Tree classifier. Designed to be interview-friendly, with emphasis on clarity, step-by-step decisions, and interpretability.


📂 Project Structure

├── Drug_Prediction_DecisionTree_polished.ipynb
└── README.md # Project documentation


⚙️ Skills & Tech

  • Python, Jupyter Notebook
  • pandas, NumPy – Data handling
  • matplotlib, seaborn – Visualization
  • scikit-learn – DecisionTreeClassifier, model evaluation
  • EDA, Preprocessing, Model interpretation

📝 Project Overview

This notebook demonstrates a complete Machine Learning workflow for predicting drug categories:

  1. Exploratory Data Analysis (EDA) – Inspect dataset distribution and patterns
  2. Preprocessing – Encode categorical features and handle data types
  3. Model Training – Decision Tree Classifier
  4. Evaluation – Accuracy, interpretability, decision paths
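The four steps above can be sketched end to end as follows. This is a minimal illustration, not the notebook's actual code: the column names (`Age`, `Sex`, `BP`, `Cholesterol`, `Na_to_K`, `Drug`), the category spellings, the helper `make_toy_data`, and the labeling rule inside it are all assumptions, and the table is synthetic rather than the real 200-sample dataset.

```python
from itertools import product

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier


def make_toy_data() -> pd.DataFrame:
    """Build a small synthetic table (hypothetical helper, NOT the real dataset)."""
    rows = []
    for age, sex, bp, chol, na_k in product(
        [20, 35, 50, 65],
        ["F", "M"],
        ["LOW", "NORMAL", "HIGH"],
        ["NORMAL", "HIGH"],
        [8.0, 12.0, 18.0, 25.0],
    ):
        # Made-up labeling rule so the tree has a pattern to learn.
        if na_k > 15:
            drug = "DrugY"
        elif bp == "HIGH":
            drug = "DrugA" if age < 50 else "DrugB"
        elif bp == "LOW" and chol == "HIGH":
            drug = "DrugC"
        else:
            drug = "DrugX"
        rows.append((age, sex, bp, chol, na_k, drug))
    columns = ["Age", "Sex", "BP", "Cholesterol", "Na_to_K", "Drug"]
    return pd.DataFrame(rows, columns=columns)


# Step 1 (EDA): look at the class distribution.
df = make_toy_data()
class_counts = df["Drug"].value_counts()
print(class_counts)

# Step 2 (preprocessing): integer-encode categorical columns, pass numeric ones through.
categorical = ["Sex", "BP", "Cholesterol"]
encoder = ColumnTransformer(
    [
        (
            "cat",
            OrdinalEncoder(
                categories=[["F", "M"], ["LOW", "NORMAL", "HIGH"], ["NORMAL", "HIGH"]]
            ),
            categorical,
        )
    ],
    remainder="passthrough",
)

# Step 3 (training): fit a decision tree on a stratified training split.
X, y = df.drop(columns="Drug"), df["Drug"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model = Pipeline([("encode", encoder), ("tree", DecisionTreeClassifier(random_state=0))])
model.fit(X_train, y_train)

# Step 4 (evaluation): accuracy on the held-out split.
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {test_accuracy:.3f}")
```

Wrapping the encoder and the tree in one `Pipeline` keeps the encoding fitted on the training split only, which is one reasonable way to avoid leaking test data into preprocessing.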

📊 Dataset

  • Features:
    • Age: Age of the patient
    • Sex: Male/Female
    • Blood Pressure: Low / Normal / High
    • Cholesterol: Normal / High
    • Na_to_K ratio: Sodium-to-Potassium ratio in the blood
  • Target: Drug type (DrugA, DrugB, DrugC, DrugX, DrugY)
  • Size: 200 samples
  • Source: UCI / educational dataset
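As a rough illustration of the schema above, the sketch below builds a three-row table and converts the categorical columns to integer codes with pandas. The column names, the upper-case category spellings, and every value in the rows are invented placeholders, not records from the dataset.

```python
import pandas as pd

# Three hypothetical rows in the shape described above (all values are invented).
df = pd.DataFrame(
    {
        "Age": [23, 47, 61],
        "Sex": ["F", "M", "M"],
        "BP": ["HIGH", "LOW", "NORMAL"],
        "Cholesterol": ["HIGH", "HIGH", "NORMAL"],
        "Na_to_K": [25.4, 13.1, 9.4],
        "Drug": ["DrugY", "DrugC", "DrugX"],
    }
)

# Ordered categories keep Low < Normal < High when converted to integer codes.
bp_levels = pd.CategoricalDtype(["LOW", "NORMAL", "HIGH"], ordered=True)
chol_levels = pd.CategoricalDtype(["NORMAL", "HIGH"], ordered=True)

encoded = df.assign(
    Sex=df["Sex"].map({"F": 0, "M": 1}),
    BP=df["BP"].astype(bp_levels).cat.codes,
    Cholesterol=df["Cholesterol"].astype(chol_levels).cat.codes,
)

print(encoded.dtypes)
print(encoded)
```

Declaring the level order explicitly avoids the alphabetical default, which would otherwise place `HIGH` before `LOW`.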

▶️ How to Run

  1. Clone or download this repository:
    git clone https://github.com/Shamir-Havas/Drug-Prediction-Decision-Tree.git
    cd Drug-Prediction-Decision-Tree
  2. Install dependencies:
    pip install -r requirements.txt
  3. Open the notebook in Jupyter:
    jupyter notebook Drug_Prediction_DecisionTree_polished.ipynb
  4. Run all cells:
    Kernel → Restart & Run All

📊 Results

🔹 Category Counts

[Figure: Category Counts]

🔹 Decision Tree Visualization

[Figure: Decision Tree]

🔹 Model Accuracy

[Figure: Model Accuracy]

🔍 Model Explainability

  • Decision Tree Visualization: Interpretable decision paths using plot_tree
  • Classification Report: Precision, recall, F1-score
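Both explainability outputs can be reproduced in a few lines. The sketch below uses `export_text`, the plain-text counterpart of `plot_tree`, so it runs without a display. The data is synthetic and already integer-encoded; the feature names, the encoding, and the `toy_label` rule are assumptions made for illustration only.

```python
from itertools import product

import numpy as np
from sklearn.metrics import classification_report
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic, already-encoded features (invented for illustration).
# Assumed encoding: BP 0=Low, 1=Normal, 2=High; Cholesterol 0=Normal, 1=High.
feature_names = ["Age", "BP", "Cholesterol", "Na_to_K"]
X = np.array(
    list(product([20, 35, 50, 65], [0, 1, 2], [0, 1], [8.0, 12.0, 18.0, 25.0]))
)


def toy_label(age, bp, chol, na_k):
    """Made-up rule that gives the tree a pattern to recover."""
    if na_k > 15:
        return "DrugY"
    if bp == 2:
        return "DrugA" if age < 50 else "DrugB"
    return "DrugC" if (bp == 0 and chol == 1) else "DrugX"


y = np.array([toy_label(*row) for row in X])

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Text rendering of the same split structure that plot_tree draws.
rules = export_text(tree, feature_names=feature_names)
print(rules)

# Per-class precision, recall, and F1. Computed on the training data here
# only to show the call; use a held-out split for a real estimate.
predictions = tree.predict(X)
report = classification_report(y, predictions, output_dict=True)
print(classification_report(y, predictions))
```

To get the graphical version, `sklearn.tree.plot_tree(tree, feature_names=feature_names, class_names=list(tree.classes_), filled=True)` can be called on the same fitted model inside a matplotlib figure.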

🚀 Future Improvements

  • Hyperparameter tuning with GridSearchCV / RandomizedSearchCV
  • Cross-validation (e.g., Stratified K-Fold) for robustness
  • Try ensemble methods (Random Forest, XGBoost)
  • Domain-specific validation & feature engineering
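A possible starting point for the first three items is sketched below. It is not part of the current notebook. `make_classification` stands in for the encoded drug table (200 samples, 5 features, 5 classes), the parameter grid is an arbitrary example, and only Random Forest is shown for the ensemble comparison because it ships with scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Stand-in data with the same shape as the drug table (NOT the real dataset).
X, y = make_classification(
    n_samples=200,
    n_features=5,
    n_informative=4,
    n_redundant=0,
    n_classes=5,
    n_clusters_per_class=1,
    random_state=0,
)

# Stratified folds keep the class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Example grid; the values are illustrative, not tuned recommendations.
param_grid = {"max_depth": [2, 3, 5, None], "min_samples_leaf": [1, 3, 5]}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=cv, scoring="accuracy"
)
search.fit(X, y)
print("best tree params:", search.best_params_)
print(f"best tree CV accuracy: {search.best_score_:.3f}")

# Ensemble baseline evaluated on the same folds for a like-for-like comparison.
forest_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0), X, y, cv=cv
)
print(f"random forest CV accuracy: {forest_scores.mean():.3f}")
```

Reusing one `cv` object for both models means the tree and the forest are scored on identical splits, so the two numbers are directly comparable.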

📦 Requirements

pandas==2.0.3
numpy==1.25.2
matplotlib==3.7.2
seaborn==0.12.2
scikit-learn==1.3.0
