Machine Learning project predicting calories burned using regression and classification models (Gradient Boosting, Logistic Regression, Neural Network).

9eek9/Predicting_Calories_Burned

🏋️‍♀️ Predicting Calories Burned Using Machine Learning

This project focuses on predicting the number of calories burned during workouts using various Machine Learning and Deep Learning models.
It involves both Regression (continuous calorie prediction) and Classification (categorizing calorie burn level) tasks.

Developed as part of the Fanshawe College AI & ML coursework, this project demonstrates practical model comparison, evaluation, and interpretability for real-world fitness analytics.


🎯 Objective

The main goal is to predict the calories burned based on biometric and activity features such as age, gender, height, weight, workout type, and heart rate.
Two complementary approaches were implemented:

  1. Regression: Predict actual calories burned (continuous value).
  2. Classification: Predict calorie burn category (High / Low).
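
The README does not say how the High / Low categories were derived from Calories_Burned. As a minimal sketch only, one common choice is to split at the median of the target; the median threshold, the function name `label_calorie_burn`, and the sample values below are all assumptions for illustration, not the project's actual rule.

```python
from statistics import median


def label_calorie_burn(calories, threshold=None):
    """Map continuous calorie values to "High" / "Low" labels.

    If no threshold is given, the median of the input is used.
    The median split is an assumption; the project may use another cutoff.
    """
    if threshold is None:
        threshold = median(calories)
    return ["High" if c > threshold else "Low" for c in calories]


# Made-up calorie values, for illustration only.
sample_calories = [310.0, 450.0, 980.0, 1200.0, 640.0, 720.0]
labels = label_calorie_burn(sample_calories)
print(labels)  # ['Low', 'Low', 'High', 'High', 'Low', 'High']
```

A median split yields roughly balanced classes, which keeps accuracy a meaningful metric; a fixed domain threshold would be an equally valid design if class balance is not a concern.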

📊 Dataset Overview

  • Source: Kaggle – Calories Burned Prediction Dataset
  • Rows: 1,500+
  • Features:
| Category | Features |
| --- | --- |
| Demographics | Age, Gender, Height, Weight, BMI |
| Workout Stats | Max_BPM, Avg_BPM, Resting_BPM, Session_Duration |
| Lifestyle | Workout_Type, Workout_Frequency, Experience_Level, Water_Intake |
| Target | Calories_Burned |

⚙️ Data Preprocessing

  • Label encoding for categorical variables (Gender, Workout_Type, etc.)
  • Standard scaling of numeric features (zero mean, unit variance) so that scale-sensitive models see comparable feature ranges
  • Derived new feature BMI = weight / height² (kg/m²)
  • Removed outliers using IQR filtering
  • Split dataset → 80% Train / 20% Test
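
The preprocessing steps listed above could be sketched roughly as follows. This is not the notebook's code: the data is synthetic, the units (metres, kilograms, hours), the 1.5 × IQR rule, the choice to filter on the target column, and fitting the scaler on the training split only are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Synthetic stand-in for the Kaggle data (values are made up).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Age": rng.integers(18, 60, n),
    "Gender": rng.choice(["Male", "Female"], n),
    "Height": rng.normal(1.72, 0.10, n),           # metres (assumed unit)
    "Weight": rng.normal(74, 12, n),               # kilograms (assumed unit)
    "Workout_Type": rng.choice(["Cardio", "HIIT", "Strength", "Yoga"], n),
    "Avg_BPM": rng.integers(120, 170, n),
    "Session_Duration": rng.uniform(0.5, 2.0, n),  # hours (assumed unit)
})
df["Calories_Burned"] = (
    6.0 * df["Avg_BPM"] * df["Session_Duration"] + rng.normal(0, 40, n)
)

# Derived feature: BMI = weight / height^2.
df["BMI"] = df["Weight"] / df["Height"] ** 2


def iqr_filter(frame, column, k=1.5):
    """Keep rows whose `column` value lies within k * IQR of the quartiles."""
    q1, q3 = frame[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    keep = frame[column].between(q1 - k * iqr, q3 + k * iqr)
    return frame[keep].copy()


# Outlier removal, applied here to the target column only (an assumption).
df = iqr_filter(df, "Calories_Burned")

# Label-encode the categorical columns.
for col in ["Gender", "Workout_Type"]:
    df[col] = LabelEncoder().fit_transform(df[col])

# 80% train / 20% test split.
X = df.drop(columns="Calories_Burned")
y = df["Calories_Burned"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standard scaling; for brevity every column is scaled, including encoded ones.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training split and only transforming the test split keeps test-set statistics out of training, which is one reasonable way to order these steps.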

🤖 Models Implemented

🔹 Classification Models

| Model | Accuracy | F1-Score | Observation |
| --- | --- | --- | --- |
| Logistic Regression | 95.89% | 0.959 | Best performing model |
| Neural Network | 94.35% | 0.944 | Strong deep learning alternative |
| SVC | 93.84% | 0.937 | High precision |
| Gradient Boosting | 92.82% | 0.928 | Balanced results |
| Decision Tree | 90.25% | 0.900 | Slight overfitting |

Logistic Regression achieved the highest accuracy (95.89%) and F1-score (0.959) of the five classifiers.
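
For readers who want to see how such a comparison is typically scored, here is a minimal sketch of training a Logistic Regression classifier and computing accuracy and F1. It uses synthetic data from `make_classification` rather than the project's dataset, and the weighted F1 averaging is an assumption, so the numbers it prints will not match the table.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary problem standing in for the High / Low calorie labels.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scaling and the classifier are chained so the scaler is fit on train only.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred, average="weighted")  # averaging is an assumption
print(f"Accuracy: {accuracy:.4f}  F1: {f1:.4f}")
```

Swapping `LogisticRegression` for another scikit-learn estimator (for example `SVC` or `DecisionTreeClassifier`) in the same pipeline gives a like-for-like comparison.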


🔹 Regression Models

| Model | MSE | MAE | Observation |
| --- | --- | --- | --- |
| Gradient Boosting | 851.25 | 20.06 | Best regressor |
| Neural Network | 1102.09 | 26.16 | Competitive results |
| Linear Regression | 1646.18 | 30.27 | Baseline model |
| SVR | 1692.80 | 29.82 | Generalizes well |
| Decision Tree | 4538.18 | 50.58 | Overfitting observed |

Gradient Boosting Regressor achieved the lowest MSE (851.25) and MAE (20.06) of the five regressors.
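
The regression metrics in the table can be computed along these lines. This is a sketch on synthetic data from `make_regression` with default `GradientBoostingRegressor` settings, both of which are assumptions; it illustrates the scoring procedure and does not reproduce the reported values.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression problem standing in for Calories_Burned.
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Default hyperparameters; the project's tuned values are not given in the README.
reg = GradientBoostingRegressor(random_state=42)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE: {mse:.2f}  MAE: {mae:.2f}  R^2: {r2:.3f}")
```

MSE is in squared target units, so taking its square root (RMSE) gives a figure directly comparable with MAE.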


📈 Evaluation Metrics

  • Classification: Accuracy, Precision, Recall, F1-score, Confusion Matrix
  • Regression: Mean Absolute Error (MAE), Mean Squared Error (MSE), R² Score
  • Cross-validation: 5-fold validation
  • Hyperparameter Tuning: RandomizedSearchCV for optimal parameters
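
As an illustration of how 5-fold cross-validation and RandomizedSearchCV fit together, the sketch below tunes a Gradient Boosting regressor on synthetic data. The search space, `n_iter=5`, and the MAE-based scoring are assumptions chosen to keep the example small; the README does not list the grids that were actually searched.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data standing in for the preprocessed training set.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Hypothetical search space, for illustration only.
param_distributions = {
    "n_estimators": [50, 100, 150],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "max_depth": [2, 3, 4],
}

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,                            # number of sampled configurations
    cv=5,                                # 5-fold cross-validation per configuration
    scoring="neg_mean_absolute_error",   # scikit-learn maximizes, hence the sign
    random_state=0,
)
search.fit(X, y)

best_mae = -search.best_score_  # flip the sign back to a positive MAE
print("Best parameters:", search.best_params_)
print(f"Cross-validated MAE: {best_mae:.2f}")
```

Each sampled configuration is scored as the mean over the five folds, and `search.best_estimator_` is refit on all of the supplied data with the winning parameters.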

🔍 Feature Importance

  • Session Duration and Avg BPM were the two most influential features.
  • Weight and Experience Level also contributed significantly.

Feature importance was visualized using Gradient Boosting feature importances and SHAP values.
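
A minimal sketch of reading impurity-based importances from a fitted Gradient Boosting model is shown below. The data is synthetic and the target is deliberately constructed so that the first two columns dominate, so the ranking it prints is built in by assumption and is not evidence for the project's finding. SHAP values are not shown here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Column names borrowed from the dataset; the values are random noise.
feature_names = ["Session_Duration", "Avg_BPM", "Weight", "Age"]
X = rng.normal(size=(400, 4))

# Synthetic target: the first two columns dominate by construction.
y = 5.0 * X[:, 0] + 3.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=400)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# feature_importances_ is normalized to sum to 1 across features.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:<18} {score:.3f}")
```

Impurity-based importances can favor high-cardinality features, which is one reason to cross-check them against a model-agnostic method such as SHAP or permutation importance.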


💡 Key Insights

  • Simpler models (like Logistic Regression) can outperform complex ones with good preprocessing.
  • Neural networks add flexibility but require tuning and longer training time.
  • Heart rate and workout duration are strong predictors of calorie burn.

🧰 Tech Stack

  • Python 3.10+
  • Pandas, NumPy, Matplotlib, Seaborn
  • Scikit-learn
  • TensorFlow / Keras
  • XGBoost / scikit-learn GradientBoostingRegressor
  • Jupyter Notebook

⚙️ How to Run

# 1️⃣ Install dependencies
pip install -r requirements.txt

# 2️⃣ Open the Jupyter notebook
jupyter notebook Predicting_Calories_Burned.ipynb

📁 Project Structure

Predicting_Calories_Burned/
│
├── Predicting_Calories_Burned.ipynb   # Main notebook
├── requirements.txt
└── README.md

🧩 Results Summary

| Task | Best Model | Metric | Score |
| --- | --- | --- | --- |
| Classification | Logistic Regression | Accuracy | 95.89% |
| Regression | Gradient Boosting | MSE | 851.25 |

🚀 Applications

  • Fitness tracking systems and smartwatches
  • Personalized calorie estimation for users
  • AI fitness assistants and gym dashboards

👩‍💻 Author

Ei Ei Khaing
Graduate Certificate in Artificial Intelligence & Machine Learning
Fanshawe College | London, Ontario, Canada

🔗 LinkedIn
💻 GitHub


🏷️ Keywords

Machine Learning, Regression, Classification, Gradient Boosting, Logistic Regression, Neural Network, Calorie Prediction, Fitness Analytics
