This is a machine learning-powered web application that predicts CO₂ emissions of vehicles based on technical specifications. It also provides explainable AI features using SHAP and LIME to interpret predictions.
Dataset: MY1995-2023 Fuel Consumption Ratings
Contains fuel consumption data and CO₂ emissions for vehicles sold in Canada.
Features used:
- EngineSize_L
- Cylinders
- FuelConsCity_L100km
- FuelConsHwy_L100km
- Comb_L100km
- Comb_mpg
- FuelType
Target: CO2Emission_g_km
- Preprocessing using StandardScaler and OneHotEncoder
- Model: RandomForestRegressor with 100 estimators
- Train-test split: 80/20
- Pipeline created using scikit-learn's Pipeline and ColumnTransformer
- Serialized using pickle to model_pipeline.pkl
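The training steps listed here could be wired together roughly as follows. This is a minimal sketch, not the project's actual training script: the synthetic data stands in for the CSV, and `handle_unknown="ignore"`, the `random_state` values, the fuel type codes, and the helper names `make_synthetic_data` and `build_pipeline` are assumptions made for illustration.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["EngineSize_L", "Cylinders", "FuelConsCity_L100km",
           "FuelConsHwy_L100km", "Comb_L100km", "Comb_mpg"]
CATEGORICAL = ["FuelType"]
TARGET = "CO2Emission_g_km"


def make_synthetic_data(n=200, seed=0):
    """Stand-in for the CSV: random, made-up rows (not real measurements)."""
    rng = np.random.default_rng(seed)
    city = rng.uniform(6.0, 20.0, n)
    hwy = city * rng.uniform(0.6, 0.9, n)
    comb = 0.55 * city + 0.45 * hwy
    df = pd.DataFrame({
        "EngineSize_L": rng.uniform(1.0, 6.0, n),
        "Cylinders": rng.choice([4.0, 6.0, 8.0], n),
        "FuelConsCity_L100km": city,
        "FuelConsHwy_L100km": hwy,
        "Comb_L100km": comb,
        "Comb_mpg": 282.48 / comb,  # mpg is inversely related to L/100 km
        "FuelType": rng.choice(["X", "Z", "D"], n),  # assumed category codes
    })
    # Arbitrary linear relationship plus noise, only so the model has a signal.
    df[TARGET] = 23.0 * comb + rng.normal(0.0, 3.0, n)
    return df


def build_pipeline():
    """Scale numeric columns, one-hot encode FuelType, then fit a forest."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("model", RandomForestRegressor(n_estimators=100, random_state=42)),
    ])


df = make_synthetic_data()
X_train, X_test, y_train, y_test = train_test_split(
    df[NUMERIC + CATEGORICAL], df[TARGET], test_size=0.2, random_state=42)

pipeline = build_pipeline()
pipeline.fit(X_train, y_train)
r2 = pipeline.score(X_test, y_test)  # R^2 on the held-out 20%

# In-memory pickle round trip; writing to model_pipeline.pkl would use
# pickle.dump(pipeline, f) on a file opened in "wb" mode instead.
restored = pickle.loads(pickle.dumps(pipeline))
```

Keeping the scaler and encoder inside the pipeline means the pickled object accepts raw feature columns at prediction time, so the app does not need to repeat the preprocessing.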
- Interactive UI built with Streamlit
- Users can select vehicle specs (Make, Model, Engine Size, etc.)
- Predicts CO₂ emissions upon input
- Provides model explainability via SHAP and LIME
- Visuals generated using matplotlib
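The input-to-prediction flow described above might look something like the sketch below. It is not the contents of app.py: the widget labels, value ranges, fuel type codes, and the helper `build_input_row` are hypothetical, and Make and Model are left out because they are not among the listed model features.

```python
import pickle

import pandas as pd

MODEL_PATH = "model_pipeline.pkl"
FEATURES = ["EngineSize_L", "Cylinders", "FuelConsCity_L100km",
            "FuelConsHwy_L100km", "Comb_L100km", "Comb_mpg", "FuelType"]
FUEL_TYPES = ["X", "Z", "D", "E", "N"]  # assumed category codes


def build_input_row(specs):
    """Turn a dict of user-selected specs into a one-row DataFrame."""
    missing = [name for name in FEATURES if name not in specs]
    if missing:
        raise ValueError(f"missing inputs: {missing}")
    return pd.DataFrame([{name: specs[name] for name in FEATURES}])


def main():
    # Imported here so build_input_row can be used without Streamlit installed.
    import streamlit as st

    with open(MODEL_PATH, "rb") as f:
        pipeline = pickle.load(f)

    st.title("Vehicle CO₂ emission predictor")
    specs = {
        "EngineSize_L": st.number_input(
            "Engine size (L)", min_value=0.5, max_value=10.0, value=2.0, step=0.1),
        "Cylinders": st.number_input(
            "Cylinders", min_value=2, max_value=16, value=4, step=1),
        "FuelConsCity_L100km": st.number_input(
            "City consumption (L/100 km)", min_value=1.0, max_value=40.0, value=9.0),
        "FuelConsHwy_L100km": st.number_input(
            "Highway consumption (L/100 km)", min_value=1.0, max_value=40.0, value=7.0),
        "Comb_L100km": st.number_input(
            "Combined consumption (L/100 km)", min_value=1.0, max_value=40.0, value=8.0),
        "Comb_mpg": st.number_input(
            "Combined consumption (mpg)", min_value=5, max_value=100, value=35, step=1),
        "FuelType": st.selectbox("Fuel type", FUEL_TYPES),
    }

    # Only predict once the user asks for it.
    if st.button("Predict"):
        prediction = pipeline.predict(build_input_row(specs))[0]
        st.success(f"Predicted CO₂ emissions: {prediction:.0f} g/km")
```

To use this as a Streamlit script, add a call to `main()` at the end of the file. Separating `build_input_row` from the UI code keeps the column order in one place and makes it possible to check the input handling without launching the app.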
- SHAP: Shows global feature importance using Shapley values
- LIME: Explains individual predictions with a local surrogate model
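- PDP vs ICE: A partial dependence plot (PDP) shows the average effect of a feature on the prediction; individual conditional expectation (ICE) curves show how that effect varies from one vehicle to another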
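To make the PDP and ICE distinction concrete, the sketch below computes both with scikit-learn's `partial_dependence` on a small random forest. The two-feature synthetic data and the choice of `Comb_L100km` as the inspected feature are assumptions for illustration; the app's own plots may be produced differently.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence

# Made-up data with an arbitrary linear signal (not the real dataset).
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "EngineSize_L": rng.uniform(1.0, 6.0, 150),
    "Comb_L100km": rng.uniform(5.0, 18.0, 150),
})
y = 23.0 * X["Comb_L100km"] + 5.0 * X["EngineSize_L"] + rng.normal(0.0, 2.0, 150)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# kind="both" returns the ICE curves and their average in a single call.
result = partial_dependence(
    model, X, features=["Comb_L100km"], kind="both",
    method="brute", grid_resolution=20)

ice = result["individual"][0]  # one curve per sample: (n_samples, n_grid_points)
pdp = result["average"][0]     # one averaged curve: (n_grid_points,)

# The PDP is the pointwise mean of the ICE curves.
pdp_matches_ice_mean = np.allclose(pdp, ice.mean(axis=0))
```

Because the PDP is just the mean of the ICE curves, it can hide cases where a feature pushes predictions up for some vehicles and down for others; plotting the individual curves alongside it, for example with `PartialDependenceDisplay.from_estimator(..., kind="both")`, makes that spread visible.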
The application is deployed on Render and is accessible through a live web URL.
- app.py – Streamlit web application
- model_pipeline.pkl – Serialized ML pipeline
- MY1995-2023-Fuel-Consumption-Ratings.csv – Dataset
- X_test.csv – Sample test data
Step 1: Install requirements

    pip install streamlit pandas scikit-learn shap lime matplotlib

Step 2: Run the app

    streamlit run app.py
This project demonstrates how to combine machine learning with explainable AI to create transparent and user-friendly predictive systems. By using SHAP, LIME, PDP, and ICE, the app offers valuable insight into model behavior and fosters trust in predictions.