EV_Vehicle_Prediction

This project provides a comprehensive pipeline to forecast the total number of Electric Vehicles (EVs) in each county over time using historical data. The workflow includes data preprocessing, feature engineering, model training with hyperparameter tuning, and model evaluation.

🔗 Live Demo: Streamlit App – EV Vehicle Prediction

📊 Project Overview

Using the Electric_Vehicle_Population_By_County.csv dataset, this project:

Cleans and processes time-series EV data at the county level.
Engineers features to capture temporal trends and growth patterns.
Trains a RandomForestRegressor to predict EV totals.
Evaluates model performance and visualizes feature importance.
Forecasts EV adoption for any given county over the next 3 years.
Saves and tests the trained model with joblib.
Deploys an interactive forecasting tool via Streamlit (app.py).

🚀 Interactive Forecasting App

You can interact with the model using the built-in Streamlit dashboard.

🔧 To run the app locally:

```
streamlit run app.py
```

📊 Project Highlights

✅ Data Preprocessing

Handled missing values in County and State
Converted vehicle count columns to numeric
Capped outliers in Percent Electric Vehicles
Converted Date column to datetime
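
The cleaning steps listed here could be sketched roughly as follows. This is an illustration rather than the notebook's actual code: the `preprocess` helper, the `"Unknown"` placeholder for missing labels, and the 1.5 × IQR capping rule are all assumptions, and the column names are taken from the Dataset section.

```python
import pandas as pd

# Count columns named in the Dataset section of this README.
COUNT_COLS = [
    "Electric Vehicle (EV) Total",
    "Battery Electric Vehicles (BEVs)",
    "Plug-In Hybrid Electric Vehicles (PHEVs)",
    "Non-Electric Vehicle Total",
    "Total Vehicles",
]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning helper; the specific choices are assumptions."""
    out = df.copy()

    # Parse dates; rows whose date cannot be parsed are dropped.
    out["Date"] = pd.to_datetime(out["Date"], errors="coerce")
    out = out.dropna(subset=["Date"]).copy()

    # Fill missing location labels with a placeholder (assumed strategy).
    out[["County", "State"]] = out[["County", "State"]].fillna("Unknown")

    # Counts may arrive as strings with thousands separators, e.g. "1,234".
    for col in COUNT_COLS:
        if col in out.columns:
            cleaned = out[col].astype(str).str.replace(",", "", regex=False)
            out[col] = pd.to_numeric(cleaned, errors="coerce")

    # Cap outliers at 1.5 * IQR beyond the quartiles (assumed rule).
    pct = out["Percent Electric Vehicles"]
    q1, q3 = pct.quantile(0.25), pct.quantile(0.75)
    iqr = q3 - q1
    out["Percent Electric Vehicles"] = pct.clip(
        lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr
    )
    return out.reset_index(drop=True)
```

Capping with `clip` keeps every row and only pulls extreme percentages back to the fence values, which suits a time series where dropping a month would leave a gap.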

🔍 Exploratory Data Analysis

Identified top/bottom counties by EV adoption
Visualized stacked vehicle distributions
Calculated total counts for BEVs, PHEVs, EVs, and Non-EVs

🛠️ Feature Engineering

Lag features (1 to 3 months)
Rolling 3-month EV average
Percent change over 1 and 3 months
Cumulative EVs per county
6-month rolling slope for growth trend
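
A minimal sketch of how these five feature families might be built with pandas. The function and output column names (`add_features`, `ev_lag1`, `ev_slope_6`, and so on) are hypothetical, and the sketch assumes one row per county per month. Whether the rolling and cumulative features should be shifted by one month, so that they never include the value being predicted, is a modeling decision this README does not state; the code below uses the unshifted series.

```python
import numpy as np
import pandas as pd

TARGET = "Electric Vehicle (EV) Total"


def window_slope(values: np.ndarray) -> float:
    """Least-squares slope of a window against the positions 0..n-1."""
    x = np.arange(len(values))
    return float(np.polyfit(x, values, 1)[0])


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add per-county time-series features (hypothetical column names)."""
    out = df.sort_values(["County", "Date"]).copy()
    grp = out.groupby("County")[TARGET]

    # Lag features: the value 1, 2 and 3 months earlier in the same county.
    for lag in (1, 2, 3):
        out[f"ev_lag{lag}"] = grp.shift(lag)

    # Rolling 3-month average within each county.
    out["ev_roll_mean_3"] = grp.transform(lambda s: s.rolling(3).mean())

    # Percent change over 1 and 3 months (infinite if the earlier value is 0).
    out["ev_pct_change_1"] = out[TARGET] / grp.shift(1) - 1
    out["ev_pct_change_3"] = out[TARGET] / grp.shift(3) - 1

    # Running total per county.
    out["ev_cumulative"] = grp.cumsum()

    # Slope of a straight line fitted to the last 6 months.
    out["ev_slope_6"] = grp.transform(
        lambda s: s.rolling(6).apply(window_slope, raw=True)
    )
    return out
```

Grouping by county before shifting or rolling is what keeps one county's history from leaking into the first rows of the next.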

🧠 Model Training

Model: RandomForestRegressor
Hyperparameter Tuning: RandomizedSearchCV (30 iterations, 3-fold CV)

Best Parameters:

```
{
    'n_estimators': 200,
    'min_samples_split': 4,
    'min_samples_leaf': 1,
    'max_features': None,
    'max_depth': 15
}
```
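
For reference, a search of this shape (30 sampled candidates, 3-fold CV) might be set up as below. The candidate values, the `r2` scoring choice, and the `tune_random_forest` helper are assumptions for illustration; the grid simply includes the best values reported above and is not the notebook's actual search space.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Assumed search space; it contains the reported best values among others.
PARAM_DISTRIBUTIONS = {
    "n_estimators": [100, 150, 200, 250],
    "max_depth": [None, 5, 10, 15],
    "min_samples_split": [2, 4, 6, 8],
    "min_samples_leaf": [1, 2, 3],
    "max_features": ["sqrt", "log2", None],
}


def tune_random_forest(X, y, n_iter=30, cv=3, random_state=42):
    """Randomly sample n_iter parameter sets and score each with cv-fold CV."""
    search = RandomizedSearchCV(
        estimator=RandomForestRegressor(random_state=random_state),
        param_distributions=PARAM_DISTRIBUTIONS,
        n_iter=n_iter,
        cv=cv,
        scoring="r2",  # assumed metric
        random_state=random_state,
    )
    search.fit(X, y)
    return search
```

After fitting, `search.best_params_` holds the winning combination and `search.best_estimator_` is a model refit on all of `X` with those settings. For time-ordered data, passing `cv=TimeSeriesSplit(n_splits=3)` from `sklearn.model_selection` is one option to keep validation folds later in time than their training folds.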

📁 Dataset

The dataset should include columns such as:

Date
County, State
Electric Vehicle (EV) Total
Battery Electric Vehicles (BEVs), Plug-In Hybrid Electric Vehicles (PHEVs)
Non-Electric Vehicle Total, Total Vehicles
Percent Electric Vehicles

📈 Model and Evaluation

Model: RandomForestRegressor
Tuning: RandomizedSearchCV with cross-validation

📌 Results

✅ Evaluation Metrics

| Metric | Value |
| --- | --- |
| MAE | 132.76 |
| RMSE | 200.45 |
| R² Score | 0.89 |

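An R² of 0.89 means the model explains about 89% of the variance in EV totals on the evaluation data. For a large county such as Kings (roughly 14,500 EVs in the sample output below), an MAE of about 133 vehicles is under 1% of the total; for counties with small EV counts the same absolute error is proportionally much larger.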
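
The three metrics can be written out directly with NumPy, which makes their definitions explicit. The `evaluate` helper is a hypothetical name; `sklearn.metrics` offers `mean_absolute_error`, `mean_squared_error`, and `r2_score` for the same quantities.

```python
import numpy as np


def evaluate(y_true, y_pred):
    """Return MAE, RMSE and R² for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    mae = np.mean(np.abs(errors))         # average absolute miss
    rmse = np.sqrt(np.mean(errors ** 2))  # penalises large misses more
    ss_res = np.sum(errors ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": float(mae), "RMSE": float(rmse), "R2": float(r2)}
```

RMSE is always at least as large as MAE; a wide gap between the two, as in the table above, suggests that a minority of predictions miss by much more than the typical error.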

🧾 Sample Output

```
        Date         County       Predicted_EV_Total
0   2025-08-01      Kings             14527
1   2025-09-01      Kings             14862
2   2025-10-01      Kings             15230
3   2025-11-01      Kings             15575
4   2025-12-01      Kings             15940
5   2026-01-01      Kings             16294
```

💾 Model Persistence

Model saved to: forecasting_ev_model.pkl.
Successfully reloaded and tested.

To avoid retraining:

```
from joblib import load
model = load('forecasting_ev_model.pkl')
```
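
The saving side is not shown here; a round trip with `joblib.dump` and `joblib.load` might look like the sketch below. The toy data, the temporary file name, and the `save_and_reload` helper are stand-ins for illustration, not the project's training setup.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.ensemble import RandomForestRegressor


def save_and_reload(model, path):
    """Write a fitted model to disk and read it straight back."""
    dump(model, path)
    return load(path)


# Toy data standing in for the engineered feature matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X[:, 0] * 2.0 + X[:, 1]
model = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    reloaded = save_and_reload(model, os.path.join(tmp, "demo_model.pkl"))

# The reloaded model should reproduce the original predictions.
same = np.allclose(model.predict(X), reloaded.predict(X))
```

Pickled scikit-learn models are tied to the library version that wrote them, so it is worth loading the file with the same scikit-learn version used for training.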

🔍 Single Sample Test

```
Actual EVs:    1025.00
Predicted EVs: 998.23
```

🔮 Forecasting

📍 County-Level Forecasting

Forecasts the next 36 months of EV growth for a selected county (e.g., Kings).

Includes:

Monthly predicted EV counts
Cumulative EV count trendline
Comparison between historical and forecasted values
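
Because the model relies on lag features, a multi-month forecast has to be produced one step at a time, feeding each prediction back in as the newest lag. The sketch below shows that recursion under simplifying assumptions: only three lag features are used (the real model has more), `forecast_recursive` is a hypothetical helper, and `TrendStub` is a stand-in for the trained model so the example runs on its own.

```python
import numpy as np
import pandas as pd


class TrendStub:
    """Stand-in model: predicts lag1 + (lag1 - lag2). Not the real model."""

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return X[:, 0] + (X[:, 0] - X[:, 1])


def forecast_recursive(model, history, last_date, horizon=36, n_lags=3):
    """Roll a one-step model forward `horizon` months from `last_date`."""
    window = [float(v) for v in history[-n_lags:]]
    start = pd.Timestamp(last_date) + pd.offsets.MonthBegin(1)
    dates = pd.date_range(start, periods=horizon, freq="MS")

    preds = []
    for _ in range(horizon):
        # Feature order is most recent first: lag1, lag2, lag3.
        features = np.array(window[::-1]).reshape(1, -1)
        next_val = float(model.predict(features)[0])
        preds.append(next_val)
        # Slide the window: drop the oldest value, append the prediction.
        window = window[1:] + [next_val]

    return pd.DataFrame({"Date": dates, "Predicted_EV_Total": preds})


forecast = forecast_recursive(TrendStub(), [10, 20, 30], "2025-07-01", horizon=3)
```

Two caveats apply to this approach. Errors compound, since every step builds on earlier predictions. And a random forest averages training targets, so it cannot predict above the largest value seen in training, which may flatten long-horizon forecasts for a steadily growing county.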

🌍 Top-5 Counties Forecast

Forecasted next 3 years for the top 5 counties (based on cumulative EV adoption)
Combined historical and future trendlines
Visual comparison of growth rates across counties
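
Picking the top counties can be a short groupby. This sketch assumes "cumulative EV adoption" means the sum of the EV total column over all dates for a county; `top_counties` is a hypothetical helper name.

```python
import pandas as pd

TARGET = "Electric Vehicle (EV) Total"


def top_counties(df: pd.DataFrame, n: int = 5) -> list:
    """Return the n counties with the largest summed EV totals."""
    totals = df.groupby("County")[TARGET].sum()
    return totals.nlargest(n).index.tolist()
```

The resulting list can then be looped over, running the single-county forecast once per county and plotting the results on shared axes.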

🌍 Multi-County Comparison

The Streamlit app supports:

Selecting up to 3 counties
Side-by-side EV growth comparison
Growth % summaries

📊 Visualizations

🔹 EV Breakdown vs Total Vehicles

Stacked column chart comparing:

BEV (Battery Electric Vehicles)
PHEV (Plug-in Hybrids)
EV (total)
Non-EVs

It highlights the share of EVs in the overall vehicle population.

🔹 Actual vs Predicted EV Count

Line plot showing the RandomForest model's predictions vs actual EV counts across sample indices.
Close overlap indicates strong model accuracy.

🔹 Feature Importance

Bar plot displaying the importance scores of engineered features like:

Lag values
Rolling averages
Percent changes

Used to assess the model's key drivers of prediction.
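
A fitted `RandomForestRegressor` exposes these scores through its `feature_importances_` attribute. The sketch below pairs them with feature names and sorts them; the feature names, the toy data, and the `importance_table` helper are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def importance_table(model, feature_names):
    """Pair each feature name with its importance, largest first."""
    scores = pd.Series(model.feature_importances_, index=feature_names)
    return scores.sort_values(ascending=False)


# Toy data in which the first feature drives the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 5.0 * X[:, 0] + rng.normal(scale=0.1, size=200)
names = ["ev_lag1", "ev_roll_mean_3", "ev_pct_change_1"]  # hypothetical names

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
table = importance_table(model, names)
```

Calling `table.plot.barh()` with matplotlib installed gives the bar chart. These impurity-based scores sum to 1 and tend to split credit between correlated features, such as adjacent lags, so a low score does not always mean a feature is uninformative.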

🔹 County-Level Forecast: Kings County (Monthly)

Historical vs 36-month forecast for Kings County showing monthly EV growth trends.

🔹 Cumulative EV Forecast: Kings County

Chart showing cumulative EV adoption over time, including projected growth for the next 3 years.

🔹 Top 5 Counties Forecast

Visualization of historical and projected cumulative EV growth for the top 5 counties:

Fairfax
Honolulu
Los Angeles
Orange
Santa Clara

📊 Dashboard Insights & Visualizations

The interactive dashboard provides actionable insights into EV adoption trends through dynamic visualizations and comparative analysis. Below are key components demonstrated through the app's outputs:

🔍 Single-County Deep Dive

Features:

County Selection: Analyze specific counties (e.g., Ada) with adjustable forecast horizons (12–60 months).

Model Metrics:

MAE: 0.1
RMSE: 0.3
MAPE: 4.8%

Advanced Options:

Seasonality analysis
Monthly breakdowns
Historical vs. forecasted comparisons

Example Insights:

Ada County shows a projected increase from 1.5 to 2.0 EVs/month (31.1% growth rate).

📈 Trend Visualizations

Monthly Adoption Forecast

Tracks granular monthly EV counts (e.g., 1.2 to 2.0 EVs/month in Ada).

Cumulative Adoption Projection

Visualizes long-term EV accumulation (e.g., ~150 EVs by 2027 in Ada).

Forecast Data Preview

Tabular preview of forecasted values (e.g., consistent 2 EVs/month for Ada in 2026–2027).

↔️ Multi-County Benchmarking

Features:

Compare up to 3 counties (e.g., Ada vs. Alameda).
Metrics: Cumulative counts, monthly adoption rates, or growth percentages.

Key Results

Side-by-side historical and forecasted trends.

Highlighted Metrics:

Autauga: 104.7% growth (1.9 → 3.9 EVs/month)
Alameda: 7.3% growth despite a 52.3% cumulative decline
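
The README does not define how the growth percentage is computed. A common definition, assumed here, is the relative change between a starting and an ending monthly rate; `growth_pct` is a hypothetical helper.

```python
def growth_pct(start: float, end: float) -> float:
    """Percent change from `start` to `end` (assumed definition)."""
    if start == 0:
        raise ValueError("growth is undefined when the starting value is 0")
    return (end - start) / abs(start) * 100.0


# Rounded Autauga endpoints from the dashboard summary.
autauga = growth_pct(1.9, 3.9)
```

With the rounded endpoints 1.9 and 3.9 this gives about 105.3%, slightly off the reported 104.7%, presumably because the app works from unrounded values.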

📉 Growth Rate Breakdown

Bar charts comparing county-level growth percentages.

Tabulated Summaries:

| County | Historical EVs | Forecasted EVs | Growth Rate |
| --- | --- | --- | --- |
| Ada | 90 | 72 | 31.1% |
| Alameda | 302 | 144 | 7.3% |

📁 Files

| File | Description |
| --- | --- |
| `Electric_Vehicle_Population_By_County.csv` | Raw EV dataset |
| `preprocessed_ev_data.csv` | Cleaned and feature-engineered data |
| `forecasting_ev_model.pkl` | Trained RandomForest regression model |
| `ev_forecasting.ipynb` | Full pipeline notebook with forecasting |
| `README.md` | Project overview and instructions |

🛠 Setup Instructions

Clone the Repository

```
git clone https://github.com/your-username/ev-forecasting.git
cd ev-forecasting
```

Install Dependencies

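Make sure Python ≥ 3.7 is installed, then install the required packages (including Jupyter and Streamlit, which the notebook and app steps below need):
```
pip install pandas numpy matplotlib seaborn scikit-learn joblib jupyter streamlit
```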
Run the Notebook

Ensure the dataset Electric_Vehicle_Population_By_County.csv is in the working directory and run:
```
jupyter notebook ev_forecasting.ipynb
```
Run the Streamlit App

To launch the interactive forecaster:
```
streamlit run app.py
```

🆚 Notebook vs App: When to Use What?

| Tool | Purpose |
| --- | --- |
| `ev_forecasting.ipynb` | Explore full data pipeline, modeling, and evaluation |
| `app.py` | Interactive forecasting tool for end-users |

🧠 Future Improvements

Integrate demographic data like population, income, or GDP by county.
Try gradient boosting models like XGBoost or LightGBM.
Explore deep learning with LSTM for sequential forecasting.
Deploy via Docker or to Streamlit Cloud for public access.

📃 License

This project is open-source and licensed under the MIT License.

🙌 Credits

Prepared for the AICTE Internship Cycle 2 by S4F
