jieying-tech/Rain-Prediction


Rain Prediction

About The Project

Data source:
https://www.kaggle.com/jsphyg/weather-dataset-rattle-package
The dataset contains 10 years of daily weather observations from many locations across Australia.

This project was built with:

Project Summary

  1. Data Understanding
    • Descriptions of variables
  2. Exploratory Data Analysis
    • Exploring the categorical and numerical variables
    • Feature engineering of Date variable
    • Outlier detection using boxplots
    • Checking the distribution of numerical variables using histograms
    • Checking the distribution of target variable (class distribution)
    • Correlation analysis
    • Checking for duplicates
  3. Data Pre-processing
    • Handling missing values
    • Removing outliers
    • Categorical data encoding
    • Feature scaling
    • Feature selection
  4. Training Baseline Models
    • Created a function that evaluates multiple models on multiple metrics using cross-validation

    [Figure: performance evaluation]

  5. Shortlisting the Best Models
    • Selected the top 3 models
  6. Hyperparameter Tuning
    • Determined the best parameters of the models using Randomized Search Cross Validation
  7. Building Ensemble Models
    • Built 3 ensemble models using Stacking Classifier
  8. Model Evaluation
    • Evaluated the performance of the 3 initial shortlisted models and the 3 ensemble models
    • Plotted learning curves to compare the performance of the models on training and testing data
    • Determined the best model
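
Steps 4 to 7 above can be illustrated with a minimal scikit-learn sketch. It is not the project's actual code: the README does not name the models, metrics, or parameter ranges that were used, so the three baseline classifiers, the `SCORING` list, the search space, and the helper name `evaluate_models` are all assumptions chosen for illustration. A small synthetic dataset stands in for the preprocessed weather features.

```python
import numpy as np
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, cross_validate
from sklearn.tree import DecisionTreeClassifier

# Assumed metrics; the README only says "multiple metrics".
SCORING = ["accuracy", "f1", "roc_auc"]


def evaluate_models(models, X, y, cv=3):
    """Hypothetical helper: mean cross-validated score per model and metric."""
    results = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, scoring=SCORING)
        results[name] = {m: float(np.mean(scores[f"test_{m}"])) for m in SCORING}
    return results


# Synthetic, imbalanced stand-in for the preprocessed dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, weights=[0.78, 0.22], random_state=0
)

# Step 4: baseline models (illustrative choices, not the project's list).
baselines = {
    "logreg": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(random_state=0),
    "forest": RandomForestClassifier(n_estimators=30, random_state=0),
}
baseline_scores = evaluate_models(baselines, X, y)

# Step 6: randomized search over an assumed parameter space.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": randint(20, 60),
        "max_depth": randint(2, 8),
    },
    n_iter=4,
    cv=3,
    scoring="f1",
    random_state=0,
)
search.fit(X, y)

# Step 7: stack the tuned model with another base learner.
stack = StackingClassifier(
    estimators=[
        ("forest", search.best_estimator_),
        ("tree", DecisionTreeClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)
stack_scores = evaluate_models({"stack": stack}, X, y)

for name, metrics in {**baseline_scores, **stack_scores}.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})
```

Reusing the same evaluation helper for the baselines and the stacked model keeps the comparison in step 8 on identical folds and metrics. On a real dataset the search should be fitted on training data only, with a held-out test set reserved for the final comparison.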

Collaborators

  • Andy Chow Sai Kit
  • Wong Yew Lee
  • Li Chen Zhen

About

Building multiple machine learning models to predict next-day rain in Australia
