eerichmond/ml-wildfire-prediction

Predict California Wildfires from Weather and Soil Conditions

One topic I am passionate about is the environment, especially the impact that climate change has on our natural world and standard of living. To see what climate-related datasets were available, I scoured Kaggle.com for high-quality datasets involving the environment. Two caught my attention because they were so close to home: United States wildfires over a 24-year period, and United States droughts and soil conditions over a 20-year period. I live in the Central Valley of California (US), where every year the fires in the hills on either side of the valley grow worse, creating horrible air quality and destroying homes and forests. I am interested in predicting when and where wildfires will occur next. Identifying these locations could lead to better fire preparation and population planning.

Demo

Getting Started

  • Install Anaconda
  • conda create --name wildfire python=3.9
  • conda activate wildfire
  • brew install cmake
  • pip install -r requirements.txt
  • Install gcloud
  • gcloud auth application-default login
  • yarn --cwd ./app/ build

Run Locally

  • uvicorn app.main:app --reload
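
Once uvicorn is running, a quick way to confirm the app is reachable is a small standard-library check like the sketch below. It assumes uvicorn's default bind address of 127.0.0.1:8000 (the command above sets neither --host nor --port); the helper name server_is_up is hypothetical and not part of this repository.

```python
import urllib.error
import urllib.request


def server_is_up(url: str = "http://127.0.0.1:8000/", timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at `url`, even with an error status."""
    # Bypass any proxy configured in the environment; this is a local check.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded with an error status (e.g. 404), so it is running.
        return True
    except OSError:
        # Connection refused, timeout, or similar: nothing is listening.
        return False


if __name__ == "__main__":
    # Assumed default address; adjust if uvicorn was started with --host/--port.
    print("server up:", server_is_up())
```

An HTTP error status is treated as "up" here because the sketch makes no assumption about which routes the app serves.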

Run Tests

  • coverage run --source=./app/ -m pytest -v && coverage report
  • Watch tests: ptw --runner pytest
  • Generate coverage badge: coverage-badge -f -o coverage.svg

How to Train

  • Download fires.sqlite from Google Cloud Storage (19GB) to ./data/fires.sqlite
  • conda activate wildfire (the environment created in Getting Started)
  • python -m app.trainer.export to generate the X_train.npy, X_test.npy, y_train.npy and y_test.npy NumPy array binaries plus scalar.pickle. This is a separate step because it takes 3+ hours to turn the ~27 million geolocated weather points into a 13 GB X_train.npy
  • python -m app.trainer.train xgb to generate app/models/xgb_model.pickle
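
Because the export step takes hours, it can be worth confirming that all of its outputs exist before starting training. The sketch below is one way to do that using only the standard library. The file names come from the export step above, but the ./data directory and the missing_exports helper are assumptions for illustration; check app/trainer/export for where the files are actually written.

```python
from pathlib import Path

# File names as listed in the export step; the directory they live in is an assumption.
EXPORT_FILES = (
    "X_train.npy",
    "X_test.npy",
    "y_train.npy",
    "y_test.npy",
    "scalar.pickle",
)


def missing_exports(data_dir: str = "./data") -> list:
    """Return the names of expected export files not found in `data_dir`."""
    base = Path(data_dir)
    return [name for name in EXPORT_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_exports()
    if missing:
        print("Run the export step first; missing:", ", ".join(missing))
    else:
        print("All export files found.")
```

Running a check like this before python -m app.trainer.train xgb fails fast instead of partway through loading a 13 GB array.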

Google Cloud Run Setup (one-time)

  • Google Cloud Dashboard
  • Edit gcp_setup.sh and build.yml
    • Replace the Google account number (644348144159) and project ID (strong-maker-345805) with your own.
    • Update the docker registry locations ghcr.io/eerichmond/ml-wildfire-prediction:latest and us-west1-docker.pkg.dev/strong-maker-345805/ml-wildfire/ml-wildfire:latest with your own.
  • Run sh ./gcp_setup.sh to create the ml-wildfire Google Cloud Run service and Google Artifact Registry

Deployment

Deployment Diagrams

On every git push, GitHub Actions build.yml will:

  • Install and test the Python app
  • Build and push the Docker image to GitHub Container Registry and Google Artifact Registry
  • Deploy the :latest Docker image to Google Cloud Run

Datasets

  • Newest version of the data (up to 2018) at US Forest Service
  • License CC0: Public Domain
