Planet Snowcover is a project that pairs airborne lidar and Planet Labs satellite imagery with cutting-edge computer vision techniques to identify snow-covered area at unprecedented spatial and temporal resolutions.
💡 This work was presented by Tony (@acannistra) at AGU 2019 in San Francisco. See the slides here.
Researchers: Tony Cannistra¹, Dr. David Shean², and Dr. Nicoleta Cristea²
²: Department of Civil and Environmental Engineering, University of Washington, Seattle, WA
This repository serves as the canonical source for the software and infrastructure necessary to successfully build and deploy a machine-learning-based snow classifier using Planet Labs imagery and airborne lidar data.
- Primary Components
- Requirements
- Infrastructure Deployment
- Tutorials
- Implementation Details
- AWS Cloud Resources
- Open Source Machine Learning
- Funding Sources
- Original Research Proposal
The contents of this repository are divided into several main components, which we detail here. Start here if you're looking for something in particular.
Folder | Description | Details |
---|---|---|
`./pipeline` | Jupyter notebooks detailing the entire data processing, machine learning, and evaluation pipeline. | These notebooks detail every step in this workflow, from start to finish. |
`./preprocess` | A set of Python CLI tools for preprocessing data assets. | These tools reproject and threshold the ASO raster files (see the sketch after this table), create vector footprints of raster data, tile the imagery for training, and handle other related tasks. |
`./model` | The implementation of the machine learning/computer vision techniques used by this project. | This work relies heavily on the robosat.pink repository, which we've forked and modified extensively. |
`./sagemaker` | The infrastructure required to use Amazon SageMaker to manage our ML training jobs. | SageMaker requires considerable configuration, including a Docker container. We build this container from this directory, which holds a copy of the `./model` directory. |
`./experiments` | Configuration files that describe experiments used to assess the performance of this ML-based snow cover method. | Our ML infrastructure uses config files to describe the inputs and other parameters needed to train the model effectively. We use these files to describe the experiments we perform with different sets of ASO data and imagery. |
`./implementation-notes` | Technical descriptions of the implementation considerations that went into this project. | These are working documents, in raw Markdown format. |
`./raster_utils` | Small utility functions for managing raster computations. | Not much to see here. |
`./environment` | Raw Python environment configuration files. | These are conda environment files and change often; use them sparingly. We preserve our environment via Docker, which should be used instead (see the `./sagemaker` directory). |
`./analysis` | Jupyter notebooks that describe analyses of our snow mask product. | |
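The thresholding step in `./preprocess` is a good example of what these tools do: it converts an ASO lidar snow-depth raster into the binary snow mask used as training labels. Below is a minimal sketch of that idea using rasterio and NumPy; the file names and the 0.10 m cutoff are illustrative assumptions, not the project's exact defaults.

```python
# Minimal sketch: threshold an ASO lidar snow-depth raster into a binary
# snow mask. File names and the 0.10 m cutoff are illustrative assumptions.
import numpy as np
import rasterio

with rasterio.open("ASO_snow_depth.tif") as src:
    depth = src.read(1)          # snow depth in meters, single band
    profile = src.profile
    nodata = src.nodata

mask = (depth > 0.10).astype(np.uint8)   # 1 = snow, 0 = bare ground
if nodata is not None:
    mask[depth == nodata] = 255          # carry nodata through as 255

profile.update(dtype="uint8", count=1, nodata=255)
with rasterio.open("ASO_snow_mask.tif", "w", **profile) as dst:
    dst.write(mask, 1)
```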
The goal of this work is to provide a toolkit that is relatively easy to deploy for someone with working knowledge of the following tools:
- Python 3
- Jupyter notebooks
- Basic command-line tools
More specific requirements can be found in the Infrastructure Deployment section below.
This free, open-source software depends on a number of other free, open-source packages. To understand the inner workings of this project, you'll need familiarity with the following (a short sketch after this list shows a few of them working together):
- PyTorch
- TensorFlow
- scikit-image
- boto3 / s3fs
- GeoPandas / Shapely
- Rasterio / rio-tiler
- mercantile / supermercado
- Amazon SageMaker
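To give a flavor of how a few of these fit together: mercantile maps between geographic coordinates and the XYZ slippy-map tiles we use to chip imagery, while Shapely and GeoPandas handle the resulting vector footprints. A hedged sketch follows; the coordinates and zoom level are arbitrary examples, not values this project prescribes.

```python
# Sketch: index a point into an XYZ slippy-map tile and build a vector
# footprint for it. Coordinates and zoom level are arbitrary examples.
import mercantile
import geopandas as gpd
from shapely.geometry import box

tile = mercantile.tile(-121.76, 46.85, 15)   # (lng, lat, zoom) -> Tile(x, y, z)
bounds = mercantile.bounds(tile)             # WGS84 bounding box of that tile

footprint = box(bounds.west, bounds.south, bounds.east, bounds.north)
gdf = gpd.GeoDataFrame({"tile": [str(tile)]}, geometry=[footprint], crs="EPSG:4326")
print(tile, gdf.total_bounds)
```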
To build and manage our infrastructure, we use Docker and Terraform.
This project relies on cloud infrastructure from Amazon Web Services (AWS). AWS isn't the only provider in this space, but it's the one we chose given our funding resources and familiarity. To run these tutorials and perform development tasks with this software, you'll need an AWS account. You can get one here.
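Once your account exists and your local credentials are configured (for example, via `aws configure`), a quick sanity check from Python looks like the sketch below; it assumes nothing beyond boto3's standard credential chain.

```python
# Quick sanity check that AWS credentials are configured correctly.
# Assumes boto3's standard credential chain (env vars, ~/.aws/credentials, ...).
import boto3

sts = boto3.client("sts")
identity = sts.get_caller_identity()
print("Authenticated as:", identity["Arn"])
```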
In order to access the imagery data from Planet Labs used to train our computer vision models and assess their performance, we rely on a relationship with collaborator Dr. David Shean in UW Civil and Environmental Engineering, who has access to Planet Labs data through a NASA Terrestrial Hydrology Program award.
If you're interested in getting access to Planet Labs imagery for research, check out the Planet Education and Research Program.
Finally, to gain access to the NASA/JPL Airborne Snow Observatory lidar-derived snow depth information, you need an account with NASA Earthdata. Sign up here.
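Once registered, a common access pattern is to store your Earthdata credentials in `~/.netrc` (for the machine `urs.earthdata.nasa.gov`) so that HTTP clients can authenticate through the Earthdata login redirect. Here's a hedged sketch using requests; the granule URL is a placeholder, not a real path.

```python
# Hedged sketch: download an ASO granule using Earthdata credentials stored
# in ~/.netrc (machine urs.earthdata.nasa.gov). The URL is a placeholder.
import requests

url = "https://example-earthdata-host/path/to/ASO_granule.tif"  # placeholder
with requests.Session() as session:
    resp = session.get(url, allow_redirects=True)  # netrc auth applied on redirect
    resp.raise_for_status()
    with open("ASO_granule.tif", "wb") as f:
        f.write(resp.content)
```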
To explore this work and the tutorials herein, you'll need to deploy some cloud infrastructure. This project uses Docker and Terraform to manage and deploy consistent, reliable cloud infrastructure.
For detailed instructions on this process, view the documentation.
To jump right to the guts of the deployment, here's our Dockerfile and Terraform Resource Definition.
Through support from Earth Science Information Partners, we're happy to be able to provide thorough interactive tutorials for these tools and methods in the form of Jupyter notebooks. You can see these tutorials in the data pipeline folder, `./pipeline`.
This work wouldn't be possible without the advice and support of Dr. Nicoleta Cristea, Dr. David Shean, Shashank Bhushan, and others.
We gratefully acknowledge financial support from the Earth Science Information Partners (ESIP) Lab, the NASA Terrestrial Hydrology Program, the Planet Labs Education and Research Program, and the National Science Foundation.
To see the original research proposal for this project, now out of date, view it here.