
jameslowman/clay_model

 
 


Clay Foundation Model


An open source AI model and interface for Earth.

Getting started

Quickstart

Launch into a JupyterLab environment on Binder, Planetary Computer, or SageMaker Studio Lab.

Installation

Basic

To help out with development, start by cloning this repository:

git clone <repo-url>

Then we recommend using mamba to install the dependencies. A virtual environment will also be created with Python and JupyterLab installed.

cd model
mamba env create --file environment.yml

Next, activate the virtual environment.

mamba activate claymodel

Finally, double-check that the libraries have been installed.

mamba list
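As a complement to mamba list, the check can also be done from inside Python. The following is a minimal sketch, not part of the repository: the helper name missing_packages is hypothetical, and the only package checked is jupyterlab, since that is the one this README says the environment provides. Add the import names of the packages listed in environment.yml as needed.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level import names in `names` that cannot be found.

    Only top-level names are supported here: find_spec raises
    ModuleNotFoundError for a dotted name whose parent package is absent.
    """
    return [name for name in names if find_spec(name) is None]


# Assumption: "jupyterlab" is the import name of the JupyterLab package.
# Extend the list with the import names from environment.yml.
print("missing:", missing_packages(["jupyterlab"]) or "none")
```

An empty result means every listed package is importable from the active environment. Run it after mamba activate claymodel so that the check uses the right interpreter.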

Advanced

This is for those who want full reproducibility of the virtual environment. Create a virtual environment with just Python and conda-lock installed first.

mamba create --name claymodel python=3.11 conda-lock=2.5.1
mamba activate claymodel

Generate a unified conda-lock.yml file based on the dependency specification in environment.yml. Use only when creating a new conda-lock.yml file or refreshing an existing one.

conda-lock lock --mamba --file environment.yml --platform linux-64 --with-cuda=12.0

Install or update a virtual environment from a lockfile. Use this to sync your dependencies to the exact versions in the conda-lock.yml file.

conda-lock install --mamba --name claymodel conda-lock.yml

See also https://conda.github.io/conda-lock/output/#unified-lockfile for more usage details.

Usage

Running JupyterLab

mamba activate claymodel
python -m ipykernel install --user --name claymodel  # register the environment as a Jupyter kernel
jupyter kernelspec list --json                       # check that the kernel is installed
jupyter lab &
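The kernel check above can also be scripted. This is a rough sketch rather than anything shipped with the repository: the function names installed_kernels and query_kernels are hypothetical, and the code assumes the JSON printed by jupyter kernelspec list --json has a top-level "kernelspecs" mapping whose entries carry a "resource_dir" field.

```python
import json
import subprocess


def installed_kernels(listing_json):
    """Map kernel names to resource directories from the JSON listing text."""
    specs = json.loads(listing_json).get("kernelspecs", {})
    return {name: info.get("resource_dir") for name, info in specs.items()}


def query_kernels():
    """Run the jupyter CLI and parse its JSON listing (requires jupyter on PATH)."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return installed_kernels(result.stdout)
```

With the environment active, "claymodel" in query_kernels() should be true once the ipykernel install step has succeeded.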

Running the model

The neural network model can be run via LightningCLI v2. To check out the different options available, and look at the hyperparameter configurations, run:

python trainer.py --help
python trainer.py test --print_config

To quickly test the model on one batch in the validation set:

python trainer.py validate --trainer.fast_dev_run=True

To train the model for a hundred epochs:

python trainer.py fit --trainer.max_epochs=100

To generate embeddings from the pretrained model's encoder on 1024 images (stored as a GeoParquet file with spatiotemporal metadata):

python trainer.py predict --ckpt_path=checkpoints/last.ckpt \
                          --data.batch_size=1024 \
                          --data.data_dir=s3://clay-tiles-02 \
                          --trainer.limit_predict_batches=1

More options can be found using python trainer.py fit --help, or in the LightningCLI docs.
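The commands in this section all follow the same pattern: a subcommand (fit, validate, test or predict) followed by dotted --section.option=value overrides. When driving runs from a notebook or script, that pattern can be assembled programmatically. The sketch below is an illustration only: trainer_command is a hypothetical helper, not part of the repository, and it assumes trainer.py is in the current working directory.

```python
import subprocess
import sys


def trainer_command(subcommand, overrides=None, script="trainer.py"):
    """Build the argument list for one trainer.py invocation.

    `overrides` maps dotted option names (e.g. "trainer.max_epochs")
    to values; each becomes a --name=value argument.
    """
    args = [sys.executable, script, subcommand]
    for key, value in (overrides or {}).items():
        args.append(f"--{key}={value}")
    return args


# Equivalent to: python trainer.py fit --trainer.max_epochs=100
cmd = trainer_command("fit", {"trainer.max_epochs": 100})
print(" ".join(cmd[1:]))

# Uncomment to launch the run from inside the repository:
# subprocess.run(cmd, check=True)
```

Using sys.executable keeps the run inside the interpreter of the active environment, which avoids picking up a different python from PATH.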

About

The Clay Foundation Model (in development)
