This repository contains the code for replicating the experiments from the paper "Sat-SINR: High-Resolution Species Distribution Models through Satellite Imagery". The work extends the SINR model with satellite imagery, building on the GeoLifeClef 2023 challenge. The code uses the PyTorch Lightning framework with Hydra.
To replicate the experiments, you need to set up the Python environment, download the additional data, and pre-process two files. The paths in the local.yaml config must point to the resulting files.
Clone the repository:
git clone https://github.com/ecovision-uzh/sat-sinr.git
Create a new Python environment and install the requirements (in Unix):
python3 -m venv .sat-sinr-venv
source .sat-sinr-venv/bin/activate
pip install -r requirements.txt
For the experiments, you need to download a series of additional files.
Source: GeoLifeClef 2023 challenge, extracted from the Ecodatacube platform
Volume: ~5 million RGB & NIR Sentinel-2 images of size 128x128
Source: WorldClim
Volume: 19 global bioclimatic rasters & 1 elevation raster at a resolution of 30 arcseconds (~1km)
Source: GeoLifeClef 2023 challenge, extracted from GBIF
Volume: ~5 million occurrences in a .csv
Source: GeoLifeClef 2023 challenge, collected from multiple sources
Volume: ~5,000 surveys in a .csv
To replicate the experiments from the work, you need to pre-process the environmental and PO data.
The Jupyter notebook crop and scale bioclim loads the 20 TIFF rasters, stacks them into a single NumPy array, normalizes the array, and crops it to the European bounds.
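The stack, normalize, and crop steps can be illustrated with a minimal sketch. This is not the notebook's code: the function name `stack_normalize_crop`, the per-layer standardization (zero mean, unit variance), the grid origin, and the bounding box in the usage example are all assumptions made for illustration, and the toy rasters stand in for the real TIFF files.

```python
import numpy as np

def stack_normalize_crop(rasters, bounds, origin=(90.0, -180.0), res_deg=1.0 / 120.0):
    """Stack 2D rasters, standardize each layer, and crop to a lat/lon box.

    rasters: list of 2D arrays on the same grid (row 0 is the northern edge).
    bounds:  (lat_min, lat_max, lon_min, lon_max) in degrees.
    origin:  (lat, lon) of the upper-left grid corner (assumed global grid).
    res_deg: cell size in degrees; 30 arcseconds equals 1/120 degree.
    """
    stack = np.stack(rasters, axis=0).astype(np.float32)  # (layers, rows, cols)

    # Assumed normalization: per-layer standardization, ignoring NaN cells.
    mean = np.nanmean(stack, axis=(1, 2), keepdims=True)
    std = np.nanstd(stack, axis=(1, 2), keepdims=True)
    stack = (stack - mean) / np.where(std > 0, std, 1.0)

    # Convert the geographic box into row/column indices.
    lat_min, lat_max, lon_min, lon_max = bounds
    top_lat, left_lon = origin
    row0 = int(round((top_lat - lat_max) / res_deg))
    row1 = int(round((top_lat - lat_min) / res_deg))
    col0 = int(round((lon_min - left_lon) / res_deg))
    col1 = int(round((lon_max - left_lon) / res_deg))
    return stack[:, row0:row1, col0:col1]

# Toy usage on a coarse 1-degree global grid with two random layers.
rng = np.random.default_rng(0)
toy_rasters = [rng.normal(5.0, 2.0, (180, 360)) for _ in range(2)]
placeholder_bounds = (34, 72, -11, 35)  # hypothetical box, not the notebook's values
cropped = stack_normalize_crop(toy_rasters, placeholder_bounds, res_deg=1.0)
print(cropped.shape)
```

The sketch normalizes before cropping, following the order in the description above; the statistics are therefore computed over the whole grid rather than only the cropped region.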
The Jupyter notebook reduce classes loads the 5 million PO occurrences, caps the number of samples per class at 1000, and removes all classes with fewer than 10 samples.
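The class reduction can be sketched in plain Python as follows. The function name `reduce_classes`, the record layout, and the use of random subsampling to reach the cap are assumptions; the notebook may select the retained samples differently.

```python
import random
from collections import Counter, defaultdict

def reduce_classes(records, max_per_class=1000, min_per_class=10, seed=0):
    """Cap samples per class and drop rare classes.

    records: list of (species_id, payload) tuples (hypothetical layout).
    """
    by_class = defaultdict(list)
    for rec in records:
        by_class[rec[0]].append(rec)

    rng = random.Random(seed)
    kept = []
    for species_id, recs in by_class.items():
        if len(recs) < min_per_class:
            continue  # drop classes with fewer than min_per_class samples
        if len(recs) > max_per_class:
            # Assumed strategy: uniform random subsample down to the cap.
            recs = rng.sample(recs, max_per_class)
        kept.extend(recs)
    return kept

# Toy usage: one oversized class, one mid-sized class, one rare class.
toy_records = (
    [("a", i) for i in range(1500)]
    + [("b", i) for i in range(50)]
    + [("c", i) for i in range(5)]
)
reduced = reduce_classes(toy_records)
counts = Counter(species_id for species_id, _ in reduced)
print(dict(counts))
```

Because the cap (1000) is larger than the minimum (10), the order in which the two filters are applied does not change the result.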
Ensure that the paths in the config local.yaml point to the proper files.
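A quick way to catch a wrong path before training is to walk the loaded config and report entries that do not exist on disk. The sketch below is a hypothetical helper, not part of the repository: it assumes the config has already been loaded into a nested dict (for example with `yaml.safe_load`) and that every string value is a file or directory path. The keys in the usage example are invented for illustration.

```python
import os

def find_missing_paths(config, prefix=""):
    """Return (key, path) pairs for string values that do not exist on disk."""
    missing = []
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested config groups.
            missing.extend(find_missing_paths(value, prefix=name + "."))
        elif isinstance(value, str) and not os.path.exists(value):
            missing.append((name, value))
    return missing

# Toy usage with hypothetical keys; the real local.yaml keys may differ.
example_config = {
    "work_dir": os.getcwd(),  # exists
    "data": {"po_csv": "/nonexistent/sat_sinr_example.csv"},  # does not exist
}
missing_paths = find_missing_paths(example_config)
for key, path in missing_paths:
    print(f"{key}: {path} not found")
```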
Train late-fusion Sat-SINR with location, bioclimatic data, and Sentinel-2 images as predictors:
python3 main.py model=sat_sinr_lf dataset.predictors=loc_env_sent2
Train SINR with location as predictor:
python3 main.py model=sinr dataset.predictors=loc
The model options are: "sat_sinr_ef", "sat_sinr_mf", "sat_sinr_lf", "sinr" and "log_reg".
The dataset.predictors options are any combination of "loc", "env" and "sent2".
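The possible predictor strings can be enumerated with a short sketch. The naming scheme here (names joined by underscores in the order loc, env, sent2) is inferred from the two example commands, `loc` and `loc_env_sent2`; whether the code accepts other orderings, or every combination for every model, is not confirmed by this README.

```python
from itertools import combinations

PREDICTORS = ("loc", "env", "sent2")

def predictor_options():
    """List every non-empty combination, joined with underscores (assumed format)."""
    options = []
    for k in range(1, len(PREDICTORS) + 1):
        for combo in combinations(PREDICTORS, k):
            options.append("_".join(combo))
    return options

options = predictor_options()
print(options)
```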
The base_config.yaml contains the model and training parameters.
The dataset.yaml config contains dataset parameters.
The local.yaml config contains the paths pointing to the various files.
The paper was presented at the ISPRS TCII Symposium in June 2024:
Dollinger, J., Brun, P., Sainte Fare Garnot, V., and Wegner, J. D.: Sat-SINR: High-Resolution Species Distribution Models Through Satellite Imagery, ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., X-2-2024, 41–48, https://doi.org/10.5194/isprs-annals-X-2-2024-41-2024, 2024
@Article{SatSINR_ISPRS2024,
AUTHOR = {Dollinger, J. and Brun, P. and Garnot, V. S. F. and Wegner, J. D.},
TITLE = {Sat-SINR: High-Resolution Species Distribution Models through Satellite Imagery},
JOURNAL = {ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences},
VOLUME = {X-2-2024},
YEAR = {2024},
PAGES = {41-48},
URL = {https://doi.org/10.5194/isprs-annals-X-2-2024-41-2024},
DOI = {10.5194/isprs-annals-X-2-2024-41-2024}
}