CH-Earth/gistool

Description

This repository contains scripts that process geospatial datasets and compute efficient zonal statistics for given ESRI Shapefiles. The general usage of the main script, ./extract-gis.sh, is as follows:

Usage:
  extract-gis [options...]

Script options:
  -d, --dataset                    Geospatial dataset of interest
  -i, --dataset-dir=DIR            The source path of the dataset file(s)
  -r, --crs=INT                    The EPSG code of interest; optional
                                   [defaults to 4326]
  -v, --variable=var1[,var2[...]]  If applicable, variables to process
  -o, --output-dir=DIR             Writes processed files to DIR
  -s, --start-date=DATE            If applicable, start date of the geospatial
                                   data; optional
  -e, --end-date=DATE              If applicable, end date of the geospatial
                                   data; optional
  -l, --lat-lims=REAL,REAL         Latitude's upper and lower bounds; optional
  -n, --lon-lims=REAL,REAL         Longitude's upper and lower bounds; optional
  -f, --shape-file=PATH            Path to the ESRI '.shp' file; optional
  -F, --fid=STR                    Column name representing elements of the
                                   Polygons to report statistics; optional
                                   [defaults to the first column]
  -j, --submit-job                 Submit the data extraction process as a job
                                   to HPC's scheduler; optional
  -t, --print-geotiff=BOOL         Extract the subsetted GeoTIFF file; optional
                                   [defaults to 'true']
  -a, --stat=stat1[,stat2[...]]    If applicable, extract the statistics of
                                   interest, currently available options are:
                                   'min';'max';'mean';'majority';'minority';
                                   'median';'quantile';'variety';'variance';
                                   'stdev';'coefficient_of_variation';'frac';
                                   'coords';'count';'sum'; optional
  -U, --include-na                 Include NA values in generated
                                   statistics; optional
  -q, --quantile=q1[,q2[...]]      Quantiles of interest to be produced if 'quantile'
                                   is included in the '--stat' argument. The values
                                   must be comma delimited float numbers between
                                   0 and 1; optional [defaults to every 5th quantile]
  -p, --prefix=STR                 Prefix prepended to the output files
  -b, --parsable                   Parsable SLURM message mainly used
                                   for chained job submissions
  -c, --cache=DIR                  Path of the cache directory; optional
  -E, --email=STR                  E-mail when job starts, ends, and fails; optional
  -C, --cluster=JSON               JSON file detailing cluster-specific details
  -V, --version                    Show version
  -h, --help                       Show this screen and exit

For bug reports, questions, and discussions open an issue
at https://github.com/CH-Earth/gistool/issues
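
The --submit-job and --parsable options are described above as supporting chained job submissions. A minimal sketch of how such chaining might look follows. It assumes, without confirmation from this README, that --parsable prints the job ID in the same form as `sbatch --parsable` does (`<jobid>` or `<jobid>;<cluster>`). The helper name extract_job_id and the follow-up script next_step.sh are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only. ASSUMPTION: with --parsable, the script prints "<jobid>" or
# "<jobid>;<cluster>", mirroring `sbatch --parsable`. Verify this against the
# real output before relying on it.

# Print the numeric job ID contained in a parsable submission message.
extract_job_id() {
  local raw="$1"
  raw="${raw%%;*}"            # drop an optional ";cluster" suffix
  raw="${raw//[[:space:]]/}"  # remove stray whitespace and newlines
  if [[ "$raw" =~ ^[0-9]+$ ]]; then
    printf '%s\n' "$raw"
  else
    echo "unexpected job id: '$1'" >&2
    return 1
  fi
}

# Intended use (commented out so this file can be sourced safely):
#   msg="$(./extract-gis.sh --dataset="merit-hydro" ... --submit-job --parsable)"
#   job_id="$(extract_job_id "$msg")" || exit 1
#   # next_step.sh is a hypothetical follow-up job that needs the extracted data
#   sbatch --dependency=afterok:"$job_id" next_step.sh
```

Validating the ID before passing it to the scheduler keeps a malformed message from silently producing a job with no dependency.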

Available Datasets

| # | Dataset | Time Scale | CRS | DOI | Description |
|---|---------|------------|-----|-----|-------------|
| 1 | MODIS | 2000 - 2021 | | 10.5067/MODIS/MCD12Q1.006 | link |
| 2 | MERIT Hydro | Not Applicable (N/A) | 4326 | 10.1029/2019WR024873 | link |
| 3 | Soil Grids (v1) | Not Applicable (N/A) | 4326 | 10.1371/journal.pone.0169748 | link |
| 4 | Landsat NALCMS | 2010 and 2015 | 4326 | 10.3390/rs9111098 | link |
| 5 | Global Depth to Bedrock | Not Applicable (N/A) | | 10.1002/2016MS000686 | link |
| 6 | USDA Soil Class | Not Applicable (N/A) | 4326 | 10.4211/hs.1361509511e44adfba814f6950c6e742 | link |
| 7 | Global Soil Dataset for Earth System Modelling (GSDE) | Not Applicable (N/A) | 4326 | 10.1002/2013MS000293 | link |

General Example

As an example, follow the code block below. Please remember that you MUST have access to the Graham cluster of the Digital Research Alliance of Canada. Also, remember to generate a GitHub Personal Access Token in advance. Enter the following commands in your Graham shell as a test case:

foo@bar:~$ git clone https://github.com/kasra-keshavarz/gistool.git # clone the repository
foo@bar:~$ cd ./gistool/ # always move to the repository's directory
foo@bar:~$ ./extract-gis.sh -h # view the usage message
foo@bar:~$ ./extract-gis.sh \
  --dataset="merit-hydro" \
  --dataset-dir="/project/rpp-kshook/CompHydCore/merit_hydro/raw_data/" \
  --output-dir="$HOME/scratch/merit-hydro-test" \
  --lat-lims="45,47" \
  --lon-lims="-120,-117" \
  --print-geotiff=true \
  --variable="elv,hnd" \
  --prefix="merit_test_";

See the example directory for real-world scripts for each geospatial dataset included in this repository.
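
The general example subsets the data by a latitude/longitude box. For zonal statistics over polygons, the --shape-file, --fid, --stat, and --quantile options from the usage message would presumably be combined. The sketch below only assembles and prints such a command (a dry run) after checking that the quantiles are comma-delimited numbers between 0 and 1, as the usage message requires. The shapefile path, the column name COMID, the output directory, the prefix, and the helper name valid_quantiles are hypothetical placeholders.

```bash
#!/usr/bin/env bash
# Sketch only: builds a zonal-statistics call from flags listed in the usage
# message. Shapefile path, "COMID", output directory, and prefix are
# hypothetical and must be replaced with real values.

# Return 0 only if every comma-separated value is a number in [0, 1].
valid_quantiles() {
  local q
  local -a qs
  local re='^(0(\.[0-9]+)?|1(\.0+)?|\.[0-9]+)$'
  IFS=',' read -r -a qs <<< "$1"
  (( ${#qs[@]} > 0 )) || return 1
  for q in "${qs[@]}"; do
    [[ "$q" =~ $re ]] || return 1
  done
}

quantiles="0.25,0.5,0.75"

args=(
  --dataset="merit-hydro"
  --dataset-dir="/project/rpp-kshook/CompHydCore/merit_hydro/raw_data/"
  --output-dir="$HOME/scratch/merit-hydro-zonal"
  --shape-file="$HOME/shapefiles/basins.shp"
  --fid="COMID"
  --variable="elv"
  --stat="mean,quantile"
  --quantile="$quantiles"
  --prefix="merit_zonal_"
)

if valid_quantiles "$quantiles"; then
  # Dry run: print the command instead of executing it.
  printf './extract-gis.sh'
  printf ' %q' "${args[@]}"
  printf '\n'
else
  echo "invalid --quantile list: $quantiles" >&2
fi
```

To run the extraction for real, replace the three printf lines with `./extract-gis.sh "${args[@]}"`.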

Logs

Dataset logs are written to the $HOME/.gistool directory only when jobs are submitted to a cluster's scheduler. If processing is not submitted as a job, the logs are printed to the screen.
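
After a job has been submitted, it can be convenient to jump straight to the most recent log. The helper below is a small sketch for that. It assumes only that logs are regular files somewhere under $HOME/.gistool, since the log file naming scheme is not documented here. The function name latest_log is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only. ASSUMPTION: logs are regular files under the given directory
# (default: $HOME/.gistool); no particular file naming scheme is assumed.

# Print the path of the most recently modified file under a directory.
latest_log() {
  local dir="${1:-$HOME/.gistool}"
  local f newest=""
  if [[ ! -d "$dir" ]]; then
    echo "no log directory: $dir" >&2
    return 1
  fi
  # Walk all regular files and keep the one with the newest modification time.
  while IFS= read -r -d '' f; do
    if [[ -z "$newest" || "$f" -nt "$newest" ]]; then
      newest="$f"
    fi
  done < <(find "$dir" -type f -print0)
  if [[ -z "$newest" ]]; then
    echo "no log files in: $dir" >&2
    return 1
  fi
  printf '%s\n' "$newest"
}

# Intended use (commented out so this file can be sourced safely):
#   tail -n 50 "$(latest_log)"
```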

--lib-path options

Currently, on Graham HPC, the following options are available:

/work/comphyd_lab/envs/r-env/ # UCalgary's ARC cluster

New Datasets

If you would like a new dataset added to the data repository, along with the associated scripts here, open a new ticket on the Issues tab of this repository. Alternatively, open a Pull Request on this repository with your own script.

Support

If you encounter any problem, please open a new ticket on the Issues tab of this repository.

License

Geospatial Dataset Processing Workflow
Copyright (C) 2022-2023, University of Saskatchewan
Copyright (C) 2023-2025, University of Calgary
Copyright (C) 2022-2025, gistool developers

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
