Iss52 dev #53

Merged 36 commits on Mar 20, 2025
Commits
ceca581
Moving the tool to a HPC-agnostic interface
kasra-keshavarz Feb 12, 2025
7ee2b80
ARC-specific modification as a proof of concept
kasra-keshavarz Feb 12, 2025
7edbeba
Bumping development version
kasra-keshavarz Feb 12, 2025
c0b3a15
ARC-specific modifications as a proof of concept
kasra-keshavarz Feb 12, 2025
5003f2b
Changing the URL of gistool
kasra-keshavarz Feb 12, 2025
4cd60fe
Changing URL
kasra-keshavarz Feb 12, 2025
1627ac8
Bumping version
kasra-keshavarz Feb 27, 2025
c75b582
Adding 7z reqs
kasra-keshavarz Feb 27, 2025
4de0cca
Including purrr in the renv
kasra-keshavarz Feb 27, 2025
f4b6fc7
Restructuring and preparation for HPC-independent workflow
kasra-keshavarz Feb 28, 2025
ec860f9
Modifications to template cluster-specific JSON files
kasra-keshavarz Feb 28, 2025
624b493
Updating template JSON files
kasra-keshavarz Feb 28, 2025
e5b4c82
Contributing
kasra-keshavarz Mar 3, 2025
2533f25
Updating README with new Usage
kasra-keshavarz Mar 3, 2025
ec89fb7
Backing up
kasra-keshavarz Mar 4, 2025
b254ba4
Backing up
kasra-keshavarz Mar 4, 2025
2b7f86d
Debugging landsat subsetting feature
kasra-keshavarz Mar 4, 2025
7cfcfaf
Assuring crs is provided
kasra-keshavarz Mar 5, 2025
12e59f7
Testing recipes
kasra-keshavarz Mar 5, 2025
5749e9a
Conforting to the new version
kasra-keshavarz Mar 5, 2025
caa5080
Copy of datatools docs for start
kasra-keshavarz Mar 5, 2025
03090fd
Updating readme
kasra-keshavarz Mar 5, 2025
80ba6fe
Replacing TABs with whitespace characters
kasra-keshavarz Mar 6, 2025
06b2a31
Assuring the cluster-specific files are up to date
kasra-keshavarz Mar 6, 2025
297ee5c
Updating graham's addresses
kasra-keshavarz Mar 6, 2025
60dd1f1
initial environment setup script
kasra-keshavarz Mar 6, 2025
191796f
Adding default values
kasra-keshavarz Mar 6, 2025
ef007e3
Adding help messages and options
kasra-keshavarz Mar 6, 2025
da1c078
Loading modules for HPCs
kasra-keshavarz Mar 7, 2025
17cd16a
Changing TABs with whitespaces
kasra-keshavarz Mar 7, 2025
c807d35
Updating gitignore not to include doc builds
kasra-keshavarz Mar 7, 2025
8600e34
Progress on docs
kasra-keshavarz Mar 7, 2025
06adbd3
Adding dependencies
kasra-keshavarz Mar 7, 2025
6b5058b
Turning the list of deps into a table
kasra-keshavarz Mar 7, 2025
78e2c22
Updating docs
kasra-keshavarz Mar 7, 2025
8767807
Update soil_class.sh
kasra-keshavarz Mar 13, 2025
6 changes: 6 additions & 0 deletions .gitignore
@@ -3,3 +3,9 @@
.ipynb_checkpoints
.DS_Store
*.swp

# test folders
ignore-tests/

# docs stuff
docs/build/
31 changes: 31 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,31 @@
# Read the Docs configuration file for Sphinx projects
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.12"

# Build documentation in the "docs/" directory with Sphinx
sphinx:
  configuration: docs/conf.py
  # You can configure Sphinx to use a different builder, for instance use the dirhtml builder for simpler URLs
  # builder: "dirhtml"
  # Fail on all warnings to avoid broken references
  # fail_on_warning: true

# Optionally build your docs in additional formats such as PDF and ePub
formats:
  - pdf
  - epub

# Optional but recommended, declare the Python requirements required
# to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - requirements: docs/requirements.txt
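
For checking the documentation before pushing, the configuration above can be approximated locally. The following is a minimal sketch, assuming a working Python environment and that it is run from the repository root; the function name `build_docs` and the output directory `docs/build/html` are illustrative choices, not part of this PR, and the hosted Read the Docs build may differ in its details.

```shell
#!/usr/bin/env bash
# Sketch only: build the HTML docs locally, roughly mirroring .readthedocs.yaml.
# build_docs is a hypothetical helper, not something this PR adds.
build_docs() {
    local src="${1:-docs}"              # directory that holds conf.py
    local out="${2:-docs/build/html}"   # lives under docs/build/, which .gitignore excludes
    # Install the documentation requirements, then run the Sphinx HTML builder
    python -m pip install -r "$src/requirements.txt" &&
        python -m sphinx -b html "$src" "$out"
}
# Usage, from the repository root:
#   build_docs
```

Writing the output under `docs/build/` keeps generated files out of version control, since the `.gitignore` change in this PR excludes that directory.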
121 changes: 4 additions & 117 deletions README.md
@@ -1,124 +1,11 @@
# Description
This repository contains scripts to process necessary geospatial datasets
and implement efficient zonal statistics on given ESRI Shapefiles. The
general usage of the script (i.e., `./extract-gis.sh`) is as follows:

```console
Usage:
  extract-gis [options...]

Script options:
  -d, --dataset                     Geospatial dataset of interest
  -i, --dataset-dir=DIR             The source path of the dataset file(s)
  -r, --crs=INT                     The EPSG code of interest; optional
                                    [defaults to 4326]
  -v, --variable=var1[,var2[...]]   If applicable, variables to process
  -o, --output-dir=DIR              Writes processed files to DIR
  -s, --start-date=DATE             If applicable, start date of the geospatial
                                    data; optional
  -e, --end-date=DATE               If applicable, end date of the geospatial
                                    data; optional
  -l, --lat-lims=REAL,REAL          Latitude's upper and lower bounds; optional
  -n, --lon-lims=REAL,REAL          Longitude's upper and lower bounds; optional
  -f, --shape-file=PATH             Path to the ESRI '.shp' file; optional
  -F, --fid=STR                     Column name representing elements of the
                                    ESRI Shapefile to report statistics; optional
                                    defaults to the first column
  -j, --submit-job                  Submit the data extraction process as a job
                                    on the SLURM system; optional
  -t, --print-geotiff=BOOL          Extract the subsetted GeoTIFF file; optional
                                    [defaults to 'true']
  -a, --stat=stat1[,stat2[...]]     If applicable, extract the statistics of
                                    interest, currently available options are:
                                    'min';'max';'mean';'majority';'minority';
                                    'median';'quantile';'variety';'variance';
                                    'stdev';'coefficient_of_variation';'frac';
                                    'coords'; 'count'; 'sum'; optional
  -U, --include-na                  Include NA values in generated statistics;
                                    optional
  -q, --quantile=q1[,q2[...]]       Quantiles of interest to be produced if 'quantile'
                                    is included in the '--stat' argument. The values
                                    must be comma delimited float numbers between
                                    0 and 1; optional [defaults to every 5th quantile]
  -p, --prefix=STR                  Prefix prepended to the output files
  -b, --parsable                    Parsable SLURM message mainly used
                                    for chained job submissions
  -c, --cache=DIR                   Path of the cache directory; optional
  -E, --email=STR                   E-mail when job starts, ends, and
                                    fails; optional
  -u, --account                     Digital Research Alliance of Canada's sponsor's
                                    account name; optional, defaults to 'rpp-kshook'
  -L, --lib-path                    Path to the shared libraries needed; optional,
                                    see the source code for the default path
  -V, --version                     Show version
  -h, --help                        Show this screen and exit

For bug reports, questions, and discussions open an issue
at https://github.com/kasra-keshavarz/gistool/issues
```
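
The usage text requires `--quantile` values to be comma-delimited floats between 0 and 1. A caller could check a value list before passing it on, as in the sketch below. The helper `validate_quantiles` is hypothetical and is not part of gistool; it also assumes the endpoints 0 and 1 are acceptable, which the usage text does not state explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gistool): succeed only if the argument is a
# comma-delimited list of plain decimals in the closed interval [0, 1].
validate_quantiles() {
    local list="$1" q
    local -a items
    # Plain decimals only, e.g. 0, 0.25, .5, 1, 1.0 (no signs, no exponents)
    local re='^(0?\.[0-9]+|0(\.[0-9]*)?|1(\.0*)?)$'
    [ -n "$list" ] || return 1
    # Reject leading, trailing, or doubled commas, which imply empty entries
    case "$list" in
        ,*|*,|*,,*) return 1 ;;
    esac
    IFS=',' read -r -a items <<< "$list"
    for q in "${items[@]}"; do
        [[ $q =~ $re ]] || return 1
    done
    return 0
}
# Usage:
#   validate_quantiles "0.1,0.5,0.9" && echo "quantile list looks valid"
```

The check is purely textual, so it avoids floating-point comparison in the shell; values written in scientific notation would be rejected even if they are numerically valid.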


# Available Datasets
|**#**|Dataset |Time Scale |CRS |DOI |Description |
|-----|--------------------------------------------|----------------------|-----|-------------------------------|---------------------|
|**1**|MODIS |2000 - 2021 | |10.5067/MODIS/MCD12Q1.006 |[link](modis) |
|**2**|MERIT Hydro |Not Applicable (N/A) |4326 |10.1029/2019WR024873 |[link](merit_hydro) |
|**3**|Soil Grids (v1) |Not Applicable (N/A) |4326 |10.1371/journal.pone.0169748 |[link](soil_grids) |
|**4**|Landsat NALCMS |2010 and 2015 |4326 |10.3390/rs9111098 |[link](landsat) |
|**5**|Global Depth to Bedrock |Not Applicable (N/A) | |10.1002/2016MS000686 |[link](depth_to_bedrock) |
|**6**|USDA Soil Class |Not Applicable (N/A) |4326 |10.4211/hs.1361509511e44adfba814f6950c6e742|[link](soil_class)|
|**7**|Global Soil Dataset for Earth System Modelling (GSDE)|Not Applicable (N/A)|4326 |10.1002/2013MS000293 |[link](GSDE) |

# General Example
As an example, follow the code block below. Please remember that you MUST have access to Graham cluster with Digital Alliance of Canada. Also, remember to generate a [Personal Access Token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) with GitHub in advance. Enter the following codes in your Graham shell as a test case:

```console
foo@bar:~$ git clone https://github.com/kasra-keshavarz/gistool.git # clone the repository
foo@bar:~$ cd ./gistool/ # always move to the repository's directory
foo@bar:~$ ./extract-gis.sh -h # view the usage message
foo@bar:~$ ./extract-gis.sh \
--dataset="merit-hydro" \
--dataset-dir="/project/rpp-kshook/CompHydCore/merit_hydro/raw_data/" \
--output-dir="$HOME/scratch/merit-hydro-test" \
--lat-lims="45,47" \
--lon-lims="-120,-117" \
--print-geotiff=true \
--variable="elv,hnd" \
--prefix="merit_test_";
```
See the [example](./example) directory for real-world scripts for each geospatial dataset included in this repository.
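
The example above only subsets the rasters. A zonal-statistics call would combine `--shape-file`, `--stat`, and `--quantile` from the usage text. The sketch below assembles such a command as a dry run and prints it instead of executing it; the shapefile path, output directory, and prefix are placeholders chosen for illustration, and the combination of options has not been verified against the script.

```shell
#!/usr/bin/env bash
# Dry run: assemble an extract-gis.sh call for zonal statistics and print it.
# The shapefile path, output directory, and prefix below are placeholders.
cmd=(
    ./extract-gis.sh
    --dataset="merit-hydro"
    --dataset-dir="/project/rpp-kshook/CompHydCore/merit_hydro/raw_data/"
    --variable="elv"
    --shape-file="$HOME/shapefiles/basins.shp"
    --stat="mean,quantile"
    --quantile="0.1,0.5,0.9"
    --print-geotiff=false
    --output-dir="$HOME/scratch/merit-hydro-zonal"
    --prefix="merit_zonal_"
)
# Print the command, shell-quoted, without running it
printf '%q ' "${cmd[@]}"
printf '\n'
# To run it for real, replace the two printf lines with: "${cmd[@]}"
```

Holding the arguments in an array keeps each option as a single word, so paths containing spaces survive intact when the command is eventually executed.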


# Logs
The datasets logs are generated under the `$HOME/.gistool` directory,
only in cases where jobs are submitted to clusters' schedulers. If
processing is not submitted as a job, then the logs are printed on screen.


# `--lib-path` options
Currently, on Graham HPC, the following options are available:
```console
/project/rpp-kshook/Climate_Forcing_Data/assets/r-envs/ # default, rpp-kshook allocation
/project/rrg-mclark/lib # rrg-mclark allocation
```


# New Datasets
If you are considering any new dataset to be added to the data repository,
and subsequently the associated scripts added here, you can open a new
ticket on the **Issues** tab of the current repository. Or, you can make
a [Pull Request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request)
on this repository with your own script.


# Support
Please open a new ticket on the **Issues** tab of the current repository in case of any problem.

# Documentation
The relevant documentation is located on [Readthedocs](https://gistool.readthedocs.io/en/latest/) website.

# License
Geospatial Dataset Processing Workflow<br>
Geospatial Data Processing Workflow - gistool <br>
Copyright (C) 2022-2023, University of Saskatchewan<br>
Copyright (C) 2023-2024, University of Calgary<br>
Copyright (C) 2022-2024, datatool developers

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
0.1.7
0.3.0-dev
4 changes: 0 additions & 4 deletions assets/README.md

This file was deleted.
