Open-source big data tools to handle various cryospheric remote sensing datasets.
... a method of storing data within a system or repository, in its natural format, that facilitates the collocation of data in various schemata and structural forms, usually object blobs or files... ~Wikipedia
Find the underlying data used in this project here (or at least links to the sources, since the files might be too big to host).
Examine the code here, which mingles with the data to give some (hopefully) nice, scientifically meaningful outputs (whatever that means). You may find some interesting Dockerfiles and Python 3 code inside (if that clicks with you).
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
You have some form of git installed for version control. Ideally, docker should be installed too to fully replicate this scientific development environment, unless you do not have root/admin privileges. For conda users, you may skip the docker install, but take note of the section below on setting up a conda environment.
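A quick way to see which of these tools are already on your machine is sketched below. The `check_tool` helper is a hypothetical name made up for this example, not something shipped with the repository.

```shell
# Print whether a command is available on PATH.
# check_tool is a hypothetical helper for illustration only.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# git is needed for cloning; docker and conda are the two alternative
# ways of getting the development environment described above.
for tool in git docker conda; do
    check_tool "$tool"
done
```

Anything reported as missing can be installed with the platform-specific commands that follow.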
For Debian/Ubuntu-based systems, you can try something like:
sudo apt install git docker-ce
Note: You may need to set up Docker's package repository first before you can install docker-ce. See instructions for Debian and Ubuntu.
For Windows, if you have chocolatey (recommended!), it can be as easy as:
choco install git docker
For Mac OS X:
TODO??
With git installed, fire up your command prompt and do a git clone from this repo-url:
git clone <repo-url>
Alternatively, download the zip file from here, and unzip it.
The standard clone command above will skip over some submodules, such as external tutorials I have cloned into the tuts folder.
To get absolutely everything (beware beware!), you can do:
git clone --recursive <repo-url>
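If you already did a standard clone and only later decide you want the submodules, you should not need to clone again. A minimal sketch, assuming a reasonably recent git; `fetch_submodules` is a hypothetical wrapper name used here for illustration, and the only real work is done by `git submodule update --init --recursive`:

```shell
# Fetch all submodules (including nested ones) for an existing clone.
# fetch_submodules is a hypothetical helper; pass the path to your clone,
# or nothing to use the current directory.
fetch_submodules() {
    repo_dir="${1:-.}"
    # Only proceed when the directory is inside a git work tree.
    if git -C "$repo_dir" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
        git -C "$repo_dir" submodule update --init --recursive
    else
        echo "not a git work tree: $repo_dir" >&2
        return 1
    fi
}
```

Call it as `fetch_submodules path/to/your/clone`, or simply run the inner `git submodule update --init --recursive` command from inside the repository.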
You can replicate most of the libraries used in this repository by running:
conda env create --file=environment.yml
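Once the environment is created it still has to be activated, and its name comes from the environment file rather than from this README. The sketch below assumes environment.yml has a top-level `name:` field (conda environment files usually do, but check yours); `conda_env_name` is a hypothetical helper and does not handle quoted names or trailing comments.

```shell
# Print the value of the top-level "name:" field of a conda environment file.
# conda_env_name is a hypothetical helper for illustration only.
conda_env_name() {
    sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# Show the activation command if the file is present in the current directory.
if [ -f environment.yml ]; then
    echo "Activate with: conda activate $(conda_env_name environment.yml)"
fi
```

Run the printed `conda activate` command in your own shell before starting Jupyter, so that the notebooks pick up the installed libraries.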
To try out the code (that downloads big data files, processes the data, etc.), you can use a JupyterLab or Jupyter Notebook environment. Do so by running either one of the below:
jupyter lab
jupyter notebook
Alternatively, you can use the atom-hydrogen-beta docker container here to ensure ease of reproducibility (aka mitigate dependency hell problems). Yes, I like to do my code writing and execution inside that 'atom' docker container with interactive Hydrogen functionality!!
But of course, you can install the libraries yourself.
Feel free to submit a pull request or issue (nice ways of saying hi!) if you'd like to see something in here that's not here yet.
Any raw data (e.g. binary satellite files) used here is licensed as per the upstream source. Derived datasets are licensed under the Open Data Commons Attribution license unless otherwise stated.
Source code used in the handling of the data is licensed under the GNU Lesser General Public License v3.0.
Other forms of content (such as documentation) in this project repository that are not covered by the above two licenses are licensed under the Creative Commons Attribution Share Alike 4.0 License. Linked submodules (e.g. in the tuts folder) are subject to their respective upstream licenses.