This repository contains example code for working with the Environmental Science Data Corpus (ESDC) in the Training Data Markup Language for Artificial Intelligence (TrainingDML-AI) format. The ESDC is publicly available so that researchers in environmental science and machine learning can discover and reuse it.
The source corpus is available in TrainingDML-AI format at https://doi.org/10.57760/sciencedb.10436. TrainingDML-AI, proposed by Yue et al. (2022), is a format aligned with OGC standards and optimized for the exchange and retrieval of geospatial machine learning training data.
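As a first look at a downloaded file, the sketch below loads a JSON document and reports its top-level structure. It assumes the corpus is distributed in the JSON encoding of TrainingDML-AI; the function name `summarize_dataset` and the file name `esdc.json` are hypothetical and not part of this repository. The code does not rely on any particular TrainingDML-AI field names, so it only shows which keys are present and how large they are.

```python
import json
from pathlib import Path
from typing import Any, Dict


def summarize_dataset(path: str) -> Dict[str, Any]:
    """Load a JSON file and describe its top-level keys.

    Hypothetical helper: assumes the file is a single JSON object,
    as in the JSON encoding of TrainingDML-AI.
    """
    with Path(path).open(encoding="utf-8") as f:
        doc = json.load(f)

    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object at the top level")

    summary: Dict[str, Any] = {}
    for key, value in doc.items():
        if isinstance(value, list):
            # Lists are reported by length rather than printed in full.
            summary[key] = f"list of {len(value)} items"
        elif isinstance(value, dict):
            summary[key] = f"object with {len(value)} keys"
        else:
            # Scalars (strings, numbers, booleans, null) are kept as-is.
            summary[key] = value
    return summary
```

Calling `summarize_dataset("esdc.json")` on a local copy of the corpus (the path is an assumption) returns a dictionary that can be printed to check the file before running the other scripts.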
To use these scripts, clone the repository and install the required libraries as listed in requirements.txt.
- Python 3.8