Maintain and Enforce Xarray Standards #817

Open
evrose54 opened this issue Nov 5, 2024 · 0 comments
Labels
documentation (Improvements or additions to documentation), enhancement (New feature or request)

evrose54 commented Nov 5, 2024

Requested Update

Description

We currently provide documentation for standardizing what is included in xarray datasets. This includes standards for coordinates and metadata. While this documentation does a good job describing what a developer and/or user should implement, it's clear that in practice it's not always followed.

We should update this documentation to reflect the exact information we want included in our xarray datasets. This way, we can review all of our readers and update them to adhere to this protocol.

We should address the following concerns and update all readers accordingly:

  • source_file_start_datetimes (this should correspond to the start_datetime that is in the individual metadata for that file)
  • source_file_end_datetimes (this should correspond to the end_datetime that is in the individual metadata for that file)
  • source_file_filename_datetimes (optional; this is specifically the datetime in the filename itself, which may or may not match the start datetime. The datetime stored in a filename is sometimes wrong, so this serves only as a sanity check and is not always required)
  • source_file_attributes (no change needed; this is already handled in PR #427, "312 bug fix update geostationary readers to support multiple scan times")
  • source_file_names (no change needed; this is already handled in PR #427, "312 bug fix update geostationary readers to support multiple scan times")
  • Should we add {x,y}_{add_offset,scale_factor} to required metadata for our datasets? OCTOPY would benefit from this.
  • Enforce a consistent orientation for data coming out of the readers. I recommend that data.{y,lat}[0] is northernmost, data.{y,lat}[-1] is southernmost, data.{x,lon}[0] is westernmost, and data.{x,lon}[-1] is easternmost.
  • Enforce consistent ordering and naming of our coordinates / dimensions. Mindy suggested this order: "date or time", "height or depth", "latitude", then "longitude", which matches the T, Z, Y, X order recommended by the NetCDF Climate and Forecast (CF) Metadata Conventions. See those conventions for more information.
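
To make the source_file_* items above concrete, here is a minimal sketch of a check that could be run on a dataset's attributes (for example `ds.attrs`). It is not existing GeoIPS code. The function name, the split into required and optional keys, and the assumption that each attribute is a list with one entry per source file are all illustrative and would need to be confirmed once the standard is finalized.

```python
from datetime import datetime
from typing import Any, Mapping

# Attribute names come from the list above; treating the first four as
# required and the last as optional is an assumption of this sketch.
REQUIRED_KEYS = (
    "source_file_names",
    "source_file_attributes",
    "source_file_start_datetimes",
    "source_file_end_datetimes",
)
OPTIONAL_KEYS = ("source_file_filename_datetimes",)


def check_source_file_metadata(attrs: Mapping[str, Any]) -> list[str]:
    """Return a list of problems found in ``attrs`` (empty if none).

    Hypothetical helper: assumes every source_file_* attribute is a
    sequence holding one entry per source file.
    """
    problems = [
        f"missing required attribute: {key}"
        for key in REQUIRED_KEYS
        if key not in attrs
    ]

    # Every per-file attribute should have the same length as the file list.
    if "source_file_names" in attrs:
        n_files = len(attrs["source_file_names"])
        for key in REQUIRED_KEYS[1:] + OPTIONAL_KEYS:
            if key in attrs and len(attrs[key]) != n_files:
                problems.append(
                    f"{key} has {len(attrs[key])} entries, expected {n_files}"
                )

    # Each file's end datetime should not precede its start datetime.
    starts = attrs.get("source_file_start_datetimes", [])
    ends = attrs.get("source_file_end_datetimes", [])
    for i, (start, end) in enumerate(zip(starts, ends)):
        if end < start:
            problems.append(f"file {i}: end datetime precedes start datetime")

    return problems


# Made-up example values, for illustration only.
example_attrs = {
    "source_file_names": ["scan_a.nc", "scan_b.nc"],
    "source_file_attributes": [{}, {}],
    "source_file_start_datetimes": [
        datetime(2024, 11, 5, 0, 0),
        datetime(2024, 11, 5, 0, 10),
    ],
    "source_file_end_datetimes": [
        datetime(2024, 11, 5, 0, 10),
        datetime(2024, 11, 5, 0, 20),
    ],
}
print(check_source_file_metadata(example_attrs))  # prints []
```

Returning a list of problems rather than raising on the first one would let a single test run report every deviation in a reader at once.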

Background and Motivation

This issue largely stems from #427, which updated some readers to handle multiple scan times of data. That change added new metadata and new dimensions to the xarray datasets. These updates made it clear that we need to formalize our standards for xarray datasets, so that we as developers know what to expect in the output of every GeoIPS reader.

Code to demonstrate issue

The readers updated in PR #427 demonstrate much of what is described above, but in practice all readers should follow these standards.
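
As an illustration of the orientation and dimension-order proposals, the sketch below shows the kind of check that could be applied to any reader's output. It is not taken from PR #427 or from GeoIPS. The function names and the dimension names in `CANONICAL_ORDER` are hypothetical, and the inputs are assumed to be plain 1-D sequences (for example, the values of a latitude coordinate along the y axis, or a variable's `dims` tuple).

```python
from typing import Sequence

# Hypothetical dimension names, in the proposed time, height, latitude,
# longitude order. The real names are still to be decided.
CANONICAL_ORDER = ("time", "height", "latitude", "longitude")


def is_north_first(lat_values: Sequence[float]) -> bool:
    """True if index 0 is the northernmost value and index -1 the southernmost."""
    return lat_values[0] >= lat_values[-1]


def is_west_first(lon_values: Sequence[float]) -> bool:
    """True if index 0 is the westernmost value and index -1 the easternmost.

    Simplification: does not handle data that crosses the antimeridian.
    """
    return lon_values[0] <= lon_values[-1]


def dims_in_canonical_order(dims: Sequence[str]) -> bool:
    """True if the recognized dimensions appear in CANONICAL_ORDER.

    Dimension names not listed in CANONICAL_ORDER are ignored.
    """
    ranks = [CANONICAL_ORDER.index(d) for d in dims if d in CANONICAL_ORDER]
    return ranks == sorted(ranks)


# Made-up example values, for illustration only.
print(is_north_first([50.0, 40.0, 30.0]))  # prints True
print(is_west_first([-100.0, -90.0, -80.0]))  # prints True
print(dims_in_canonical_order(("time", "latitude", "longitude")))  # prints True
print(dims_in_canonical_order(("longitude", "latitude")))  # prints False
```

Comparing only the first and last values keeps the check cheap, but it assumes the coordinate is monotonic. A stricter version could verify monotonicity as well.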

Checklist for Completion

  • See the concerns above. Once we finalize what we want, we need to implement that functionality in all of our readers.

evrose54 added the documentation and enhancement labels Nov 6, 2024