Zarr as a "universal reader" for netCDF etc., via new CF decoding codecs #303
Comments
Thanks for the writeup Tom, a big +1 from me on this effort.
From glancing at the signature and a few implementations, it looks like the |
Does an |
Idea: Use zarr readers to open and decode netCDF/HDF/etc. data without xarray by lifting xarray's decoding machinery out as new zarr codecs.
This was suggested by @sharkinsspatial in zarr-developers/VirtualiZarr#68 (comment) and requires two components:
To be really useful this probably also requires variable-length chunking in zarr (i.e. ZEP003).
The advantages of this are:
a) a clearer separation of concerns, with fewer "magic" steps hidden inside xarray,
b) applications that can read zarr but don't want to use xarray could also read and fully decode netCDF data (i.e. pure-zarr users see the same data as xarray users),
c) clearer steps towards generalizing to non-CF encoding conventions used in other domains of science,
d) opening the door to zarr becoming a "universal reader" of any file format whose data can be expressed as a manifest of byte ranges and decoding steps can be expressed as zarr codecs.
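To make point (d) concrete, here is a minimal sketch of what "a manifest of byte ranges plus decoding steps" could look like, using only the Python standard library. The names `ChunkRef`, `read_chunk`, and `unpack_int16_le` are hypothetical and are not part of zarr or VirtualiZarr; the "legacy file" is a toy built on the fly.

```python
import os
import struct
import tempfile
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRef:
    """Hypothetical manifest entry: where one chunk's bytes live."""
    path: str
    offset: int
    length: int


def read_chunk(ref, decoders):
    """Fetch the byte range, then apply each decoding step in order."""
    with open(ref.path, "rb") as f:
        f.seek(ref.offset)
        data = f.read(ref.length)
    for decode in decoders:
        data = decode(data)
    return data


def unpack_int16_le(buf):
    """Interpret raw bytes as little-endian int16 values."""
    return list(struct.unpack(f"<{len(buf) // 2}h", buf))


# Build a toy "legacy file": a header followed by one compressed chunk.
values = [10, 20, 30, 40]
payload = zlib.compress(struct.pack("<4h", *values))
header = b"FAKEHDR!"
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(header + payload)
    toy_path = tmp.name

# The manifest maps chunk keys to byte ranges inside the original file.
manifest = {"0": ChunkRef(toy_path, offset=len(header), length=len(payload))}

# The decoding steps play the role of a codec pipeline.
decoded = read_chunk(manifest["0"], [zlib.decompress, unpack_int16_le])
os.unlink(toy_path)
```

The reader never parses the file format itself: it only needs the byte ranges and an ordered list of decoding steps, which is the property that would let one reader serve many formats.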
Most of the work here would be on the xarray end - there is an ancient issue suggesting something similar in pydata/xarray#155, and a nice explanation of how xarray currently does this step in pydata/xarray#8548. Currently it looks essentially like this
where one of xarray's options for datastore is for zarr, and another is for netCDF (these are xarray's "backends"). I'm proposing something more like the following, where non-xarray users can still get all of
zarr.Array < CF decoding (using new zarr codecs) < open via "universal" zarr reader < chunk manifest < file
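As a rough illustration of the "CF decoding" step in that chain, the sketch below applies the CF packing attributes `scale_factor`, `add_offset`, and `_FillValue` to a chunk of packed integers. It is pure Python for clarity; `cf_decode` and `cf_encode` are hypothetical helper names, not existing zarr codecs or xarray functions.

```python
import math


def cf_decode(packed, attrs):
    """Unpack values: mask the fill value, then apply scale and offset."""
    fill = attrs.get("_FillValue")
    scale = attrs.get("scale_factor", 1.0)
    offset = attrs.get("add_offset", 0.0)
    return [math.nan if v == fill else v * scale + offset for v in packed]


def cf_encode(unpacked, attrs):
    """Inverse step: NaN becomes the fill value, the rest is re-packed."""
    fill = attrs.get("_FillValue")
    scale = attrs.get("scale_factor", 1.0)
    offset = attrs.get("add_offset", 0.0)
    return [fill if math.isnan(v) else round((v - offset) / scale)
            for v in unpacked]


attrs = {"scale_factor": 0.5, "add_offset": 10.0, "_FillValue": -9999}
packed = [0, 100, -9999]

decoded = cf_decode(packed, attrs)
round_tripped = cf_encode(decoded, attrs)
```

Because the transformation depends only on the chunk values and a small, fixed set of attributes, it has the shape of an array-to-array codec with an encode/decode pair.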
One question is how well xarray's internal concept of a VariableCoder maps onto a zarr codec.
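One way to frame the question: a coder typically transforms both the data and the metadata of a variable, while a codec is configured up front and then only sees chunk data. The toy classes below sketch that difference under simplifying assumptions; `ToyVariable`, `ToyScaleCoder`, `ToyScaleCodec`, and `codec_from_attrs` are all hypothetical and do not reproduce xarray's or zarr's actual interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVariable:
    """Stand-in for a variable: data plus metadata dictionaries."""
    data: list
    attrs: dict = field(default_factory=dict)
    encoding: dict = field(default_factory=dict)


class ToyScaleCoder:
    """Coder style: decode() rewrites data AND moves attrs into encoding."""

    def decode(self, var):
        attrs = dict(var.attrs)
        encoding = dict(var.encoding)
        scale = attrs.pop("scale_factor", 1.0)
        encoding["scale_factor"] = scale
        return ToyVariable([v * scale for v in var.data], attrs, encoding)


class ToyScaleCodec:
    """Codec style: configuration is fixed up front; decode() sees only data."""

    def __init__(self, scale_factor):
        self.scale_factor = scale_factor

    def decode(self, chunk):
        return [v * self.scale_factor for v in chunk]


def codec_from_attrs(attrs):
    """Split a coder's job: metadata handling once, data handling per chunk."""
    remaining = dict(attrs)
    codec = ToyScaleCodec(remaining.pop("scale_factor", 1.0))
    return codec, remaining


raw = ToyVariable([1, 2, 3], attrs={"scale_factor": 0.5, "units": "K"})

via_coder = ToyScaleCoder().decode(raw)
codec, remaining_attrs = codec_from_attrs(raw.attrs)
via_codec = codec.decode(raw.data)
```

In this toy case the two routes give the same data, and the metadata bookkeeping moves to the moment the codec is constructed. Whether every real coder can be split this cleanly is exactly the open question.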