ZEP 2: Consider versioning #277
Comments
I was wondering if the entire Zarr spec should be versioned, instead of versioning extensions separately. Otherwise, we might need to track incompatibilities among extension versions.
I'm in favour of this, as the spec is continuing to evolve, but would it imply any backwards compatibility? If a field was renamed in 3.3, would a 3.3-compliant implementation be expected to support the 3.2 name? Also, where ZEPs represent distinct features like the sharding codec, how would an implementation indicate that it supported e.g. ZEP3 but not ZEP2?
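To make the renamed-field question concrete, here is a minimal sketch of one possible policy, in which a reader for a later minor version still accepts the older name as an alias. The field names `chunk_shape` and `chunk_size` and the alias table are hypothetical illustrations, not taken from the Zarr spec, and nothing in this thread says the spec would adopt such a policy.

```python
# Hypothetical alias table: new field name -> older names still accepted.
# These names are illustrative only and are not taken from the Zarr spec.
FIELD_ALIASES = {"chunk_shape": ["chunk_size"]}


def normalize_metadata(metadata: dict) -> dict:
    """Return a copy of `metadata` with old field names mapped to new ones."""
    normalized = dict(metadata)
    for new_name, old_names in FIELD_ALIASES.items():
        for old_name in old_names:
            if old_name in normalized:
                # Refuse ambiguous documents that carry both spellings.
                if new_name in normalized:
                    raise ValueError(
                        f"both {old_name!r} and {new_name!r} are present"
                    )
                normalized[new_name] = normalized.pop(old_name)
    return normalized
```

Under this policy the cost of every rename is a permanent entry in the alias table, which is one way to see why the backwards-compatibility question matters before committing to a versioning scheme.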
For an example of a community that has successfully used semver in a spec with multiple implementations, check out STAC: https://github.com/radiantearth/stac-spec/blob/master/process.md
Can you provide some pointers to examples of how the version is actually used, e.g. examples of code in STAC implementations that handle multiple versions?
PyStac is a good example: https://github.com/stac-utils/pystac/blob/main/pystac/version.py This is used, e.g., to migrate a catalog between versions: https://github.com/stac-utils/pystac/blob/master/tests/data-files/change_stac_version.py Edit: I think one of the most useful applications of semver for STAC came before the 1.0 release, when it let the implementation community move towards the stable version incrementally, in discrete steps, rather than all at once.
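The idea of moving in discrete steps can be sketched as a chain of per-version migrations. This is a generic illustration, not PyStac's actual code or API: the registry, the `migration` decorator, the version strings, and the field names are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: source version -> (target version, migration function).
MIGRATIONS: Dict[str, Tuple[str, Callable[[dict], dict]]] = {}


def migration(source: str, target: str):
    """Register a function that upgrades a document from `source` to `target`."""
    def register(func: Callable[[dict], dict]) -> Callable[[dict], dict]:
        MIGRATIONS[source] = (target, func)
        return func
    return register


@migration("0.8", "0.9")
def _add_extensions_default(doc: dict) -> dict:
    # Illustrative step: a field that became required gets a default value.
    return {**doc, "extensions": doc.get("extensions", [])}


@migration("0.9", "1.0")
def _rename_field(doc: dict) -> dict:
    # Illustrative step: a field was renamed between versions.
    doc = dict(doc)
    if "old_name" in doc:
        doc["new_name"] = doc.pop("old_name")
    return doc


def migrate(doc: dict, target: str, version_key: str = "version") -> dict:
    """Apply registered steps one at a time until `doc` reaches `target`."""
    current = doc[version_key]
    while current != target:
        if current not in MIGRATIONS:
            raise ValueError(f"no migration path from {current!r} to {target!r}")
        current, step = MIGRATIONS[current]
        doc = {**step(doc), version_key: current}
    return doc
```

Each step only needs to know about one adjacent pair of versions, so an implementation can add support for a new release without revisiting older migrations.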
I previously advocated against "version numbers"; I'll reiterate the arguments I made in previous discussions:

As the spec/format evolves, various new functionality will be added, and implementations will evolve to support subsets of that functionality. However, a version number assumes a linear progression.

For example, let's say we want to know what functionality is supported by a given zarr v3 implementation. We can't necessarily convey that by just saying it supports zarr version 3.5, because it may support some features but not others. Instead, I prefer the HTML model, where you have a table that indicates for each feature the minimum version of each implementation that supports it (along with any relevant notes if there are caveats).

If we store the version number in the array metadata, then presumably it gets set when creating the array. But which version number do we choose? Let's say a given zarr implementation ImplA supports all of the required functionality of zarr version 3.7. It might then always specify "version": "3.7" when creating an array. But then a zarr implementation ImplB that hasn't been updated since version 3.6 of the spec was published might decide that it cannot read this array, even if in fact ImplB supports all of the features used by the array.

Instead we could add logic to ImplA to choose the minimum version number of the spec that includes all of the features used by the array. But that adds implementation complexity and, given the possibility of optional features or that an implementation may add support for some but not all of the functionality added in a given zarr version, still does not really help other implementations determine whether they support a given array.

An additional problem with version numbers is that as we develop the spec, e.g. adding an [...]

[...] JSON plus the requirement that all attributes must be known unless they are marked with {"must_understand":false} was specifically intended to facilitate format evolution in a backwards-compatible way without the need for a version number. (We may want to revise the [...]
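The ImplA/ImplB scenario can be sketched by comparing a version-based check with a feature-based one. This is a minimal illustration under stated assumptions: the supported version, the set of known field names, and the placement of `must_understand` inside a field's value are all hypothetical and may not match the actual spec.

```python
# Hypothetical: ImplB was last updated when spec version 3.6 was published.
SUPPORTED_SPEC_VERSION = (3, 6)

# Hypothetical set of metadata fields that ImplB knows how to interpret.
KNOWN_FIELDS = {"version", "shape", "data_type", "codecs"}


def can_read_by_version(metadata: dict) -> bool:
    """Version gate: refuse anything stamped with a newer spec version."""
    major, minor = (int(part) for part in metadata["version"].split("."))
    return (major, minor) <= SUPPORTED_SPEC_VERSION


def can_read_by_features(metadata: dict) -> bool:
    """Feature gate: refuse only if a field is unknown and not marked optional."""
    for name, value in metadata.items():
        if name in KNOWN_FIELDS:
            continue
        # An unknown field is tolerated only when it opts out explicitly.
        optional = isinstance(value, dict) and value.get("must_understand") is False
        if not optional:
            return False
    return True


# An array written by ImplA that uses no features newer than 3.6.
array_metadata = {"version": "3.7", "shape": [10], "data_type": "int32", "codecs": []}
```

With these definitions, `can_read_by_version(array_metadata)` rejects the array purely because of the stamped number, while `can_read_by_features(array_metadata)` accepts it, which is the mismatch described above.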
It would be useful to include versioning in the shard implementation to allow improvements over time. This might make sense in the metadata.