Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Zarr V3 protocol #898

Merged
merged 121 commits into from
Mar 23, 2022
Merged

Conversation

grlee77
Copy link
Contributor

@grlee77 grlee77 commented Dec 1, 2021

This PR builds on #897, adding zarr v3 support to most of the high-level convenience functions (I haven't looked at the consolidated metadata yet).

EDIT: I closed the prior v3 PRs leading up to this one as their commits are also included here along with additional bug fixes and tests.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

Define the StoreV3 class and create v3 versions of most existing stores

Add a test_storage_v3.py with test classes inheriting from their v2
counterparts. Only a subset of methods involving differences in v3
behavior were overridden.
@codecov
Copy link

codecov bot commented Dec 1, 2021

Codecov Report

Merging #898 (7aa5dff) into master (b98c799) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##           master     #898     +/-   ##
=========================================
  Coverage   99.94%   99.95%             
=========================================
  Files          32       33      +1     
  Lines       11256    13622   +2366     
=========================================
+ Hits        11250    13616   +2366     
  Misses          6        6             
Impacted Files Coverage Δ
zarr/_storage/absstore.py 100.00% <100.00%> (ø)
zarr/_storage/store.py 100.00% <100.00%> (ø)
zarr/attrs.py 100.00% <100.00%> (ø)
zarr/convenience.py 100.00% <100.00%> (ø)
zarr/core.py 100.00% <100.00%> (ø)
zarr/creation.py 100.00% <100.00%> (ø)
zarr/hierarchy.py 100.00% <100.00%> (ø)
zarr/meta.py 100.00% <100.00%> (ø)
zarr/storage.py 100.00% <100.00%> (ø)
zarr/tests/test_attrs.py 100.00% <100.00%> (ø)
... and 11 more

@pep8speaks
Copy link

pep8speaks commented Dec 15, 2021

Hello @grlee77! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-03-11 03:47:51 UTC

@grlee77 grlee77 force-pushed the v3-store-part6 branch 2 times, most recently from de243b2 to 266ab58 Compare December 16, 2021 20:43
zarr_version should not be in the array metadata, only the base store metadata

compressor should be absent when there is no compression
No need for this class as DirectoryStoreV3 with / chunk separator can be used instead
@grlee77
Copy link
Contributor Author

grlee77 commented Mar 4, 2022

NestedDirectoryStoreV3 has been removed as discussed above: #898 (comment)

Copy link
Member

@joshmoore joshmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking nice, @grlee77! A number of mostly aesthetic points below after an initial read. Don't think any of them are MUSTs before merging this into a dev branch. (See zarr-developers/governance#15)

The only other high-level comment, also aesthetic, is whether zarr._storage.v3 (or similar) would be appropriate earlier rather than later.

@@ -209,3 +205,63 @@ def getsize(self, path=None):

def clear(self):
self.rmdir()


def rmdir_abs(store: ABSStore, path=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not a method on the store?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why I did it that way. I can make this ABSStore.rmdir and just call it from there in ABSStoreV3.rmdir

zarr/_storage/absstore.py Outdated Show resolved Hide resolved
zarr/_storage/store.py Outdated Show resolved Hide resolved
zarr/_storage/store.py Outdated Show resolved Hide resolved
zarr/_storage/store.py Outdated Show resolved Hide resolved
# assume path already normalized
prefix = _path_to_prefix(path)
for key in list(store.keys()):
if key.startswith(prefix):
del store[key]


def _rmdir_from_keys_v3(store: StoreV3, path: str = "") -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Knowing how this module is growing, is there any place relatively convenient that we could "hide" v3-specific methods?

key == 'zarr.json')

# cannot create a group without a path in v3
# so create /meta/root/consolidated group to store the metadata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of "/meta/root/consolidated" likely needs review at the zarr-specs level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened zarr-developers/zarr-specs#136 so we don't forget

store = _normalize_store_arg(store)
store = _normalize_store_arg(store, zarr_version=zarr_version)
if zarr_version is None:
zarr_version = getattr(store, '_store_version', 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth planning now for when the default swaps from 2 to 3? Either by using a constant or by documenting the locations while you have it in mind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I added DEFAULT_ZARR_VERSION (currently set to 2) in place of 2 in these checks.

It is intended as a constant used by developers, not something a user would set in their code to select the default version.

@joshmoore
Copy link
Member

Anything else planned from your side, @grlee77 ?

@grlee77
Copy link
Contributor Author

grlee77 commented Mar 15, 2022

Anything else planned from your side, @grlee77 ?

Before release we will need to decide on where we want the import path for v3 Stores to be (currently in the same file as the V2 ones). We can think about that and a possible environment variable to enable/disable v3 visibility in a followup, though.

@joshmoore
Copy link
Member

👍 for handling the refactoring and v3-switch in follow ups. As discussed during the past two community calls, getting this in to the mainline. 🎉 (And as promised for @grlee77: 🍪)

Clear warning for @zarr-developers/core-devs : the mainline is now in an unreleasable v3 state. If 2.11.x needs to go out, a dedicated v2 branch will need to be forked from a point before this merge.

See zarr-developers/governance#15 for some background.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants