Allow Stores to opt out of consolidated metadata. #3119

Open
wants to merge 12 commits into
base: main
20 changes: 20 additions & 0 deletions docs/user-guide/consolidated_metadata.rst
@@ -114,3 +114,23 @@ removed, or modified, consolidated metadata may not be desirable.
metadata.

.. _Consolidated Metadata: https://github.com/zarr-developers/zarr-specs/pull/309

Stores Without Support for Consolidated Metadata
------------------------------------------------

Some stores may want to opt out of the consolidated metadata mechanism, for
reasons such as:

* They want to maintain read-write consistency, which is challenging with
  consolidated metadata.
* They have their own consolidated metadata mechanism.
* They offer good enough performance without the need for consolidation.

A store of this type can declare that it doesn't want consolidation by
overriding the ``Store.supports_consolidated_metadata`` property to return
``False``. For stores that don't support consolidation, Zarr will:

* Raise a ``TypeError`` on ``consolidate_metadata`` calls, leaving the store in
  its unconsolidated state.
* Raise a ``ValueError`` in ``AsyncGroup.open(..., use_consolidated=True)``.
* Not use consolidated metadata in ``AsyncGroup.open(..., use_consolidated=None)``.
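
The opt-out described above can be sketched without zarr installed. The example below is only an illustration under stated assumptions: ``Store``, ``SelfConsolidatingStore`` and ``consolidate_metadata`` are simplified, hypothetical stand-ins defined in the snippet itself, not the zarr-python classes or functions. Only the property name and the error message are taken from this PR.

```python
class Store:
    """Hypothetical stand-in for the zarr Store base class (not the real one)."""

    @property
    def supports_consolidated_metadata(self) -> bool:
        # Same default as in this PR: stores support consolidation
        # unless they explicitly say otherwise.
        return True


class SelfConsolidatingStore(Store):
    """Hypothetical store that manages its own metadata and opts out."""

    @property
    def supports_consolidated_metadata(self) -> bool:
        return False


def consolidate_metadata(store: Store) -> str:
    """Simplified guard modelled on the check this PR adds (assumption)."""
    if not store.supports_consolidated_metadata:
        store_name = type(store).__name__
        raise TypeError(
            f"The Zarr Store in use ({store_name}) doesn't support consolidated metadata"
        )
    # The real function would walk the hierarchy and write the metadata;
    # here a marker string is enough to show which branch was taken.
    return "consolidated"


print(consolidate_metadata(Store()))
try:
    consolidate_metadata(SelfConsolidatingStore())
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Because the property defaults to ``True`` in this sketch, as it does in the diff, stores that do not override it keep their current behaviour.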
12 changes: 12 additions & 0 deletions src/zarr/abc/store.py
@@ -264,6 +264,18 @@
        """
        await gather(*starmap(self.set, values))

    @property
    def supports_consolidated_metadata(self) -> bool:
        """
        Does the store support consolidated metadata?

        If it doesn't, an error will be raised on requests to consolidate the metadata.
        Returning `False` can be useful for stores which implement their own
        consolidation mechanism outside of the zarr-python implementation.
        """

        return True

    @property
    @abstractmethod
    def supports_deletes(self) -> bool:
13 changes: 11 additions & 2 deletions src/zarr/api/asynchronous.py
@@ -174,7 +174,8 @@
Consolidate the metadata of all nodes in a hierarchy.

Upon completion, the metadata of the root node in the Zarr hierarchy will be
updated to include all the metadata of child nodes.
updated to include all the metadata of child nodes. For Stores that do
not support consolidated metadata, this operation raises a ``TypeError``.

Parameters
----------
@@ -194,10 +195,18 @@
-------
group: AsyncGroup
The group, with the ``consolidated_metadata`` field set to include
the metadata of each child node.
the metadata of each child node. If the Store doesn't support
consolidated metadata, this function raises a `TypeError`.
See ``Store.supports_consolidated_metadata``.
"""
    store_path = await make_store_path(store, path=path)

    if not store_path.store.supports_consolidated_metadata:
        store_name = type(store_path.store).__name__
        raise TypeError(
            f"The Zarr Store in use ({store_name}) doesn't support consolidated metadata",
        )

    group = await AsyncGroup.open(store_path, zarr_format=zarr_format, use_consolidated=False)
    group.store_path.store._check_writable()

8 changes: 6 additions & 2 deletions src/zarr/api/synchronous.py
@@ -81,7 +81,8 @@ def consolidate_metadata(
Consolidate the metadata of all nodes in a hierarchy.

Upon completion, the metadata of the root node in the Zarr hierarchy will be
updated to include all the metadata of child nodes.
updated to include all the metadata of child nodes. For Stores that do
not support consolidated metadata, this operation raises a ``TypeError``.

Parameters
----------
@@ -101,7 +102,10 @@ def consolidate_metadata(
-------
group: Group
The group, with the ``consolidated_metadata`` field set to include
the metadata of each child node.
the metadata of each child node. If the Store doesn't support
consolidated metadata, this function raises a `TypeError`.
See ``Store.supports_consolidated_metadata``.

"""
    return Group(sync(async_api.consolidate_metadata(store, path=path, zarr_format=zarr_format)))

17 changes: 14 additions & 3 deletions src/zarr/core/group.py
@@ -490,10 +490,11 @@

By default, consolidated metadata is used if it's present in the
store (in the ``zarr.json`` for Zarr format 3 and in the ``.zmetadata`` file
for Zarr format 2).
for Zarr format 2) and the Store supports it.

To explicitly require consolidated metadata, set ``use_consolidated=True``,
which will raise an exception if consolidated metadata is not found.
To explicitly require consolidated metadata, set ``use_consolidated=True``.
In this case, if the Store doesn't support consolidation or consolidated metadata is
not found, a ``ValueError`` exception is raised.

To explicitly *not* use consolidated metadata, set ``use_consolidated=False``,
which will fall back to using the regular, non consolidated metadata.
@@ -503,6 +504,16 @@
to load consolidated metadata from a non-default key.
"""
        store_path = await make_store_path(store)
        if not store_path.store.supports_consolidated_metadata:
            # Fail if consolidated metadata was requested but the Store doesn't support it
            if use_consolidated:
                store_name = type(store_path.store).__name__
                raise ValueError(
                    f"The Zarr store in use ({store_name}) doesn't support consolidated metadata."
                )

            # if use_consolidated was None (optional), the Store dictates it doesn't want consolidation
            use_consolidated = False

        consolidated_key = ZMETADATA_V2_JSON
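
The branch above turns the three-valued ``use_consolidated`` argument into a decision that depends on the store. A minimal sketch of that resolution as a standalone function follows. ``resolve_use_consolidated`` and its ``store_supports`` parameter are hypothetical names introduced here for illustration and are not part of zarr-python. The error message is shortened because no store name is available in this sketch.

```python
from typing import Optional, Union


def resolve_use_consolidated(
    store_supports: bool, use_consolidated: Optional[Union[bool, str]]
) -> Optional[Union[bool, str]]:
    """Return the effective ``use_consolidated`` value for a store.

    Hypothetical helper mirroring the branch in the diff; not a zarr-python API.
    """
    if not store_supports:
        if use_consolidated:
            # Consolidated metadata was explicitly requested but is unavailable.
            raise ValueError("The Zarr store in use doesn't support consolidated metadata.")
        # None (auto-detect) and False both resolve to "do not use it".
        return False
    # Stores that support consolidation keep whatever the caller asked for.
    return use_consolidated


# Show the outcome of each request against a store that opts out.
for requested in (None, False, True):
    try:
        outcome = resolve_use_consolidated(False, requested)
    except ValueError as exc:
        outcome = f"ValueError: {exc}"
    print(f"use_consolidated={requested!r} -> {outcome}")
```

Since the check is a plain truthiness test, a non-empty string passed as a non-default key would be treated like ``True`` in this sketch and raise for a store that opts out.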
35 changes: 35 additions & 0 deletions tests/test_metadata/test_consolidated.py
@@ -651,3 +651,38 @@ async def test_consolidated_metadata_encodes_special_chars(
    elif zarr_format == 3:
        assert root_metadata["child"]["attributes"]["test"] == expected_fill_value
        assert root_metadata["time"]["fill_value"] == expected_fill_value


class NonConsolidatedStore(zarr.storage.MemoryStore):
    """A store that doesn't support consolidated metadata"""

    @property
    def supports_consolidated_metadata(self) -> bool:
        return False


async def test_consolidate_metadata_raises_for_self_consolidating_stores():
    """Verify that calling consolidate_metadata on a non-supporting store raises an error."""

    memory_store = NonConsolidatedStore()
    root = await zarr.api.asynchronous.create_group(store=memory_store)
    await root.create_group("a/b")

    with pytest.raises(TypeError, match="doesn't support consolidated metadata"):
        await zarr.api.asynchronous.consolidate_metadata(memory_store)


async def test_open_group_in_non_consolidating_stores():
    memory_store = NonConsolidatedStore()
    root = await zarr.api.asynchronous.create_group(store=memory_store)
    await root.create_group("a/b")

    # Opening a group without consolidation works as expected
    await AsyncGroup.open(memory_store, use_consolidated=False)

    # Let the Store opt out of consolidation
    await AsyncGroup.open(memory_store, use_consolidated=None)

    # Opening a group with use_consolidated=True should fail
    with pytest.raises(ValueError, match="doesn't support consolidated metadata"):
        await AsyncGroup.open(memory_store, use_consolidated=True)