Accessing nested children via consolidated metadata fails #2358

Closed
@jhamman

Zarr version

3.0.0.beta

Numcodecs version

0.13

Python Version

3.11

Operating System

Mac

Installation

pip

Description

In pydata/xarray#9552, I noticed that accessing nested children fails when using consolidated metadata.

Steps to reproduce

import zarr

store = zarr.storage.MemoryStore(mode='w')

# create hierarchy root + foo/bar
root = zarr.open_group(store=store, attributes={'a': 'b'}, mode='w')
root.create_array('foo/bar', shape=(2, 2), attributes={'d': 4})

# consolidate metadata
out = zarr.consolidate_metadata(store)

out['foo/bar']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/Library/CloudStorage/Dropbox/src/zarr-python/src/zarr/core/group.py:670, in AsyncGroup._getitem_consolidated(self, store_path, key, prefix)
    669 try:
--> 670     metadata = self.metadata.consolidated_metadata.metadata[key]
    671 except KeyError as e:
    672     # The Group Metadata has consolidated metadata, but the key
    673     # isn't present. We trust this to mean that the key isn't in
    674     # the hierarchy, and *don't* fall back to checking the store.

KeyError: 'foo/bar'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[20], line 12
      9 # consolidate metadata
     10 out = zarr.consolidate_metadata(store)
---> 12 out['foo/bar']

File ~/Library/CloudStorage/Dropbox/src/zarr-python/src/zarr/core/group.py:1330, in Group.__getitem__(self, path)
   1329 def __getitem__(self, path: str) -> Array | Group:
-> 1330     obj = self._sync(self._async_group.getitem(path))
   1331     if isinstance(obj, AsyncArray):
   1332         return Array(obj)

File ~/Library/CloudStorage/Dropbox/src/zarr-python/src/zarr/core/sync.py:185, in SyncMixin._sync(self, coroutine)
    182 def _sync(self, coroutine: Coroutine[Any, Any, T]) -> T:
    183     # TODO: refactor this to to take *args and **kwargs and pass those to the method
    184     # this should allow us to better type the sync wrapper
--> 185     return sync(
    186         coroutine,
    187         timeout=config.get("async.timeout"),
    188     )

File ~/Library/CloudStorage/Dropbox/src/zarr-python/src/zarr/core/sync.py:141, in sync(coro, loop, timeout)
    138 return_result = next(iter(finished)).result()
    140 if isinstance(return_result, BaseException):
--> 141     raise return_result
    142 else:
    143     return return_result

File ~/Library/CloudStorage/Dropbox/src/zarr-python/src/zarr/core/sync.py:100, in _runner(coro)
     95 """
     96 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     97 exception, the exception will be returned.
     98 """
     99 try:
--> 100     return await coro
    101 except Exception as ex:
    102     return ex

File ~/Library/CloudStorage/Dropbox/src/zarr-python/src/zarr/core/group.py:608, in AsyncGroup.getitem(self, key)
    606 # Consolidated metadata lets us avoid some I/O operations so try that first.
    607 if self.metadata.consolidated_metadata is not None:
--> 608     return self._getitem_consolidated(store_path, key, prefix=self.name)
    610 # Note:
    611 # in zarr-python v2, we first check if `key` references an Array, else if `key` references
    612 # a group,using standalone `contains_array` and `contains_group` functions. These functions
    613 # are reusable, but for v3 they would perform redundant I/O operations.
    614 # Not clear how much of that strategy we want to keep here.
    615 elif self.metadata.zarr_format == 3:

File ~/Library/CloudStorage/Dropbox/src/zarr-python/src/zarr/core/group.py:676, in AsyncGroup._getitem_consolidated(self, store_path, key, prefix)
    671 except KeyError as e:
    672     # The Group Metadata has consolidated metadata, but the key
    673     # isn't present. We trust this to mean that the key isn't in
    674     # the hierarchy, and *don't* fall back to checking the store.
    675     msg = f"'{key}' not found in consolidated metadata."
--> 676     raise KeyError(msg) from e
    678 # update store_path to ensure that AsyncArray/Group.name is correct
    679 if prefix != "/":

KeyError: "'foo/bar' not found in consolidated metadata."
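
A possible reading of the traceback, sketched without zarr itself: the lookup at group.py:670 indexes the root group's consolidated metadata with the full key 'foo/bar'. If that mapping is keyed by direct child name only, with each child group carrying its own nested mapping (an assumption, not confirmed above), a flat lookup would fail in exactly this way, while walking the path one segment at a time would succeed. The dict layout and both helper functions below are hypothetical stand-ins, not zarr-python code.

```python
from typing import Any

# Hypothetical stand-in for consolidated metadata: each group lists only
# its direct children, and child groups nest their own children.
consolidated: dict[str, Any] = {
    "foo": {
        "node_type": "group",
        "children": {
            "bar": {"node_type": "array", "shape": (2, 2), "attributes": {"d": 4}},
        },
    },
}


def lookup_flat(tree: dict[str, Any], key: str) -> dict[str, Any]:
    """Mimic the failing lookup: use the whole path as a single key."""
    return tree[key]


def lookup_nested(tree: dict[str, Any], key: str) -> dict[str, Any]:
    """Walk the path one segment at a time through nested child mappings."""
    node: dict[str, Any] = {"node_type": "group", "children": tree}
    for part in key.strip("/").split("/"):
        if node["node_type"] != "group":
            # An array has no children, so the remaining path cannot exist.
            raise KeyError(f"'{key}' not found in consolidated metadata.")
        node = node["children"][part]
    return node


try:
    lookup_flat(consolidated, "foo/bar")
except KeyError as exc:
    print("flat lookup failed:", exc)

print("nested lookup found:", lookup_nested(consolidated, "foo/bar")["node_type"])
```

Whether the actual fix should walk nested metadata like this or flatten the keys when consolidating is a design choice for the maintainers; the sketch only illustrates the mismatch between a flat key and a nested mapping.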

Additional output

No response

Labels: bug (Potential issues with the zarr-python library)