Description
What happened?
Opening a DataTree with the Zarr backend from a Zarr store in a private S3 bucket fails with the following error:
GroupNotFoundError: group not found at path ''
This issue was already reported in xarray-contrib/datatree; see xarray-contrib/datatree#322.
The fix is probably much the same, but at the time I did not take the time to propose a PR.
What did you expect to happen?
The open_datatree function from zarr.py has a storage_options argument. Yet this argument is not passed to ZarrStore.open_store.
Minimal Complete Verifiable Example
import xarray.backends.api as xr_api

# Replace the bracketed placeholders with real credentials for the private bucket.
storage_options = {
    "s3": {
        "key": "[access-key]",
        "secret": "[secret-key]",
        "endpoint_url": "[endpoint-url]",
    }
}
dt = xr_api.open_datatree(
    "s3://path/to/product", engine="zarr", storage_options=storage_options
)
dt
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
No response
Anything else we need to know?
A possible fix could be, in xarray.backends.zarr.open_datatree:
filename_or_obj = _normalize_path(filename_or_obj)
if group:
    parent = NodePath("/") / NodePath(group)
    stores = ZarrStore.open_store(filename_or_obj, group=parent, storage_options=storage_options)
    if not stores:
        ds = open_dataset(
            filename_or_obj, group=parent, engine="zarr", **kwargs
        )
        return DataTree.from_dict({str(parent): ds})
else:
    parent = NodePath("/")
    stores = ZarrStore.open_store(filename_or_obj, group=parent, storage_options=storage_options)
if storage_options:
    kwargs["backend_kwargs"] = {"storage_options": storage_options}
ds = open_dataset(filename_or_obj, group=parent, engine="zarr", **kwargs)
In summary:
- pass storage_options to ZarrStore.open_store
- set backend_kwargs (carrying storage_options) in the open_dataset call
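The forwarding pattern described above can be illustrated without xarray or an S3 bucket. The following is only a minimal sketch: open_store_stub, open_dataset_stub and open_datatree_sketch are hypothetical stand-ins, not xarray functions, and they simply record the arguments they receive. As an assumption that goes slightly beyond the proposed fix, the sketch merges storage_options into any backend_kwargs the caller already supplied rather than overwriting them.

```python
from typing import Any, Optional


def open_store_stub(
    path: str, group: str = "/", storage_options: Optional[dict] = None
) -> dict:
    """Hypothetical stand-in for ZarrStore.open_store; records its arguments."""
    return {"path": path, "group": group, "storage_options": storage_options}


def open_dataset_stub(
    path: str,
    group: str = "/",
    engine: Optional[str] = None,
    backend_kwargs: Optional[dict] = None,
    **kwargs: Any,
) -> dict:
    """Hypothetical stand-in for open_dataset; records its arguments."""
    return {"path": path, "group": group, "backend_kwargs": backend_kwargs}


def open_datatree_sketch(
    path: str,
    group: Optional[str] = None,
    storage_options: Optional[dict] = None,
    **kwargs: Any,
) -> tuple:
    """Forward storage_options to both the store and the dataset opener."""
    parent = "/" + group.strip("/") if group else "/"
    # Step 1: hand storage_options to the store opener.
    store = open_store_stub(path, group=parent, storage_options=storage_options)
    # Step 2: hand storage_options to the dataset opener via backend_kwargs,
    # keeping any backend_kwargs the caller already provided (an assumption
    # of this sketch, not part of the proposed fix).
    if storage_options:
        backend_kwargs = dict(kwargs.get("backend_kwargs") or {})
        backend_kwargs.setdefault("storage_options", storage_options)
        kwargs["backend_kwargs"] = backend_kwargs
    ds = open_dataset_stub(path, group=parent, engine="zarr", **kwargs)
    return store, ds


opts = {"key": "[access-key]", "secret": "[secret-key]"}
store, ds = open_datatree_sketch("s3://path/to/product", storage_options=opts)
plain_store, plain_ds = open_datatree_sketch("s3://path/to/product")
```

With storage_options given, both stubs see the credentials; without it, neither call is altered, which is the behaviour the fix aims for.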
Environment
INSTALLED VERSIONS
commit: None
python: 3.11.9 (main, Apr 19 2024, 16:48:06) [GCC 11.2.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-113-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: ('fr_FR', 'UTF-8')
libhdf5: 1.14.2
libnetcdf: 4.9.3-development
xarray: 2024.6.0
pandas: 2.2.2
numpy: 2.0.0
scipy: 1.13.1
netCDF4: 1.7.1
pydap: None
h5netcdf: 1.3.0
h5py: 3.11.0
zarr: 2.18.2
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.6.2
distributed: None
matplotlib: 3.9.0
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.6.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 69.5.1
pip: 24.0
conda: None
pytest: None
mypy: None
IPython: 8.26.0
sphinx: None