Description
Zarr version
v3
Numcodecs version
v0.15.1
Python Version
3.12
Operating System
Linux
Installation
Mamba
Description
In zarr v2, passing a path (str) ending with *.zip to zarr.open or zarr.open_consolidated would work transparently, the same way as if a *.zarr path was given.
In zarr v3, this fails with FileNotFoundError: Unable to find group. The workaround is to open a ZipStore explicitly and then pass that to zarr.open.
The zarr.open docstring says:
zarr-python/src/zarr/api/synchronous.py, lines 166 to 169 in 99621ec:
Parameters
----------
store : Store or str, optional
    Store or path to directory in file system or name of zip file.
which seems to imply that passing a string path ending in zip should work.
Moreover, I found it more convenient when one didn't need to test for file extensions and explicitly handle storage objects. I use zarr datasets through xarray, and it seems to me that xr.open_dataset('example.zarr.zip', engine='zarr') should be how a normal user opens such a file?
I understand that the ZipStore is still "experimental" in v3, and I really hope you keep it in the official scheme because it is very useful, to us at least. Zipped zarrs have many of the benefits of zarr (over netCDF, for example), but without the inode explosion that pure zarr folders create on unix filesystems (which slows down disk operations).
I think the store guessing happens in zarr.storage._common.make_store_path?
The zip case could be handled here:
zarr-python/src/zarr/storage/_common.py, lines 298 to 309 in 99621ec:
elif isinstance(store_like, Path):
    store = await LocalStore.open(root=store_like, read_only=_read_only)
elif isinstance(store_like, str):
    storage_options = storage_options or {}
    if _is_fsspec_uri(store_like):
        used_storage_options = True
        store = FsspecStore.from_url(
            store_like, storage_options=storage_options, read_only=_read_only
        )
    else:
        store = await LocalStore.open(root=Path(store_like), read_only=_read_only)
Would this convenience be welcomed back in zarr-python? I could do a PR if the team here agrees with adding this case handling. To avoid pure string checking, one could even use zipfile.is_zipfile from the standard library to check for zip stores.
Otherwise, I guess this could be done by xarray itself? Many of my scripts also go through intake-esm, so I guess we could fix it there too if the proposal gets refused here.
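To make the zipfile.is_zipfile idea concrete, here is a rough user-side sketch of what I have in mind. The helper names looks_like_zip_store and open_zip_aware are hypothetical (they are not part of zarr-python), the sketch only covers read-only access, and it is not meant as the actual patch to make_store_path.

```python
import zipfile
from pathlib import Path


def looks_like_zip_store(store_like):
    """Return True if store_like is a local path to an existing zip archive.

    Hypothetical helper, not part of zarr-python.
    """
    if not isinstance(store_like, (str, Path)):
        return False
    path = Path(store_like)
    # is_zipfile inspects the file contents rather than the extension,
    # so "data.zarr.zip" and "data.bin" are treated the same way.
    return path.is_file() and zipfile.is_zipfile(path)


def open_zip_aware(store_like):
    """Open store_like read-only, routing zip archives through ZipStore.

    Hypothetical wrapper around zarr.open; assumes zarr v3 is installed.
    """
    import zarr  # imported lazily so the detection helper works without zarr

    if looks_like_zip_store(store_like):
        # Same workaround as described above: build the ZipStore explicitly.
        store = zarr.storage.ZipStore(store_like, mode="r")
        return zarr.open(store, mode="r")
    # Anything else (directories, URIs, Store objects) goes to zarr.open as-is.
    return zarr.open(store_like, mode="r")
```

With something like this, open_zip_aware('example-3.zip') would take the ZipStore branch, while a plain directory path or an fsspec URI would be passed to zarr.open unchanged.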
Steps to reproduce
Example adapted from the doc.
import numpy as np
import zarr

# Write a small array into a zip-backed store.
store = zarr.storage.ZipStore("example-3.zip", mode='w')
z = zarr.create_array(
    store=store,
    shape=(100, 100),
    chunks=(10, 10),
    dtype="f4"
)
z[:, :] = np.random.random((100, 100))
store.close()

# In zarr v3 this raises FileNotFoundError: Unable to find group.
zarr.open('example-3.zip', mode='r')
Additional output
No response