Cannot open zarr store stored in Azure blob file system #10209

Closed
@lsim-aegeri

Description

What happened?

I'm trying to save data to a Zarr store hosted on my Azure blob filesystem. Writing succeeds, but when I open the store afterwards I get an empty dataset with no variables. I believe this is related to the storage backend, because the same code works when I write the data to my local filesystem instead. I also don't think this is a Zarr version issue: the same thing happens with Zarr v2 and v3, and also when I use consolidated metadata with Zarr v2.

What did you expect to happen?

I should be able to read data from the zarr stores I write to my Azure blob filesystem.

Minimal Complete Verifiable Example

import xarray as xr
import numpy as np
import pandas as pd
import adlfs

ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.rand(4, 5))},
    coords={
        "x": [10, 20, 30, 40],
        "y": pd.date_range("2000-01-01", periods=5),
        "z": ("x", list("abcd")),
    },
)

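# Target store on Azure Blob Storage (account name and SAS token are placeholders)
store = "abfs://weatherblob/xr-test/test.zarr-v3"
fs = adlfs.AzureBlobFileSystem(
    account_name="abcdefg",
    sas_token="ABCDEFG",
)

# Write the dataset as an unconsolidated Zarr v3 store
ds.to_zarr(
    store,
    storage_options=fs.storage_options,
    mode="w",
    consolidated=False,
    zarr_format=3,
)

# Read it back with the same options; this returns an empty dataset
ds_zarr = xr.open_zarr(
    store,
    storage_options=fs.storage_options,
    consolidated=False,
    zarr_format=3,
)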

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

ds.info returns:
<bound method Dataset.info of <xarray.Dataset> Size: 248B
Dimensions:  (x: 4, y: 5)
Coordinates:
  * x        (x) int64 32B 10 20 30 40
  * y        (y) datetime64[ns] 40B 2000-01-01 2000-01-02 ... 2000-01-05
    z        (x) <U1 16B 'a' 'b' 'c' 'd'
Data variables:
    foo      (x, y) float64 160B 0.768 0.1867 0.1145 ... 0.3455 0.7483 0.8156>

ds_zarr.info returns:
<bound method Dataset.info of <xarray.Dataset> Size: 0B
Dimensions:  ()
Data variables:
    *empty*>
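
Since the read above returned an empty dataset instead of raising, a small guard can make this failure loud in a script. This is only a minimal sketch: `require_nonempty` is a hypothetical helper, not part of xarray, and it relies solely on the `data_vars`, `coords`, and `sizes` attributes that `xarray.Dataset` exposes.

```python
def require_nonempty(ds, source="<unknown store>"):
    """Return `ds` unchanged, or raise if it contains nothing at all.

    Hypothetical helper (not an xarray API). It only inspects
    `ds.data_vars`, `ds.coords`, and `ds.sizes`.
    """
    if len(ds.data_vars) == 0 and len(ds.coords) == 0 and len(ds.sizes) == 0:
        raise ValueError(
            f"Dataset opened from {source!r} is empty: "
            "no data variables, coordinates, or dimensions were found."
        )
    return ds
```

With the example from this report, it could wrap the read as `ds_zarr = require_nonempty(xr.open_zarr(store, ...), source=store)`, which would raise a `ValueError` rather than silently continuing with an empty dataset.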

Also, the blob storage container has data at the Zarr store's path:
* All the .json and chunk files that I'd expect are there.
* Active blobs: 9 blobs, 3.04 KiB (3,110 bytes).
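
The blob count can be sanity-checked against the layout a Zarr v3 store is expected to have. The sketch below uses a hypothetical helper, `summarize_v3_keys` (not from any library), that takes blob names relative to the store root and groups them by node. It assumes the Zarr v3 convention that every group or array has a `zarr.json` metadata document, and that chunks use the default key encoding with a `c/` prefix. The key list in the example is an assumption about what this dataset would produce, not a listing copied from the container.

```python
from collections import defaultdict


def summarize_v3_keys(keys):
    """Group Zarr v3 store keys by node.

    `keys` are blob names relative to the store root, e.g. "foo/zarr.json".
    Returns {node_path: {"has_metadata": bool, "n_chunks": int}}, with the
    root group stored under the empty string.

    Assumes the default v3 chunk key encoding ("<node>/c/<i>/<j>...").
    A node that is itself named "c" would be misattributed; this is a
    quick check, not a full parser.
    """
    nodes = defaultdict(lambda: {"has_metadata": False, "n_chunks": 0})
    for key in keys:
        parts = key.strip("/").split("/")
        if parts[-1] == "zarr.json":
            nodes["/".join(parts[:-1])]["has_metadata"] = True
        elif "c" in parts:
            # Everything before the first "c" segment is the array path.
            nodes["/".join(parts[: parts.index("c")])]["n_chunks"] += 1
    return dict(nodes)


# Assumed layout for the dataset in this report: one root group plus four
# arrays (foo, x, y, z), each written as a single chunk.
assumed_keys = [
    "zarr.json",
    "foo/zarr.json", "foo/c/0/0",
    "x/zarr.json", "x/c/0",
    "y/zarr.json", "y/c/0",
    "z/zarr.json", "z/c/0",
]
for node, info in sorted(summarize_v3_keys(assumed_keys).items()):
    print(repr(node), info)
```

Under these assumptions the layout has five metadata documents and four chunks, which is consistent with the nine blobs reported. With an fsspec filesystem, the real keys could typically be obtained from a recursive listing such as `fs.find(...)` with the store prefix stripped from each path.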

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.9 (main, Apr 6 2024, 17:59:24) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-1071-azure
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2025.3.1
pandas: 2.2.3
numpy: 2.1.3
scipy: 1.15.2
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: 3.0.6
cftime: None
nc_time_axis: None
iris: None
bottleneck: 1.4.2
dask: 2025.3.0
distributed: 2025.3.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2025.3.2
cupy: None
pint: None
sparse: 0.16.0
flox: 0.10.2
numpy_groupies: 0.11.2
setuptools: None
pip: None
conda: None
pytest: None
mypy: None
IPython: 9.1.0
sphinx: None
adlfs: 2024.12.0

Metadata

Labels: bug, topic-zarr