automatic chunking of zarr archive #4046

Open
@apatlpo

Description

I store data that is not chunked in a zarr archive, yet the resulting zarr archive is chunked.
This may be a simple usage question:
I don't know how to turn this behavior off.

Code sample

Here is a minimal example that reproduces the issue:

import os
import numpy as np
import xarray as xr

# Build an unchunked (NumPy-backed) dataset, write it to zarr, and read it back
ds = xr.DataArray(np.ones((200, 800))).rename('foo').to_dataset()
print('Initial chunks = {}'.format(ds.foo.chunks))
ds.to_zarr('test.zarr', mode='w')
print('zarr archives contains: {}'.format(os.listdir('test.zarr/foo')))
ds = xr.open_zarr('test.zarr')
print('Final chunks = {}'.format(ds.foo.chunks))

returns:

Initial chunks = None
zarr archives contains: ['.zarray', '.zattrs', '0.0', '0.1', '1.0', '1.1']
Final chunks = ((100, 100), (400, 400))
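
For reference, the four data files in the listing are what a (200, 800) array split into (100, 400) chunks would produce, assuming zarr's v2 naming convention of chunk indices joined by dots. The sketch below only illustrates that arithmetic; `chunk_keys` is a hypothetical helper, not part of zarr or xarray.

```python
import itertools
import math


def chunk_keys(shape, chunks):
    """Return zarr v2-style chunk keys (hypothetical helper, for illustration only)."""
    # Number of chunks along each dimension, rounding up for partial chunks
    counts = [math.ceil(size / chunk) for size, chunk in zip(shape, chunks)]
    # One key per chunk: the chunk's grid indices joined by '.'
    grid = itertools.product(*(range(n) for n in counts))
    return [".".join(str(i) for i in index) for index in grid]


print(chunk_keys((200, 800), (100, 400)))  # ['0.0', '0.1', '1.0', '1.1']
print(chunk_keys((200, 800), (200, 800)))  # ['0.0']
```

An archive that was not chunked would therefore be expected to contain a single data file, '0.0', next to '.zarray' and '.zattrs'.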

Expected Output

I would expect the archive to not to be chunked.

Versions

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.12.53-60.30-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4

xarray: 0.15.2.dev29+g6048356
pandas: 1.0.3
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.4.0
cftime: 1.1.1.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.13.0
distributed: 2.13.0
matplotlib: 3.2.1
cartopy: 0.17.0
seaborn: 0.10.0
numbagg: None
pint: None
setuptools: 46.1.3.post20200325
pip: 20.0.2
conda: None
pytest: None
IPython: 7.13.0
sphinx: None
