Description
My assumption is that it should be possible to:
- Write to a zarr store with some chunk size along a dimension
- Load from that zarr store and rechunk to a multiple of that chunk size
- Write that result to another zarr store
However, I see this behavior instead:
```python
import xarray as xr
import dask.array as da

ds = xr.Dataset(dict(
    x=xr.DataArray(da.random.random(size=100, chunks=10), dims='d1')
))
# Write the store
ds.to_zarr('/tmp/ds1.zarr', mode='w')
# Read it out, rechunk it, and attempt to write it again
xr.open_zarr('/tmp/ds1.zarr').chunk(chunks=dict(d1=20)).to_zarr('/tmp/ds2.zarr', mode='w')
```
```
ValueError: Final chunk of Zarr array must be the same size or smaller than the first.
Specified Zarr chunk encoding['chunks']=(10,), for variable named 'x' but (20, 20, 20, 20, 20)
in the variable's Dask chunks ((20, 20, 20, 20, 20),) is incompatible with this encoding.
Consider either rechunking using `chunk()` or instead deleting or modifying `encoding['chunks']`.
```
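For what it's worth, the conflict appears to be between the chunk encoding carried over from the first store and the new dask chunks; here is a quick check of that state (my own sketch, assuming I'm reading the error correctly):

```python
# Inspect the state that (I assume) to_zarr sees when it raises:
ds1 = xr.open_zarr('/tmp/ds1.zarr').chunk(chunks=dict(d1=20))
print(ds1.x.encoding.get('chunks'))  # (10,)  <- carried over from ds1.zarr
print(ds1.x.chunks)                  # ((20, 20, 20, 20, 20),)  <- new dask chunks
```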
Full trace
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
----> 1 xr.open_zarr('/tmp/ds1.zarr').chunk(chunks=dict(d1=20)).to_zarr('/tmp/ds2.zarr', mode='w')

/opt/conda/lib/python3.7/site-packages/xarray/core/dataset.py in to_zarr(self, store, mode, synchronizer, group, encoding, compute, consolidated, append_dim)
   1656         compute=compute,
   1657         consolidated=consolidated,
-> 1658         append_dim=append_dim,
   1659     )
   1660

/opt/conda/lib/python3.7/site-packages/xarray/backends/api.py in to_zarr(dataset, store, mode, synchronizer, group, encoding, compute, consolidated, append_dim)
   1351     writer = ArrayWriter()
   1352     # TODO: figure out how to properly handle unlimited_dims
-> 1353     dump_to_store(dataset, zstore, writer, encoding=encoding)
   1354     writes = writer.sync(compute=compute)
   1355

/opt/conda/lib/python3.7/site-packages/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1126     variables, attrs = encoder(variables, attrs)
   1127
-> 1128     store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
   1129
   1130

/opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    411     self.set_dimensions(variables_encoded, unlimited_dims=unlimited_dims)
    412     self.set_variables(
--> 413         variables_encoded, check_encoding_set, writer, unlimited_dims=unlimited_dims
    414     )
    415

/opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in set_variables(self, variables, check_encoding_set, writer, unlimited_dims)
    466         # new variable
    467         encoding = extract_zarr_variable_encoding(
--> 468             v, raise_on_invalid=check, name=vn
    469         )
    470         encoded_attrs = {}

/opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in extract_zarr_variable_encoding(variable, raise_on_invalid, name)
    214
    215     chunks = _determine_zarr_chunks(
--> 216         encoding.get("chunks"), variable.chunks, variable.ndim, name
    217     )
    218     encoding["chunks"] = chunks

/opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in _determine_zarr_chunks(enc_chunks, var_chunks, ndim, name)
    154     if dchunks[-1] > zchunk:
    155         raise ValueError(
--> 156             "Final chunk of Zarr array must be the same size or "
    157             "smaller than the first. "
    158             f"Specified Zarr chunk encoding['chunks']={enc_chunks_tuple}, "

ValueError: Final chunk of Zarr array must be the same size or smaller than the first. Specified Zarr chunk encoding['chunks']=(10,), for variable named 'x' but (20, 20, 20, 20, 20) in the variable's Dask chunks ((20, 20, 20, 20, 20),) is incompatible with this encoding. Consider either rechunking using `chunk()` or instead deleting or modifying `encoding['chunks']`.
```
Overwriting chunks on `open_zarr` with `overwrite_encoded_chunks=True` works, but I don't want that because it requires providing a uniform chunk size for all variables (see the sketch after the workaround below). This workaround seems to be fine though:
```python
ds = xr.open_zarr('/tmp/ds1.zarr')
del ds.x.encoding['chunks']
ds.chunk(chunks=dict(d1=20)).to_zarr('/tmp/ds2.zarr', mode='w')
```
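For comparison, this is roughly what I understand the `overwrite_encoded_chunks` route to look like; using the `chunks` argument of `open_zarr` to supply the new size is my assumption about the intended usage:

```python
# Sketch of the overwrite_encoded_chunks alternative (not what I want):
# the chunk size given here applies to every variable along 'd1', and the
# stored encoding['chunks'] is replaced by the loading chunks.
ds = xr.open_zarr('/tmp/ds1.zarr', chunks={'d1': 20}, overwrite_encoded_chunks=True)
ds.to_zarr('/tmp/ds2.zarr', mode='w')
```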
Does `encoding['chunks']` serve any purpose after you've loaded a zarr store and all the variables are defined as dask arrays? In other words, is there any harm in deleting it from all dask variables if I want those variables to write back out to zarr using the dask chunk definitions instead?
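For completeness, the generalized version of the workaround I have in mind would be something like the sketch below, which simply strips the stored chunk encoding from every dask-backed variable before writing:

```python
import xarray as xr

ds = xr.open_zarr('/tmp/ds1.zarr')
# Drop the zarr chunk encoding carried over from the first store for every
# dask-backed variable, so to_zarr falls back to the dask chunk layout.
for var in ds.variables.values():
    if var.chunks is not None:
        var.encoding.pop('chunks', None)
ds.chunk(chunks=dict(d1=20)).to_zarr('/tmp/ds2.zarr', mode='w')
```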
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-42-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: None

xarray: 0.16.0
pandas: 1.0.5
numpy: 1.19.0
scipy: 1.5.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.21.0
distributed: 2.21.0
matplotlib: 3.3.0
cartopy: None
seaborn: 0.10.1
numbagg: None
pint: None
setuptools: 47.3.1.post20200616
pip: 20.1.1
conda: 4.8.2
pytest: 5.4.3
IPython: 7.15.0
sphinx: 3.2.1