Skip to content

.chunk() doesn't create chunks on 0 dim arrays #8251

@max-sixty

Description

@max-sixty

What happened?

.chunk's docstring states:

        """Coerce this array's data into a dask arrays with the given chunks.

        If this variable is a non-dask array, it will be converted to dask
        array. If it's a dask array, it will be rechunked to the given chunk
        sizes.

...but this doesn't happen for 0 dim arrays; example below.

For context, as part of #8245, I had a function that creates a template array. It created an empty DataArray, then expanded dims for each dimension. And it kept blowing up memory! ...until I realized that it was actually not a lazy array.

What did you expect to happen?

It may be that we can't have a 0-dim dask array — but then we should raise in this method, rather than return the wrong thing.

Minimal Complete Verifiable Example

[ins] In [1]: type(xr.DataArray().chunk().data)
Out[1]: numpy.ndarray

[ins] In [2]: type(xr.DataArray(1).chunk().data)
Out[2]: numpy.ndarray

[ins] In [3]: type(xr.DataArray([1]).chunk().data)
Out[3]: dask.array.core.Array

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: 0d6cd2a
python: 3.9.18 (main, Aug 24 2023, 21:19:58)
[Clang 14.0.3 (clang-1403.0.22.14.1)]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2023.8.1.dev25+g8215911a.d20230914
pandas: 2.1.1
numpy: 1.25.2
scipy: 1.11.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.16.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.4.0
distributed: 2023.7.1
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: 0.2.3.dev30+gd26e29e
fsspec: 2021.11.1
cupy: None
pint: None
sparse: None
flox: 0.7.2
numpy_groupies: 0.9.19
setuptools: 68.1.2
pip: 23.2.1
conda: None
pytest: 7.4.0
mypy: 1.5.1
IPython: 8.15.0
sphinx: 4.3.2

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugtopic-zarrRelated to zarr storage library

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions