Skip to content

Dataset.groupby reductions give "Dataset does not contain dimensions error" in v0.13 #3337

Closed
@nbren12

Description

@nbren12

MCVE Code Sample

>>> ds = xr.DataArray(np.ones((4,5)), dims=['z', 'x']).to_dataset(name='a')
>>> ds.a.groupby('z').mean()
<xarray.DataArray 'a' (z: 4)>
array([1., 1., 1., 1.])
Dimensions without coordinates: z
>>> ds.groupby('z').mean()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/noah/miniconda3/envs/broken/lib/python3.7/site-packages/xarray/core/common.py", line 91, in wrapped_func
    **kwargs
  File "/Users/noah/miniconda3/envs/broken/lib/python3.7/site-packages/xarray/core/groupby.py", line 848, in reduce
    return self.apply(reduce_dataset)
  File "/Users/noah/miniconda3/envs/broken/lib/python3.7/site-packages/xarray/core/groupby.py", line 796, in apply
    return self._combine(applied)
  File "/Users/noah/miniconda3/envs/broken/lib/python3.7/site-packages/xarray/core/groupby.py", line 800, in _combine
    applied_example, applied = peek_at(applied)
  File "/Users/noah/miniconda3/envs/broken/lib/python3.7/site-packages/xarray/core/utils.py", line 181, in peek_at
    peek = next(gen)
  File "/Users/noah/miniconda3/envs/broken/lib/python3.7/site-packages/xarray/core/groupby.py", line 795, in <genexpr>
    applied = (func(ds, *args, **kwargs) for ds in self._iter_grouped())
  File "/Users/noah/miniconda3/envs/broken/lib/python3.7/site-packages/xarray/core/groupby.py", line 846, in reduce_dataset
    return ds.reduce(func, dim, keep_attrs, **kwargs)
  File "/Users/noah/miniconda3/envs/broken/lib/python3.7/site-packages/xarray/core/dataset.py", line 3888, in reduce
    "Dataset does not contain the dimensions: %s" % missing_dimensions
ValueError: Dataset does not contain the dimensions: ['z']
>>> ds.dims
Frozen(SortedKeysDict({'z': 4, 'x': 5}))

Problem Description

Groupby reduction operations on Dataset objects no longer seem to work in xarray v0.13. In the example, above I create an xarray dataset with one dataarray called "a". The same groupby operations fails on this Dataset, but succeeds when called directly on "a". Is this a bug or an intended change?

In addition the error message is confusing since z is one of the Dataset dimensions.

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 14:38:56)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None

xarray: 0.13.0
pandas: 0.25.1
numpy: 1.17.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
setuptools: 41.2.0
pip: 19.2.3
conda: None
pytest: None
IPython: None
sphinx: None

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions