Skip to content

[Bug]: Dataset.where(x, drop=True) behaves inconsistent #6227

Closed
@headtr1ck

Description

@headtr1ck

What happened?

I tried to reduce some dimensions using where (sel did not work in this case) and shorten the dimensions with "drop=True".
This works fine on DataArrays and Datasets with only a single dimension but fails as soon as you have a Dataset with two dimensions on different variables.
The dimensions are left untouched and you have NaNs in the data, just as if you were using "drop=False" (see example).

I am actually not sure what the expected behavior is, maybe I am wrong and it is correct due to some broadcasting rules?

What did you expect to happen?

I expected that relevant dims are shortened.
If the ds.where with "drop=False" all variables along a dimenions have some NaNs, then using "drop=True" I expect these dimensions to be shortened and the NaNs removed.

Minimal Complete Verifiable Example

import xarray as xr

# this works
ds = xr.Dataset({"a": ("x", [1, 2 ,3])})
ds.where(ds > 2, drop=True)

# returns:
# <xarray.Dataset>
# Dimensions:  (x: 1)
# Dimensions without coordinates: x
# Data variables:
#     a        (x) float64 3.0

# this doesn't
ds = xr.Dataset({"a": ("x", [1, 2 ,3]), "b": ("y", [2, 3, 4])})
ds.where(ds > 2, drop=True)

# returns:
# <xarray.Dataset>
# Dimensions:  (x: 3, y: 3)
# Dimensions without coordinates: x, y
# Data variables:
#     a        (x) float64 nan nan 3.0
#     b        (y) float64 nan 3.0 4.0

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.9.1 (default, Jan 13 2021, 15:21:08)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.49.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 0.20.2
pandas: 1.3.5
numpy: 1.21.5
scipy: 1.7.3
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
setuptools: 49.2.1
pip: 22.0.2
conda: None
pytest: 6.2.5
IPython: 8.0.0
sphinx: None

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions