Commit b3bafee

Spencer Hill authored
Fix for stack+groupby+apply w/ non-increasing coord (#3906)
* WIP fix for stack+groupby+apply w/ non-increasing coord
* Only sort if multiindex
* store safe_cast_to_index call to avoid double call
* black formatting
1 parent eb8e43b commit b3bafee

3 files changed: +29 -5 lines changed

doc/whats-new.rst

Lines changed: 7 additions & 4 deletions
@@ -45,6 +45,9 @@ New Features
 
 Bug fixes
 ~~~~~~~~~
+- Fix renaming of coords when one or more stacked coords is not in
+  sorted order during stack+groupby+apply operations. (:issue:`3287`,
+  :pull:`3906`) By `Spencer Hill <https://github.com/spencerahill>`_
 - Fix a regression where deleting a coordinate from a copied :py:class:`DataArray`
   can affect the original :py:class:`Dataarray`. (:issue:`3899`, :pull:`3871`)
   By `Todd Jennings <https://github.com/toddrjen>`_
@@ -96,13 +99,13 @@ New Features
 - Added support for :py:class:`pandas.DatetimeIndex`-style rounding of
   ``cftime.datetime`` objects directly via a :py:class:`CFTimeIndex` or via the
   :py:class:`~core.accessor_dt.DatetimeAccessor`.
-  By `Spencer Clark <https://github.com/spencerkclark>`_
+  By `Spencer Clark <https://github.com/spencerkclark>`_
 - Support new h5netcdf backend keyword `phony_dims` (available from h5netcdf
   v0.8.0 for :py:class:`~xarray.backends.H5NetCDFStore`.
   By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
 - Add partial support for unit aware arrays with pint. (:pull:`3706`, :pull:`3611`)
   By `Justus Magin <https://github.com/keewis>`_.
-- :py:meth:`Dataset.groupby` and :py:meth:`DataArray.groupby` now raise a
+- :py:meth:`Dataset.groupby` and :py:meth:`DataArray.groupby` now raise a
   `TypeError` on multiple string arguments. Receiving multiple string arguments
   often means a user is attempting to pass multiple dimensions as separate
   arguments and should instead pass a single list of dimensions.
@@ -120,7 +123,7 @@ New Features
   By `Maximilian Roos <https://github.com/max-sixty>`_.
 - ``skipna`` is available in :py:meth:`Dataset.quantile`, :py:meth:`DataArray.quantile`,
   :py:meth:`core.groupby.DatasetGroupBy.quantile`, :py:meth:`core.groupby.DataArrayGroupBy.quantile`
-  (:issue:`3843`, :pull:`3844`)
+  (:issue:`3843`, :pull:`3844`)
   By `Aaron Spring <https://github.com/aaronspring>`_.
 
 Bug fixes
@@ -814,7 +817,7 @@ Bug fixes
 Documentation
 ~~~~~~~~~~~~~
 
-- Created a `PR checklist <https://xarray.pydata.org/en/stable/contributing.html/contributing.html#pr-checklist>`_
+- Created a `PR checklist <https://xarray.pydata.org/en/stable/contributing.html/contributing.html#pr-checklist>`_
   as a quick reference for tasks before creating a new PR
   or pushing new commits.
   By `Gregory Gundersen <https://github.com/gwgundersen>`_.

xarray/core/groupby.py

Lines changed: 3 additions & 1 deletion
@@ -370,8 +370,10 @@ def __init__(
                 group = group.dropna(group_dim)
 
             # look through group to find the unique values
+            group_as_index = safe_cast_to_index(group)
+            sort = bins is None and (not isinstance(group_as_index, pd.MultiIndex))
             unique_values, group_indices = unique_value_groups(
-                safe_cast_to_index(group), sort=(bins is None)
+                group_as_index, sort=sort
             )
             unique_coord = IndexVariable(group.name, unique_values)
 
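The effect of the new `sort` flag can be sketched in plain Python. This is not xarray's implementation: `unique_value_groups` below is a hypothetical stand-in that only mimics "find unique values and the positions where each occurs", and `choose_sort` is an illustrative helper that restates the condition added in the hunk above. The stacked labels are assumed to be `(x, y)` tuples with `y` in non-increasing order.

```python
def unique_value_groups(values, sort=True):
    """Hypothetical stand-in: return (unique values, positions of each value)."""
    positions = {}
    for i, value in enumerate(values):
        positions.setdefault(value, []).append(i)
    # With sort=False, keep first-appearance order (dicts preserve insertion order).
    keys = sorted(positions) if sort else list(positions)
    return keys, [positions[key] for key in keys]


def choose_sort(bins, is_multiindex):
    """Restates the condition from the diff: never sort a MultiIndex group."""
    return bins is None and not is_multiindex


# Stacked (x, y) labels where the y level is [3, 2], i.e. not sorted.
stacked = [(0, 3), (0, 2), (1, 3), (1, 2)]

# Old behaviour: sort whenever bins is None, which reorders the groups.
sorted_keys, sorted_groups = unique_value_groups(stacked, sort=True)

# New behaviour: a MultiIndex group keeps its original order.
kept_keys, kept_groups = unique_value_groups(
    stacked, sort=choose_sort(bins=None, is_multiindex=True)
)
```

With sorting, the unique values come back in a different order than the stacked coordinate itself, which is the kind of mismatch the commit avoids by leaving MultiIndex groups unsorted.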
xarray/tests/test_dataarray.py

Lines changed: 19 additions & 0 deletions
@@ -1202,6 +1202,25 @@ def test_selection_multiindex_from_level(self):
         expected = data.isel(xy=[0, 1]).unstack("xy").squeeze("y").drop_vars("y")
         assert_equal(actual, expected)
 
+    def test_stack_groupby_unsorted_coord(self):
+        data = [[0, 1], [2, 3]]
+        data_flat = [0, 1, 2, 3]
+        dims = ["x", "y"]
+        y_vals = [2, 3]
+
+        arr = xr.DataArray(data, dims=dims, coords={"y": y_vals})
+        actual1 = arr.stack(z=dims).groupby("z").first()
+        midx1 = pd.MultiIndex.from_product([[0, 1], [2, 3]], names=dims)
+        expected1 = xr.DataArray(data_flat, dims=["z"], coords={"z": midx1})
+        xr.testing.assert_equal(actual1, expected1)
+
+        # GH: 3287. Note that y coord values are not in sorted order.
+        arr = xr.DataArray(data, dims=dims, coords={"y": y_vals[::-1]})
+        actual2 = arr.stack(z=dims).groupby("z").first()
+        midx2 = pd.MultiIndex.from_product([[0, 1], [3, 2]], names=dims)
+        expected2 = xr.DataArray(data_flat, dims=["z"], coords={"z": midx2})
+        xr.testing.assert_equal(actual2, expected2)
+
     def test_virtual_default_coords(self):
         array = DataArray(np.zeros((5,)), dims="x")
         expected = DataArray(range(5), dims="x", name="x")
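The expectation in `test_stack_groupby_unsorted_coord` can be illustrated without xarray or pandas. The sketch below is an assumption-laden stand-in: it uses `itertools.product` in place of `pd.MultiIndex.from_product`, and takes `x_vals = [0, 1]` as the default integer positions for the `x` dimension, which has no explicit coordinate in the test.

```python
from itertools import product

data = [[0, 1], [2, 3]]
x_vals = [0, 1]  # assumed default positions for "x" (no coord given in the test)
y_vals = [3, 2]  # reversed, so not in sorted order

# Stacking x and y in row-major order pairs each value with an (x, y) label,
# following the level values in the order they were given.
z_labels = list(product(x_vals, y_vals))
data_flat = [value for row in data for value in row]
stacked = dict(zip(z_labels, data_flat))

# Sorting the labels alone would break their alignment with data_flat.
labels_if_sorted = sorted(z_labels)
```

Here `z_labels` plays the role of `midx2` in the test: the labels stay in the order `(0, 3), (0, 2), (1, 3), (1, 2)` so that they remain aligned with `data_flat`.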
