More groupby indexing problems #7797

Closed
@slevang

What happened?

Something is still wrong with the groupby indexing changes introduced in 2023.04.0. Below are two typical ways of computing a seasonal anomaly; they return the same result on 2023.03.0 but differ on 2023.04.2:

import numpy as np
import xarray as xr

# monthly timeseries that should return "zero anomalies" everywhere
time = xr.date_range("2023-01-01", "2023-12-31", freq="MS")
data = np.linspace(-1, 1, 12)
x = xr.DataArray(data, coords={"time": time})
clim = xr.DataArray(data, coords={"month": np.arange(1, 13, 1)})

# seems to give the correct result if we use the full x, but not with a slice
x_slice = x.sel(time=["2023-04-01"])

# two typical ways of computing anomalies
anom_gb = x_slice.groupby("time.month") - clim
anom_sel = x_slice - clim.sel(month=x_slice.time.dt.month)

# passes on 2023.03.0, fails on 2023.04.2
# the groupby version aligns the indexes incorrectly, giving a nonzero anomaly
assert anom_sel.equals(anom_gb)
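
To see which of the two results is off, it can help to compare each against the known answer instead of against each other. The snippet below is a minimal sketch under the same setup as above: the `.sel`-based anomaly does not involve groupby at all, so it serves as the reference, and the groupby result is only printed for inspection. The helper name `max_abs` is hypothetical and introduced only for illustration.

```python
import numpy as np
import xarray as xr

# Same setup as the reproducer; dims are passed explicitly here.
time = xr.date_range("2023-01-01", "2023-12-31", freq="MS")
data = np.linspace(-1, 1, 12)
x = xr.DataArray(data, dims="time", coords={"time": time})
clim = xr.DataArray(data, dims="month", coords={"month": np.arange(1, 13)})

x_slice = x.sel(time=["2023-04-01"])


def max_abs(anom):
    """Largest absolute anomaly (hypothetical helper, for illustration only)."""
    return float(np.abs(anom).max())


# Reference: explicit month lookup, no groupby involved.
anom_sel = x_slice - clim.sel(month=x_slice.time.dt.month)

# Version-dependent path reported in this issue.
anom_gb = x_slice.groupby("time.month") - clim

print("sel-based max |anomaly|:    ", max_abs(anom_sel))
print("groupby-based max |anomaly|:", max_abs(anom_gb))
```

If the groupby line prints a nonzero value while the `.sel` line prints zero, the climatology is being subtracted from the wrong month in the groupby path.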

Related: #7759 #7766
cc @dcherian

What did you expect to happen?

No response

Minimal Complete Verifiable Example

No response

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment
