Coordinate promotion workaround broken #6607

Closed
@aulemahal

Description

What happened?

This one is a bit odd. I'm not sure it is a bug, but code that used to work no longer does, so it is a regression of some sort.

I have a dataset with one dimension and one coordinate along that dimension, but they have different names. I want to transform it so that the coordinate's name becomes the dimension's name, making it a proper dimension coordinate (I don't know what else to call it). After renaming the dim to the coord's name, everything looks right in the repr, but the coord still has no index for that dimension (ds.indexes is empty, see MCVE). There used to be a workaround for this through reset_coords, but it no longer works.

Instead, the last line of the MCVE downgrades the variable: the final lon no longer has any coords.

What did you expect to happen?

In the MCVE below, I show what the old "workaround" was. I expected lon.indexes to contain an index for lon at the end of the procedure.

Minimal Complete Verifiable Example

import xarray as xr

# A dataset with a 1d variable along a dimension
ds = xr.Dataset({'lon': xr.DataArray([1, 2, 3], dims=('x',))})

# Promote to coord. This still is not a proper crd-dim (different name)
ds = ds.set_coords(['lon'])

# Rename dim:
ds = ds.rename(x='lon')

# Do we have a proper coord-dim now? Not yet, because:
ds.indexes  # is empty

# Workaround that was used up to the last release
lon = ds.lon.reset_coords(drop=True)

# Because of the missing indexes, the next line fails on master
lon - lon.diff('lon')

MCVE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

My guess is that this line causes reset_coords to drop the coordinate from itself:

names = set(self.coords) - set(self._indexes)

It would be nice if the renaming was sufficient for the indexes to appear.
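
For comparison, here is a minimal sketch of a different route that is not part of the workaround above: replacing the rename with Dataset.swap_dims. It assumes swap_dims creates the index when the new dimension name matches an existing 1D variable along the old dimension. I have not checked it against every xarray version, so take it as an illustration rather than a confirmed fix.

```python
import xarray as xr

# Same starting point as the MCVE: a 1D variable along dimension "x".
ds = xr.Dataset({"lon": xr.DataArray([1, 2, 3], dims=("x",))})
ds = ds.set_coords(["lon"])

# swap_dims replaces dimension "x" with "lon" and, under the assumption
# stated above, turns "lon" into an indexed dimension coordinate.
promoted = ds.swap_dims({"x": "lon"})

# The index should now be present, so label-based alignment can work.
print(promoted.indexes)

lon = promoted["lon"]
shifted = lon - lon.diff("lon")  # aligned on the "lon" index
print(shifted)
```

If this holds, no reset_coords step is needed afterwards, since the variable and the dimension already share a name and an index.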

My example is odd, I know. The real use case is a script that receives a 2D coordinate whose rows are all identical, so we take the first row and promote it to a proper coord-dim. But the current code fails on master at the lon - lon.diff('lon') step that comes afterwards.
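
To make that use case concrete, here is a rough sketch of the 2D scenario. The variable name lon2d, the dimension names y and x, and the values are all made up for illustration, and the promotion step uses swap_dims instead of the rename and reset_coords sequence from the MCVE, so it is only one possible way to get there.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the received data: a 2D "lon2d" whose rows are identical.
lon2d = xr.DataArray(np.tile([10.0, 20.0, 30.0], (2, 1)), dims=("y", "x"))
ds = xr.Dataset({"lon2d": lon2d})

# Take the first row and confirm every row matches it before discarding the rest.
first_row = ds["lon2d"].isel(y=0)
assert bool((ds["lon2d"] == first_row).all())

# Attach the row as a 1D coordinate along "x", then make it the dimension.
ds = ds.drop_vars("lon2d").assign_coords(lon=("x", first_row.values))
ds = ds.swap_dims({"x": "lon"})

# The step that follows in the real script.
lon = ds["lon"]
result = lon - lon.diff("lon")
```

Checking that the rows really are identical before dropping the 2D variable keeps the shortcut from silently hiding a curvilinear grid.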

Environment

INSTALLED VERSIONS

commit: None
python: 3.9.12 | packaged by conda-forge | (main, Mar 24 2022, 23:22:55)
[GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 5.13.19-2-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: fr_CA.UTF-8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2022.3.1.dev104+gc34ef8a6
pandas: 1.4.2
numpy: 1.22.2
scipy: 1.8.0
netCDF4: None
pydap: installed
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.02.1
distributed: 2022.2.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.3.0
cupy: None
pint: None
sparse: 0.13.0
setuptools: 59.8.0
pip: 22.0.3
conda: None
pytest: 7.0.1
IPython: 8.3.0
sphinx: None
