Description
What happened?
Currently, concatenation automatically creates indexes for any dimension coordinates in the output, even if the inputs had no indexes.
What did you expect to happen?
Indexes not to be created for variables which did not already have them.
Minimal Complete Verifiable Example
import numpy as np
from xarray import Coordinates, DataArray, concat

# TODO once passing indexes={} directly to DataArray constructor is allowed then no need to create coords object separately first
coords = Coordinates({"x": np.array([1, 2, 3])}, indexes={})
arrays = [
    DataArray(
        np.zeros((3, 3)),
        dims=["x", "y"],
        coords=coords,
    )
    for _ in range(2)
]

# concatenate along an existing dimension
combined = concat(arrays, dim="x")
assert combined.shape == (6, 3)
assert combined.dims == ("x", "y")
# should not have auto-created any indexes
assert combined.indexes == {}  # this fails

# concatenate along a new dimension
combined = concat(arrays, dim="z")
assert combined.shape == (2, 3, 3)
assert combined.dims == ("z", "x", "y")
# should not have auto-created any indexes
assert combined.indexes == {}  # this also fails
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
# nor have auto-created any indexes
> assert combined.indexes == {}
E AssertionError: assert Indexes:\n x Index([1, 2, 3, 1, 2, 3], dtype='int64', name='x') == {}
E Full diff:
E - {
E - ,
E - }
E + Indexes:
E + x Index([1, 2, 3, 1, 2, 3], dtype='int64', name='x',
E + )
Anything else we need to know?
The culprit is the call to core.indexes.create_default_index_implicit inside merge.py. If I comment out this call my concat test passes, but basic tests in test_merge.py start failing.
I would like to know how to avoid the internal call to create_default_index_implicit. I tried passing compat='override' but that made no difference, so I think we would have to change merge.collect_variables_and_indexes somehow.
Conceptually, I would have thought we should be examining what indexes exist on the objects to be concatenated, and not creating new indexes for any variable that doesn't already have one. Presumably we should therefore be making use of the indexes argument to merge.collect_variables_and_indexes, but currently that just seems to be empty.
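To make the proposed rule concrete, here is a minimal pure-Python sketch of "only keep indexes that already exist on the inputs". It is a toy model, not xarray code: ToyArray, indexes_to_keep and toy_concat are hypothetical names invented for illustration, and the choice to keep an index only when every input has it is an assumption rather than a settled design.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyArray:
    """Hypothetical stand-in for a DataArray (not an xarray class)."""

    coords: dict[str, list]  # coordinate name -> values
    indexed: set[str] = field(default_factory=set)  # names that carry an index


def indexes_to_keep(objs: list[ToyArray]) -> set[str]:
    """Return the names that were already indexed on every input object."""
    if not objs:
        return set()
    keep = set(objs[0].indexed)
    for obj in objs[1:]:
        keep &= obj.indexed
    return keep


def toy_concat(objs: list[ToyArray], dim: str) -> ToyArray:
    """Concatenate coordinate values along `dim` without creating new indexes."""
    coords = {}
    for name, values in objs[0].coords.items():
        if name == dim:
            # Existing dimension coordinate: join the values end to end.
            coords[name] = [v for obj in objs for v in obj.coords[name]]
        else:
            # Other coordinates are copied from the first object unchanged.
            coords[name] = list(values)
    # Assumption: an index survives only if all inputs already had it.
    return ToyArray(coords, indexes_to_keep(objs))


unindexed = ToyArray({"x": [1, 2, 3]})
indexed = ToyArray({"x": [1, 2, 3]}, {"x"})
print(toy_concat([unindexed, unindexed], "x").indexed)
print(toy_concat([indexed, indexed], "x").indexed)
```

In this model, concatenating the unindexed inputs leaves the result without an index on x, while inputs that already carry an index on x keep it. That is the behaviour the example above expects from concat.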
Environment
I've been experimenting with running this test on a branch that includes both #8711 and #8714, but this example fails in the same way on main.