datatree gets dis-aligned in binary op #10013

Open
@mathause

What happened?

Subtracting a dataset from a datatree can raise ValueError: group '/a' is not aligned with its parents, even though the parent dt is valid and the resulting nodes are only siblings.

What did you expect to happen?

I expected this to work. (I don't yet understand the design decision that requires nodes to be aligned with their parents. Combined with the fact that nodes cannot be empty, I think this leads to problems.)

Minimal Complete Verifiable Example

import xarray as xr

dt = xr.DataTree()

a = xr.Dataset(data_vars={"x": [10, 20]}, coords={"time": [0, 1]})
b = xr.Dataset(data_vars={"x": [11, 22, 33]}, coords={"time": [0, 1, 2]})

dt["a"] = a
dt["b"] = b

dt - b  # fails with ValueError: group '/a' is not aligned with its parents

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/datatree.py:149, in check_alignment(path, node_ds, parent_ds, children)
    148 try:
--> 149     align(node_ds, parent_ds, join="exact", copy=False)
    150 except ValueError as e:

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/alignment.py:883, in align(join, copy, indexes, exclude, fill_value, *objects)
    875 aligner = Aligner(
    876     objects,
    877     join=join,
   (...)
    881     fill_value=fill_value,
    882 )
--> 883 aligner.align()
    884 return aligner.results

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/alignment.py:575, in Aligner.align(self)
    574 self.assert_no_index_conflict()
--> 575 self.align_indexes()
    576 self.assert_unindexed_dim_sizes_equal()

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/alignment.py:422, in Aligner.align_indexes(self)
    421 if self.join == "exact":
--> 422     raise ValueError(
    423         "cannot align objects with join='exact' where "
    424         "index/labels/sizes are not equal along "
    425         "these coordinates (dimensions): "
    426         + ", ".join(f"{name!r} {dims!r}" for name, dims in key[0])
    427     )
    428 joiner = self._get_index_joiner(index_cls)

ValueError: cannot align objects with join='exact' where index/labels/sizes are not equal along these coordinates (dimensions): 'x' ('x',)

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[18], line 11
      8 dt["a"] = a
      9 dt["b"] = b
---> 11 dt - b # fails

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/_typed_ops.py:40, in DataTreeOpsMixin.__sub__(self, other)
     39 def __sub__(self, other: DtCompatible) -> Self:
---> 40     return self._binary_op(other, operator.sub)

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/datatree.py:1548, in DataTree._binary_op(self, other, f, reflexive, join)
   1540     return NotImplemented
   1542 ds_binop = functools.partial(
   1543     Dataset._binary_op,
   1544     f=f,
   1545     reflexive=reflexive,
   1546     join=join,
   1547 )
-> 1548 return map_over_datasets(ds_binop, self, other)

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/datatree_mapping.py:111, in map_over_datasets(func, *args)
    108 if num_return_values is None:
    109     # one return value
    110     out_data = cast(Mapping[str, Dataset | None], out_data_objects)
--> 111     return DataTree.from_dict(out_data, name=name)
    113 # multiple return values
    114 out_data_tuples = cast(Mapping[str, tuple[Dataset | None, ...]], out_data_objects)

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/datatree.py:1198, in DataTree.from_dict(cls, d, name)
   1196         else:
   1197             raise TypeError(f"invalid values: {data}")
-> 1198         obj._set_item(
   1199             path,
   1200             new_node,
   1201             allow_overwrite=False,
   1202             new_nodes_along_path=True,
   1203         )
   1205 # TODO: figure out why mypy is raising an error here, likely something
   1206 # to do with the return type of Dataset.copy()
   1207 return obj

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/treenode.py:652, in TreeNode._set_item(self, path, item, new_nodes_along_path, allow_overwrite)
    650         raise KeyError(f"Already a node object at path {path}")
    651 else:
--> 652     current_node._set(name, item)

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/datatree.py:944, in DataTree._set(self, key, val)
    942     new_node = val.copy(deep=False)
    943     new_node.name = key
--> 944     new_node._set_parent(new_parent=self, child_name=key)
    945 else:
    946     if not isinstance(val, DataArray | Variable):
    947         # accommodate other types that can be coerced into Variables

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/treenode.py:115, in TreeNode._set_parent(self, new_parent, child_name)
    113 self._check_loop(new_parent)
    114 self._detach(old_parent)
--> 115 self._attach(new_parent, child_name)

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/treenode.py:152, in TreeNode._attach(self, parent, child_name)
    147 if child_name is None:
    148     raise ValueError(
    149         "To directly set parent, child needs a name, but child is unnamed"
    150     )
--> 152 self._pre_attach(parent, child_name)
    153 parentchildren = parent._children
    154 assert not any(
    155     child is self for child in parentchildren
    156 ), "Tree is corrupt."

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/datatree.py:528, in DataTree._pre_attach(self, parent, name)
    526 node_ds = self.to_dataset(inherit=False)
    527 parent_ds = parent._to_dataset_view(rebuild_dims=False, inherit=True)
--> 528 check_alignment(path, node_ds, parent_ds, self.children)
    529 _deduplicate_inherited_coordinates(self, parent)

File ~/.conda/envs/mesmer-tests/lib/python3.13/site-packages/xarray/core/datatree.py:153, in check_alignment(path, node_ds, parent_ds, children)
    151         node_repr = _indented(_without_header(repr(node_ds)))
    152         parent_repr = _indented(dims_and_coords_repr(parent_ds))
--> 153         raise ValueError(
    154             f"group {path!r} is not aligned with its parents:\n"
    155             f"Group:\n{node_repr}\nFrom parents:\n{parent_repr}"
    156         ) from e
    158 if children:
    159     if parent_ds is not None:

ValueError: group '/a' is not aligned with its parents:
Group:
    Dimensions:  (x: 0, time: 2)
    Coordinates:
      * x        (x) int64 0B 
      * time     (time) int64 16B 0 1
    Data variables:
        *empty*
From parents:
    Dimensions:  (x: 3, time: 3)
    Coordinates:
      * x        (x) int64 24B 11 22 33
      * time     (time) int64 24B 0 1 2
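The two reprs at the end of the traceback can be reproduced with plain Dataset arithmetic, which may help to see where the mismatch comes from. This is a minimal sketch, assuming the binary op is applied to the empty root dataset in the same way as to every other node (which is what the "From parents" repr suggests). The names root and child are introduced here for illustration only.

```python
import xarray as xr

a = xr.Dataset(data_vars={"x": [10, 20]}, coords={"time": [0, 1]})
b = xr.Dataset(data_vars={"x": [11, 22, 33]}, coords={"time": [0, 1, 2]})

# A 1-D "x" named after its own dimension is stored as an index
# coordinate, not as a data variable.
print(list(a.coords), list(a.data_vars))

# Per-node results, computed outside of any tree.
root = xr.Dataset() - b  # empty root picks up b's coordinates
child = a - b            # inner join on "x" and "time"

print(dict(root.sizes))   # traceback reports x: 3, time: 3
print(dict(child.sizes))  # traceback reports x: 0, time: 2

# The tree applies an exact-alignment check between a node and its
# parents (see check_alignment in the traceback); it fails for this pair.
try:
    xr.align(child, root, join="exact", copy=False)
    exact_alignment_ok = True
except ValueError as err:
    exact_alignment_ok = False
    print(err)
```

Under these assumptions, the root of the result inherits the indexes of b, while the result for "/a" is the inner join of a and b, so the two can no longer be aligned exactly even though the input tree was valid.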

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.13.1 | packaged by conda-forge | (main, Jan 13 2025, 09:53:10) [GCC 13.3.0]
python-bits: 64
OS: Linux
OS-release: 6.8.0-52-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.4
libnetcdf: 4.9.2

xarray: 2025.1.2.dev26+gd7ac79a3
pandas: 3.0.0.dev0+1875.gc36da3f6de
numpy: 2.3.0.dev0+git20250128.a1fa8e1
scipy: 1.16.0.dev0+git20250128.1cbdb24
netCDF4: 1.7.2
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: 1.6.4
nc_time_axis: 1.3.1.dev218
iris: None
bottleneck: None
dask: 2025.1.0+9.gc08f8e50
distributed: 2025.1.0
matplotlib: 3.11.0.dev400+g71f5cf3a07
cartopy: 0.24.0
seaborn: None
numbagg: None
fsspec: 2024.12.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 75.8.0
pip: 25.0
conda: None
pytest: 8.3.4
mypy: None
IPython: 8.31.0
sphinx: None

Labels: bug, topic-DataTree