Creating a DataTree should not modify parent and children in-place #9196

Closed
@shoyer

What is your issue?

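Constructing a DataTree with parent or children arguments returns a new node and also modifies those arguments in place. This violates the well-established Python convention that functions either return a value or modify their arguments in place, never both.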
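For reference, the built-in list API is one place where this convention is easy to see. The snippet below is only an illustration using standard Python, not anything DataTree-specific: `sorted()` returns a new list and leaves its argument alone, while `list.sort()` mutates in place and returns `None`.

```python
# Illustration of the convention with built-ins only.
values = [3, 1, 2]

# sorted() returns a new list and leaves its argument untouched.
ordered = sorted(values)
print(ordered)  # [1, 2, 3]
print(values)   # [3, 1, 2]

# list.sort() mutates the list in place and returns None.
result = values.sort()
print(result)   # None
print(values)   # [1, 2, 3]
```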

Examples:

  • Modifying a parent:
from xarray.core.datatree import DataTree

root = DataTree()
child = DataTree(name='child', parent=root)
print(root)
# <xarray.DataTree>
# Group: /
# └── Group: /child
  • Modifying children:
from xarray.core.datatree import DataTree

child = DataTree()
root = DataTree(children={'child': child})
print(child)
# <xarray.DataTree 'child'>
# Group: /child

This is particularly surprising if a DataTree argument is reused, e.g.,

from xarray.core.datatree import DataTree

child = DataTree()
root = DataTree(children={'child': child})
root2 = DataTree(children={'child2': child})
print(child)  # attached to root2
# <xarray.DataTree 'child2'>
# Group: /child2
print(root)  # now empty!
# <xarray.DataTree>
# Group: /

Here's my suggestion:

  1. The DataTree constructor should make shallow copies of its DataTree arguments.
  2. We should consider getting rid of the parent constructor argument. It is redundant with children, and unlike children, parent nodes don't show up in the DataTree repr, so it's more surprising to see them copied.
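To make the first suggestion concrete, here is a rough sketch of the copy-on-construct behavior using a toy `Node` class. `Node` and its `_copy` helper are hypothetical stand-ins written for this example, not xarray's implementation. The sketch assumes that "shallow" means new node objects are created for the subtree while any data they hold would be shared.

```python
from typing import Dict, Optional


class Node:
    """Hypothetical stand-in for DataTree; not xarray's implementation."""

    def __init__(
        self,
        children: Optional[Dict[str, "Node"]] = None,
        name: Optional[str] = None,
    ) -> None:
        self.name = name
        self.parent: Optional["Node"] = None
        self.children: Dict[str, "Node"] = {}
        for key, child in (children or {}).items():
            # Attach a copy instead of re-parenting the caller's object.
            attached = child._copy()
            attached.name = key
            attached.parent = self
            self.children[key] = attached

    def _copy(self) -> "Node":
        # The constructor copies each child, so this rebuilds the subtree
        # out of new node objects without touching the originals.
        return Node(children=self.children, name=self.name)


child = Node()
root = Node(children={"child": child})
root2 = Node(children={"child2": child})

# The argument is left alone, and both roots keep their own child.
print(child.name, child.parent)  # None None
print(list(root.children))       # ['child']
print(list(root2.children))      # ['child2']
```

With this behavior, reusing `child` in two trees no longer empties the first tree, at the cost of `root.children["child"]` being a different object from the `child` that was passed in.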

CC @TomNicholas

    Labels

    API design, topic-DataTree (Related to the implementation of a DataTree class)
