Closed
Description
What is your issue?
The in-place modification issue described in #9196 was fixed for DataTree
, but actually still exists for its private base class TreeNode
.
In [1]: from xarray.core.treenode import TreeNode
...:
...: child = TreeNode()
...: root = TreeNode(children={'child': child})
...: print(child.parent)
TreeNode(children={'child': TreeNode(children={})})
(The issue here being that the constructor has returned a new object root
but also modified the object child
in-place.)
We do test TreeNode
directly (in test_treenode.py
), which I think is useful as it internally separates tree structure operations from data manipulation operations. But if you copy the new test added in #9196 to treenode.py
it will currently fail.
Trying to fix this reveals a rabbit hole of ways in which the DataTree
/TreeNode
relationship should be refactored:
DataTree.__init__
should callTreeNode.__init__
to assignchildren
,TreeNode.__init__
should.copy()
children (i.e. move the solution to Creating a DataTree should not modify parent and children in-place #9196 down intoTreeNode
),- This requires
.copy()
to be defined onTreeNode
rather than onDataTree
, with only._copy_node
overridden to also copy.data
, - That requires
._copy_subtree()
to use only methods available to theTreeNode
class, to iterate over the subtree efficiently, - That might require using some methods that are currently only defined on the
NamedNode
class, particularly.relative_to
(not totally sure about that yet), .relative_to
requires knowing the node.name
, which implies perhaps we should mergeTreeNode
andNamedNode
(which was suggested previously by @shoyer anyway).