-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix DataTree.from_dict
to be insensitive to insertion order
#9292
Fix DataTree.from_dict
to be insensitive to insertion order
#9292
Conversation
xarray/core/datatree.py
Outdated
@@ -1104,7 +1104,8 @@ def from_dict( | |||
|
|||
if d: | |||
# Populate tree with children determined from data_objects mapping | |||
for path, data in d.items(): | |||
# Sort keys so as to insert nodes from root first (see GH issue #9276) | |||
for path, data in sorted(d.items()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TomNicholas What if the user does want something like the following?
reversed = DataTree.from_dict(
{
"/": xr.Dataset({"age": 83}),
"/C/Chloe": xr.Dataset({"age": 10}),
"/A/Abe": xr.Dataset({"age": 39}),
}
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah hmm good question. Well this PR will succeed, but create a tree with the children in a different order:
<xarray.DataTree>
Group: /
│ Dimensions: ()
│ Data variables:
│ age int64 8B 83
├── Group: /A
│ └── Group: /A/Abe
│ Dimensions: ()
│ Data variables:
│ age int64 8B 39
└── Group: /C
└── Group: /C/Chloe
Dimensions: ()
Data variables:
age int64 8B 10
Whether or not this is okay depends on whether or not we consider the children to be ordered. Internally they are ordered, which this fix would break I guess. But do we want to guarantee preservation of that order to users? I think it should behave the same way as whether or not we guarantee preservation of the order of .variables
in Dataset
.
cc @shoyer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keewis suggested that if we do a sortby depth then we would eliminate the bug without re-ordering the nodes within a group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
basically, we sort by the number of /
characters in the (normalized) paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed now, and tested. What a neat idea @keewis !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonderful! 💯
|
||
# Check that Bart and Lisa's order is still preserved within the group, | ||
# despite 'Bart' coming before 'Lisa' when sorted alphabetically | ||
assert list(reversed["Homer"].children.keys()) == ["Lisa", "Bart"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assertion fails unless sorted
sorts by depth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to think on this, but I like the way it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This tracks, I like the depth order nice touch.
The test failure is an unrelated flaky dask test so now I have an approval I will merge. |
whats-new.rst
New functions/methods are listed inapi.rst