Ignore missing dims when mapping over tree #67
Description
This tree has a dimension present in some nodes and not others (the "people" dimension).
```
DataTree('root', parent=None)
│   Dimensions:  (people: 2)
│   Coordinates:
│     * people   (people) <U5 'alice' 'bob'
│       species  <U5 'human'
│   Data variables:
│       heights  (people) float64 1.57 1.82
└── DataTree('simulation')
    ├── DataTree('coarse')
    │       Dimensions:  (x: 2, y: 3)
    │       Coordinates:
    │         * x        (x) int64 10 20
    │       Dimensions without coordinates: y
    │       Data variables:
    │           foo      (x, y) float64 0.1242 -0.2324 0.2469 0.5168 0.8391 0.8686
    │           bar      (x) int64 1 2
    │           baz      float64 3.142
    └── DataTree('fine')
            Dimensions:  (x: 6, y: 3)
            Coordinates:
              * x        (x) int64 10 12 14 16 18 20
            Dimensions without coordinates: y
            Data variables:
                foo      (x, y) float64 0.1242 -0.2324 0.2469 ... 0.5168 0.8391 0.8686
                bar      (x) float64 1.0 1.2 1.4 1.6 1.8 2.0
                baz      float64 3.142
```
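For reference, here is one way such a tree could be constructed. This is just a minimal sketch, assuming a datatree version that provides `DataTree.from_dict`; the random `foo` values are placeholders rather than the exact numbers shown above.

```python
import numpy as np
import xarray as xr
from datatree import DataTree

# Root dataset: the only node that has the "people" dimension.
root = xr.Dataset(
    {"heights": ("people", [1.57, 1.82])},
    coords={"people": ["alice", "bob"], "species": "human"},
)

# Child datasets: dimensions (x, y), no "people" dimension anywhere.
coarse = xr.Dataset(
    {
        "foo": (("x", "y"), np.random.randn(2, 3)),
        "bar": ("x", [1, 2]),
        "baz": ((), 3.142),
    },
    coords={"x": [10, 20]},
)
fine = xr.Dataset(
    {
        "foo": (("x", "y"), np.random.randn(6, 3)),
        "bar": ("x", np.linspace(1.0, 2.0, 6)),
        "baz": ((), 3.142),
    },
    coords={"x": [10, 12, 14, 16, 18, 20]},
)

# Assemble the tree from a mapping of paths to datasets.
dt = DataTree.from_dict(
    {"/": root, "/simulation/coarse": coarse, "/simulation/fine": fine}
)
```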
If a user calls `dt.mean(dim='people')`, then at the moment this will raise an error. That's because it maps the `.mean` call over each group, and when it gets to either the `'coarse'` group or the `'fine'` group it will not find a dimension called `'people'`.
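To illustrate where the error comes from (the exact exception and message depend on the xarray version), reducing a plain `Dataset` over a dimension it doesn't have already raises:

```python
import xarray as xr

coarse = xr.Dataset({"bar": ("x", [1, 2])}, coords={"x": [10, 20]})

# This is effectively what happens at each non-root node when .mean is
# mapped over the tree: 'people' is not one of this dataset's dims.
coarse.mean(dim="people")  # raises ValueError in current xarray
```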
However, the user might want to take the mean only over the groups where this makes sense, and leave the rest alone.
I think the best solution is to have a `missing_dims` argument, like xarray's `.isel` already has. Then the user could do `dt.mean(dim='people', missing_dims='ignore')`.
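As a sketch of the intended behaviour, and of a workaround that should be possible today, assuming datatree's `map_over_subtree` decorator (the `mean_over_people` helper below is hypothetical, just for illustration):

```python
from datatree import map_over_subtree

@map_over_subtree
def mean_over_people(ds):
    # Leave nodes that don't have the "people" dimension untouched.
    if "people" not in ds.dims:
        return ds
    return ds.mean(dim="people")

result = mean_over_people(dt)

# Proposed API, not yet implemented:
# dt.mean(dim="people", missing_dims="ignore")
```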
Actually implementing this would, I think, only require changes in xarray, not here, because those changes should propagate down to datatree. See pydata/xarray#5030.
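Purely as an illustration of the kind of change xarray's reduce machinery would need (the `resolve_reduce_dims` helper below is hypothetical, not an existing xarray function), the core logic would mirror what `.isel` already does for `missing_dims`:

```python
import warnings

def resolve_reduce_dims(dataset_dims, dim, missing_dims="raise"):
    """Return only the requested dims that exist in dataset_dims,
    handling absent ones according to missing_dims ('raise', 'warn', 'ignore')."""
    requested = [dim] if isinstance(dim, str) else list(dim)
    missing = [d for d in requested if d not in dataset_dims]
    if missing:
        if missing_dims == "raise":
            raise ValueError(f"Dataset does not contain the dimensions: {missing}")
        elif missing_dims == "warn":
            warnings.warn(f"Dimensions {missing} do not exist and will be ignored")
    return [d for d in requested if d in dataset_dims]
```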