Add dataset methods at class definition time rather than object instantiation time? #18
Description
Currently I'm adding the xarray.Dataset methods to DataTree via a pattern basically like this:
_DATASET_API_TO_COPY = ['isel', '__add__', ...]

class DatasetAPIMixin:
    def _add_dataset_api(self):
        for method_name in _DATASET_API_TO_COPY:
            ds_method = getattr(xarray.Dataset, method_name)
            # Decorate method so that when called it acts over whole subtree
            mapped_method = map_over_subtree(ds_method)
            setattr(self, method_name, mapped_method)

class DataTree(DatasetAPIMixin):
    def __init__(self, *args):
        self._add_dataset_api()
The idea was that the use of Mixins would echo how these methods were defined on xarray.Dataset originally, and also keep a distinction between methods that are actually unique to DataTree objects (such as .groups), and methods that are merely copied over from xarray.Dataset like .isel (albeit with modifications such as mapping over child nodes).
I like my Mixin idea, but one weird thing about this pattern is that the Dataset methods are only added to the DataTree once a dt instance is created, not when the DataTree class is defined. I don't know if this is likely to cause problems, but at the very least it seems inefficient, because we are running the code to loop through and attach all these methods every single time we create a new DataTree object. It's also not really an example of class inheritance right now - the mixins aren't actually doing anything other than being a different place for me to put the definition of _add_dataset_api().
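On the "is this likely to cause problems" question, one thing that may be worth checking is how Python treats functions stored on an instance rather than on a class. The sketch below uses only toy classes (PerInstance, PerClass, double and add are hypothetical names, not part of xarray or DataTree) and shows plain Python behaviour: a function set on an instance is not bound, so it does not receive the instance as its first argument, and operators such as + look up special methods like __add__ on the type, not the instance.

```python
def double(node):
    return node.value * 2


def add(node, other):
    return node.value + other


class PerInstance:
    """Attaches the functions at object instantiation time."""

    def __init__(self, value):
        self.value = value
        self.double = double   # stored as a plain function, not a bound method
        self.__add__ = add     # stored on the instance, so `+` never sees it


class PerClass:
    """Receives the functions once, on the class itself."""

    def __init__(self, value):
        self.value = value


PerClass.double = double
PerClass.__add__ = add


def raises_type_error(callable_):
    """Return True if calling callable_ raises TypeError."""
    try:
        callable_()
    except TypeError:
        return True
    return False


# Instance-level attachment: the instance is not passed in, and `+` fails.
instance_call_fails = raises_type_error(lambda: PerInstance(3).double())
instance_add_fails = raises_type_error(lambda: PerInstance(3) + 1)

# Class-level attachment: both behave like ordinary methods.
class_call_result = PerClass(3).double()
class_add_result = PerClass(3) + 1
```

Whether this bites in practice depends on what the real map_over_subtree returns, so treat it as something to verify rather than a diagnosis.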
It would be better if the dataset methods were actually added at class definition time rather than object instantiation time, and ideally fully defined on the mixin before it is inherited. Then we wouldn't need to call any _add_dataset_api() method on the dt instance because the methods would already be there.
The only way I can think of to actually do this within the class definitions is using a metaclass.
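A minimal sketch of what the metaclass route might look like. To keep it runnable without xarray, Dataset here is a toy stand-in for xarray.Dataset, scale is a made-up method, and map_over_subtree is a simplified stand-in for the real decorator; DatasetAPIMeta is a hypothetical name. The point is only that the loop runs once, when the class body is executed.

```python
import functools


class Dataset:
    """Toy stand-in for xarray.Dataset (assumption, for illustration only)."""

    def __init__(self, value):
        self.value = value

    def scale(self, factor):
        return Dataset(self.value * factor)


_DATASET_API_TO_COPY = ["scale"]


def map_over_subtree(func):
    """Simplified stand-in: apply func to a node's dataset and to every child."""

    @functools.wraps(func)
    def _mapped(tree, *args, **kwargs):
        children = [_mapped(child, *args, **kwargs) for child in tree.children]
        return type(tree)(func(tree.ds, *args, **kwargs), children)

    return _mapped


class DatasetAPIMeta(type):
    """Hypothetical metaclass that injects the mapped methods into the class."""

    def __new__(mcls, name, bases, namespace, **kwargs):
        for method_name in _DATASET_API_TO_COPY:
            # Do not clobber a method the class body defined explicitly.
            if method_name not in namespace:
                ds_method = getattr(Dataset, method_name)
                namespace[method_name] = map_over_subtree(ds_method)
        return super().__new__(mcls, name, bases, namespace, **kwargs)


class DatasetAPIMixin(metaclass=DatasetAPIMeta):
    pass


class DataTree(DatasetAPIMixin):
    def __init__(self, ds, children=()):
        self.ds = ds
        self.children = list(children)


tree = DataTree(Dataset(1), [DataTree(Dataset(2))])
scaled = tree.scale(10)
```

One trade-off of this design is that the metaclass is inherited, so every subclass of the mixin gets its own copy of the mapped methods, which may or may not be desirable.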
I could also possibly set the attributes outside of the mixin definition but before the definition of DataTree like this:
class DatasetAPIMixin:
    pass

for method_name in _DATASET_API_TO_COPY:
    ds_method = getattr(xarray.Dataset, method_name)
    # Decorate method so that when called it acts over whole subtree
    mapped_method = map_over_subtree(ds_method)
    setattr(DatasetAPIMixin, method_name, mapped_method)

class DataTree(DatasetAPIMixin):
    ...
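The module-level loop above can be tried out end to end with toy stand-ins. In this sketch Dataset is a simplified substitute for xarray.Dataset (its isel and __add__ are deliberately trivial), and map_over_subtree is an assumed, simplified version of the real decorator. It is meant to show that the methods exist on the class before any instance is created, and that a special method such as __add__ works once it lives on the class.

```python
import functools


class Dataset:
    """Toy stand-in for xarray.Dataset (assumption, for illustration only)."""

    def __init__(self, value):
        self.value = value

    def isel(self, index):
        return Dataset(self.value[index])

    def __add__(self, other):
        return Dataset([v + other for v in self.value])


_DATASET_API_TO_COPY = ["isel", "__add__"]


def map_over_subtree(func):
    """Simplified stand-in: apply func to a node's dataset and to every child."""

    @functools.wraps(func)
    def _mapped(tree, *args, **kwargs):
        children = [_mapped(child, *args, **kwargs) for child in tree.children]
        return DataTree(func(tree.ds, *args, **kwargs), children)

    return _mapped


class DatasetAPIMixin:
    pass


# Runs once at import time, before DataTree is defined.
for method_name in _DATASET_API_TO_COPY:
    ds_method = getattr(Dataset, method_name)
    setattr(DatasetAPIMixin, method_name, map_over_subtree(ds_method))


class DataTree(DatasetAPIMixin):
    def __init__(self, ds, children=()):
        self.ds = ds
        self.children = list(children)


# The methods are available on the class without creating any instance.
has_isel_on_class = hasattr(DataTree, "isel")

tree = DataTree(Dataset([1, 2]), [DataTree(Dataset([3, 4]))])
shifted = tree + 10
first = tree.isel(0)
```

Here the mixin carries the full API before it is inherited, and DataTree.__init__ no longer needs to call anything to attach it.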