Make subclassing easier? #3980
Comments
Thanks @TomNicholas for opening up this issue. Just cross referencing one more issue: #1097
There is also #3582 where the recommendation was to wrap instead of subclassing.
Still relevant
Does xarray have anything like NumPy's dispatch mechanism? This would make it relatively easy to encapsulate domain-specific logic inside a wrapper class, rather than a subclass or an accessor namespace.
Yes, xarray fully supports this.
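For reference, the NumPy dispatch mechanism asked about above looks roughly like this. This is only a sketch: the wrapper class `Tagged` and its `tag` attribute are made up for illustration, and the only real NumPy API used is the `__array_ufunc__` protocol. It ignores `out=` and ufunc methods such as `reduce`.

```python
import numpy as np


class Tagged:
    """Hypothetical wrapper that carries a tag alongside an array."""

    def __init__(self, data, tag):
        self.data = np.asarray(data)
        self.tag = tag

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only plain calls such as np.sqrt(x) are handled in this sketch.
        if method != "__call__":
            return NotImplemented
        # Unwrap any Tagged inputs, apply the ufunc, then re-wrap the result.
        raw = [x.data if isinstance(x, Tagged) else x for x in inputs]
        return Tagged(ufunc(*raw, **kwargs), self.tag)


root = np.sqrt(Tagged([4.0, 9.0], "m"))
print(type(root).__name__, root.tag, root.data)
```

The domain-specific state lives in the wrapper and survives the NumPy call without any subclassing of `ndarray`.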
I think we have made good progress to better support this, only requirement is that the Maybe in the future we can relax this as well and use a shortcut internally. The only thing which probably won't work anytime soon is custom Datasets returning custom DataArrays.
@pydata/xarray should this feature be added to our development roadmap? It's arguably another approach to making more flexible data structures...
Following this very long discussion on propagating grid information with Xarray objects, this group wants to subclass and attach unstructured grid information to their derived classes. I invited them to the meeting tomorrow to discuss a public subclassing interface as proposed here. Come with opinions!
I will not be able to join tomorrow, so here are my thoughts on this topic:
Suggestion
We relatively regularly have users asking about subclassing `DataArray` and `Dataset`, and I know of at least a few cases where people have gone through with it. However, we currently explicitly discourage doing this, on the basis that basically all operations will return a bare xarray object instead of the subclassed version, it's full of trip hazards, and we have the accessor interface to point people to instead.
However, while useful, the accessors aren't enough for some users, and I think we could probably do better. If we refactored internally we might be able to make it much easier to subclass.
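For anyone who hasn't used the accessor interface mentioned above, here is a minimal sketch. The accessor name `demo` and the method `anomaly` are invented for illustration; `xr.register_dataarray_accessor` is the actual xarray decorator.

```python
import numpy as np
import xarray as xr


@xr.register_dataarray_accessor("demo")  # "demo" is a made-up namespace
class DemoAccessor:
    def __init__(self, xarray_obj):
        # xarray passes in the DataArray the accessor is attached to.
        self._obj = xarray_obj

    def anomaly(self, dim):
        """Return the data minus its mean along `dim`."""
        return self._obj - self._obj.mean(dim)


da = xr.DataArray(np.array([1.0, 2.0, 3.0]), dims="x")
anom = da.demo.anomaly("x")
print(anom.values)
```

The custom logic sits in the `da.demo` namespace while every object stays a plain `DataArray`. That avoids the return-type problem, but it also means extra state cannot be attached to the object itself.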
Example to follow in Pandas
Pandas takes an interesting approach: while they also explicitly discourage subclassing, they still try to make it easier, and show you what you need to do in order for it to work.
They ask you to override some constructor properties with your own, and allow you to define your own original properties.
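A minimal sketch of that pandas pattern, for comparison. The class names `UnitSeries` and `UnitFrame` and the `units` property are hypothetical; `_constructor`, `_constructor_sliced`, `_constructor_expanddim` and `_metadata` are the hooks described in the pandas subclassing docs.

```python
import pandas as pd


class UnitSeries(pd.Series):
    _metadata = ["units"]  # custom property registered with pandas

    @property
    def _constructor(self):
        # Used when an operation returns an object of the same dimension.
        return UnitSeries

    @property
    def _constructor_expanddim(self):
        # Used when a Series grows a dimension, e.g. to_frame().
        return UnitFrame


class UnitFrame(pd.DataFrame):
    _metadata = ["units"]

    @property
    def _constructor(self):
        return UnitFrame

    @property
    def _constructor_sliced(self):
        # Used when a DataFrame loses a dimension, e.g. selecting a column.
        return UnitSeries


frame = UnitFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
frame.units = "m"

filtered = frame[frame["a"] > 1]  # same dimension, goes through _constructor
column = frame["a"]               # one dimension fewer, _constructor_sliced
print(type(filtered).__name__, type(column).__name__)
```

An xarray equivalent would presumably need something like the `Dataset` to `DataArray` direction covered too, which is the "custom Datasets returning custom DataArrays" case raised in the comments.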
Potential complications
- `.construct_dataarray` and `DataArray.__init__` are used a lot internally to reconstruct a DataArray from `dims`, `coords`, `data` etc. before returning the result of a method call. We would probably need to standardise this before allowing users to override it.
- Pandas actually has multiple constructor properties you need to override: `_constructor`, `_constructor_sliced`, and `_constructor_expanddim`. What's the minimum set of similar constructors we would need?
- Blocking access to attributes - we currently stop people from adding their own attributes quite aggressively, so that we can have attributes as an alias for variables and attrs. We would need to either relax this or, better, allow users to set a list of their own `_properties` which they want to register, similar to pandas.
- `__slots__` - I think something funky can happen if you inherit from a class that defines `__slots__`?
Documentation
I think if we do this we should also slightly refactor the relevant docs to make clear the distinction between 3 groups of people:
@max-sixty you had some ideas about what would need to be done for this to work?
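On the `__slots__` question under "Potential complications", one thing that can go wrong is visible in plain Python. This is only a guess at what "something funky" refers to, and the classes below are toy stand-ins, not xarray code.

```python
class Base:
    __slots__ = ("data",)

    def __init__(self, data):
        self.data = data


class Loose(Base):
    # No __slots__ here, so instances silently get a __dict__ again.
    pass


class Strict(Base):
    # Declaring __slots__ keeps instances dict-free and lists the extras.
    __slots__ = ("units",)


loose = Loose(1)
loose.anything = "allowed"  # works: Loose instances have a __dict__

strict = Strict(1)
strict.units = "m"          # works: declared in __slots__
try:
    strict.other = 1        # not declared anywhere, so this fails
    blocked = False
except AttributeError:
    blocked = True

print(hasattr(loose, "__dict__"), hasattr(strict, "__dict__"), blocked)
```

So a subclass that forgets `__slots__` loses both the memory saving and the protection against stray attributes. A subclass that does declare it has to list every new attribute up front, which ties in with the attribute-blocking point above.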