Better support for subclasses: tests, docs and API #1097
Comments
Agreed. Purely for information: I made a very simple subclass, and then a simple print statement didn't work: Traceback (most recent call last): The last statement basically called my subclass instead of DataArray. My subclass didn't cope with that, because it restricts data to be two-dimensional, while the library code required a dimension of 1 at that point. |
Do we have examples of such subclasses? It would be interesting to document the cases where the accessors are not good enough. |
Initially, I have tried this:
However, I have only just discovered the accessors and will have a look. |
Just to add a puzzling attempt to subclass Are we doing it wrong? Is there documentation that shows how to do something this dumb "the right way"? Would be great to have that (which is essentially a 👍 for this issue). |
I think we should be able to improve subclass support. (We use it internally at times, mainly for encapsulating logic on specific objects we inherit from @arokem if you're keen to investigate further, would be interesting to know what's happening there. It's possible it's an issue with |
Yeah: I'd be happy to dig around a bit. Where should I look? Also, @mbeyeler might be able to share a bit more about sub-classing woes. I think that he made a commendable |
For that case, you could put a breakpoint in and see what's calling it. It is bemusing For subclass support, you could see whether there are methods that return I think we're in an equilibrium where subclassing isn't supported for most operations, so it's not used, so we don't hear about its failures. A moderate push could move us out of that equilibrium! |
The biggest problem is with all the Dataset methods and accessors that return a DataArray, and vice versa. Anybody who wants to create a coupled pair of Dataset and DataArray subclasses will need to hunt down all methods and accessors that return the other class in the pair and override them. May I ask what the practical use cases for subclassing are? In several years' worth of day-to-day use of xarray, I have always found that encapsulation felt much more natural. |
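To make the coupling concrete, here is a minimal sketch with toy classes. `Table`, `Array`, `NaiveTable`, `MyTable` and `MyArray` are hypothetical stand-ins, not xarray's actual classes or internals; they only mimic a container whose item access returns a hard-coded partner class.

```python
class Array:
    """Toy stand-in for a DataArray-like class (hypothetical, not xarray)."""

    def __init__(self, values):
        self.values = list(values)


class Table:
    """Toy stand-in for a Dataset-like class (hypothetical, not xarray)."""

    def __init__(self, columns):
        self.columns = dict(columns)

    def __getitem__(self, name):
        # The partner class is hard-coded, so subclasses of Table
        # still hand back a plain Array.
        return Array(self.columns[name])


class MyArray(Array):
    """Hypothetical user subclass of the array-like class."""


class NaiveTable(Table):
    """Subclass that does not override anything."""


class MyTable(Table):
    """Subclass that is manually coupled to MyArray."""

    def __getitem__(self, name):
        # Every method that returns the partner class needs an
        # override like this one.
        base = super().__getitem__(name)
        return MyArray(base.values)


naive_result = NaiveTable({"a": [1, 2]})["a"]
paired_result = MyTable({"a": [1, 2]})["a"]

print(type(naive_result).__name__)   # Array
print(type(paired_result).__name__)  # MyArray
```

With only one such method the override is trivial; the burden described above comes from having to repeat it for every method and accessor that crosses between the two classes.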
There's also a funny, pickle-friendly hack that allows you to add methods to a Dataset without subclassing it - thanks to the
|
There's also the argument that I would love, at some point, to migrate the xarray objects to use |
Never mind |
I had thought the primary saving was memory (and fairly significant with lots of objects) |
It is one of the two savings - the other theoretically being attribute access time. I think I'll go on with a pull request for further discussion. |
This is correct - functions which convert between There's a bigger piece of work which would solve this too, at the cost of abstraction: have class attributes which define
Right, good question, and we should stop ourselves from adding every abstraction. I have one specific use case we've already found helpful: we have an object that is mostly a Dataset, with the addition of some behaviors and constructors - for example
We don't use accessors because the behaviors are specific to a class, rather than every xarray object. |
Given that people do currently subclass xarray objects, it's worth considering making a subclass API like pandas:
http://pandas.pydata.org/pandas-docs/stable/internals.html#subclassing-pandas-data-structures
At the very least, it would be nice to have docs that describe how/when it's safe to subclass, and tests that verify our support for such subclasses.
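As a rough illustration of what such a subclass API could look like, the sketch below uses a constructor hook in the spirit of the `_constructor` property described on the pandas page linked above. `Container` and `Tagged` are hypothetical toy classes; nothing here is existing xarray API, and the sketch assumes the subclass constructor takes the same arguments as the base class.

```python
class Container:
    """Toy base class (hypothetical) that builds results via a hook."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def _constructor(self):
        # Subclasses override this to control what methods return.
        return Container

    def head(self, n):
        # Library code never names a concrete class directly.
        return self._constructor(self.values[:n])


class Tagged(Container):
    """Hypothetical user subclass that opts in to type preservation."""

    @property
    def _constructor(self):
        return Tagged


base_head = Container([1, 2, 3]).head(2)
tagged_head = Tagged([1, 2, 3]).head(2)

print(type(base_head).__name__)    # Container
print(type(tagged_head).__name__)  # Tagged
```

The point of the hook is that the subclass author overrides one property instead of every method, and tests can then assert that each public method returns `type(obj)` for a subclass instance.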