Integration with Dask (add tests; implement the Dask collection interface on Quantity) #883
The only caveat I see is that xarray wraps specifically around a dask.array, not a generic dask collection. The obvious example of another dask collection that should be treated differently is So
Though, if I'm missing something that makes a combined (This also presumes having a recent enough dask version to ensure
Today, yes. But I would expect that in the future DataFrame objects (both pandas and dask) will start defining
That's fine - if you want to bask in the light of NEP18, you'll need very recent versions of your whole numeric stack anyway.
963: Add tests and documentation with improvement of downcast type compatibility (part of #845) r=hgrecco a=jthielen

As a part of #845, this PR adds tests for downcast type compatibility with Sparse's `COO` and NumPy's `MaskedArray`, along with more careful handling of downcast types throughout the library. Also included is new documentation on array type compatibility, including the type casting hierarchy digraph by @shoyer and @crusaderky.

While this PR doesn't fully bring Pint's downcast type compatibility to a completed state, I think this gets it "good enough" for the upcoming release, and the remaining issues are fairly well defined:

- MaskedArray non-commutativity (#633 / numpy/numpy#15200)
- Dask compatibility (#883)
- Addition of CuPy tests (no issue on issue tracker yet)

Because of that, I think this can close #845, but if @hgrecco you want that kept open until the above items are resolved, let me know.

- [x] Closes #37; Closes #845
- [x] Executed ``black -t py36 . && isort -rc . && flake8`` with no errors
- [x] The change is fully covered by automated unit tests
- [x] Documented in docs/ as appropriate
- [x] Added an entry to the CHANGES file

Co-authored-by: Jon Thielen <github@jont.cc>
With #845 / #963 deferring on tests with Dask, I've updated this issue to also cover adding tests with Dask (in part, due to the holdup of dask/dask#4583). I'd be glad to continue working on it, but since I likely won't get a chance to do so until later in January, if someone else wants to take this on in the meantime, feel free! In any case, hopefully this can be included in the release after the upcoming one (so Pint 0.11)?
Sorry I have to keep putting this off, but given some more urgent projects that have come up, I do not think I will get the chance to work on this again until April. So, if 0.12 is due out in mid-to-late March, this may need to be bumped again from 0.12 to 0.13, or just un-milestoned until I or someone else is able to work on this.
Based on #878 and pydata/xarray#525, it would be helpful for interoperability between xarray, pint, and dask for pint to implement the dask collection interface for when a pint Quantity wraps a dask array. This should allow a Quantity-wrapped dask array to still behave in a dask-array-like way (i.e., as a "duck dask array"). There could also be convenience methods like `compute()`, `persist()`, and `chunk()`, following xarray's example.

Implementation of this could likely follow or come along with changes discussed in #878 and #845. Based on @hgrecco's comment (#878 (comment)), I would guess that this would also all be following a decision being made about #875 and #764 to know when this should be implemented.
Also, ping @crusaderky, since you've been working a lot with xarray, pint, and dask together, and I'd want to hear your thoughts on this.
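
To make the idea more concrete, here is a rough sketch of what delegating the dask collection interface to a wrapped magnitude might look like. This is not Pint code: `UnitWrapped`, `LazyStandIn`, and `toy_compute` are hypothetical names made up for illustration, and `LazyStandIn` only imitates a dask array so the example runs without dask installed. The `__dask_*__` method names are the ones dask documents for custom collections; `__dask_postpersist__` and `__dask_tokenize__` are left out for brevity and would presumably be needed in a real implementation.

```python
class LazyStandIn:
    """Hypothetical stand-in for a dask array: a single task yielding a list."""

    def __init__(self, values):
        self._values = list(values)

    def __dask_graph__(self):
        # A graph maps keys to tasks; here a task is (callable, *args).
        return {("x", 0): (list, self._values)}

    def __dask_keys__(self):
        return [("x", 0)]

    def __dask_postcompute__(self):
        # finalize(results, *extra_args) combines per-key results.
        return (lambda results: results[0]), ()


class UnitWrapped:
    """Hypothetical unit-carrying wrapper (an assumption, not pint.Quantity)."""

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units

    def __dask_graph__(self):
        # Returning None signals "not a dask collection" for plain magnitudes.
        graph = getattr(self.magnitude, "__dask_graph__", None)
        return graph() if graph is not None else None

    def __dask_keys__(self):
        return self.magnitude.__dask_keys__()

    @property
    def __dask_optimize__(self):
        # Reuse whatever optimization the wrapped collection defines.
        return self.magnitude.__dask_optimize__

    @property
    def __dask_scheduler__(self):
        # Reuse the wrapped collection's default scheduler.
        return self.magnitude.__dask_scheduler__

    def __dask_postcompute__(self):
        finalize, extra = self.magnitude.__dask_postcompute__()
        # Wrap the inner finalizer so the computed result keeps its units.
        return self._rewrap, (finalize, extra, self.units)

    @staticmethod
    def _rewrap(results, finalize, extra, units):
        return UnitWrapped(finalize(results, *extra), units)


def toy_compute(collection):
    """Toy evaluator (not dask's scheduler): run each task, then finalize."""
    graph = collection.__dask_graph__()
    results = [graph[key][0](*graph[key][1:]) for key in collection.__dask_keys__()]
    finalize, extra = collection.__dask_postcompute__()
    return finalize(results, *extra)


lazy = UnitWrapped(LazyStandIn([1, 2, 3]), "meter")
computed = toy_compute(lazy)
print(type(computed).__name__, computed.magnitude, computed.units)
```

The point of the sketch is the shape of `__dask_postcompute__`: the wrapper hands back its own finalizer, which calls the wrapped collection's finalizer and then re-attaches the units, so computing a wrapped lazy array would yield a wrapped concrete array rather than a bare one. A `compute()` convenience method could then, presumably, be a thin layer over `dask.compute`.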