Compatibility with dask 2021.02.0 #4884
looks good to me. I don't know a lot about dask, though, so someone else might need to look at this.
xarray/core/dataset.py
Outdated
# TODO We're wasting a lot of key-level work. We should write a fast
# variant of HighLevelGraph.cull() that works at layer level
# only.
dsk2 = dsk.cull(keys)
should we add tests for this code path?
I will once dask/dask#7203 is out. I can remove the code path for now if you prefer.
maybe that's safer?
I removed the HLG and reworked the whole thing
* upstream/master: (24 commits) Compatibility with dask 2021.02.0 (pydata#4884) Ensure maximum accuracy when encoding and decoding cftime.datetime values (pydata#4758) Fix `bounds_error=True` ignored with 1D interpolation (pydata#4855) add a drop_conflicts strategy for merging attrs (pydata#4827) update pre-commit hooks (mypy) (pydata#4883) ensure warnings cannot become errors in assert_ (pydata#4864) update pre-commit hooks (pydata#4874) small fixes for the docstrings of swap_dims and integrate (pydata#4867) Modify _encode_datetime_with_cftime for compatibility with cftime > 1.4.0 (pydata#4871) vélin (pydata#4872) don't skip the doctests CI (pydata#4869) fix da.pad example for numpy 1.20 (pydata#4865) temporarily pin dask (pydata#4873) Add units if "unit" is in the attrs. (pydata#4850) speed up the repr for big MultiIndex objects (pydata#4846) dim -> coord in DataArray.integrate (pydata#3993) WIP: backend interface, now it uses subclassing (pydata#4836) weighted: small improvements (pydata#4818) Update related-projects.rst (pydata#4844) iris update doc url (pydata#4845) ...
* upstream/master: FIX: h5py>=3 string decoding (pydata#4893) Update matplotlib's canonical (pydata#4919) Adding vectorized indexing docs (pydata#4711) Allow fsspec URLs in open_(mf)dataset (pydata#4823) Fix typos in example notebooks (pydata#4908) pre-commit autoupdate CI (pydata#4906) replace the ci-trigger action with a external one (pydata#4905) Update area_weighted_temperature.ipynb (pydata#4903) hide the decorator from the test traceback (pydata#4900) Sort backends (pydata#4886) Compatibility with dask 2021.02.0 (pydata#4884)
Closes #4860
Reverts #4873
Restore compatibility with dask 2021.02.0 by avoiding improper assumptions on the implementation details of `da.Array.__dask_postpersist__()`.
This PR does not align xarray to the new dask collection spec (dask/dask#7093), as I just realized that Datasets violate the rule of having all dask keys with the same name if they contain more than one dask variable - and cannot do otherwise. So I have to change the dask collection spec again to accommodate them.
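To illustrate the key-naming problem described above, here is a minimal pure-Python sketch. It assumes the usual dask convention that a key is either a string or a tuple whose first element is the collection name. The key values and the helpers `iter_keys` and `key_names` are hypothetical and are not part of xarray or dask.

```python
from typing import Iterator, Set, Tuple, Union

# A dask-style key: either "name" or ("name", i, j, ...)
Key = Union[str, Tuple]


def iter_keys(nested) -> Iterator[Key]:
    """Flatten nested lists of keys. Tuples are keys, not containers."""
    if isinstance(nested, list):
        for item in nested:
            yield from iter_keys(item)
    else:
        yield nested


def key_names(nested) -> Set[str]:
    """Collect the distinct names (first element of each key)."""
    return {k if isinstance(k, str) else k[0] for k in iter_keys(nested)}


# Hypothetical keys of a single chunked array: one shared name.
array_keys = [("temperature-abc123", 0), ("temperature-abc123", 1)]

# Hypothetical keys of a Dataset with two chunked variables:
# each variable contributes keys under its own name.
dataset_keys = [
    [("temperature-abc123", 0), ("temperature-abc123", 1)],
    [("pressure-def456", 0)],
]

print(key_names(array_keys))
print(key_names(dataset_keys))
```

In this sketch the array yields a single name while the Dataset-like listing yields two, which is the situation a "same name for all keys" rule cannot accommodate.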