Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Generalize handling of chunked array types (#7019)
* generalise chunk methods to allow cubed * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fic typing typo * fixed circular import * fix some mypy errors * added cubed to mypy ignore list * simplify __array_ufunc__ check * Revert "simplify __array_ufunc__ check" as I pushed to wrong branch This reverts commit cdcb3fb. * update cubed array type * fix missed conflict * sketch for ChunkManager adapter class * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove erroneous docstring about usage of map_blocks Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com> * apply_ufunc -> apply_gufunc Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com> * chunk -> from_array Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com> * remove staticmethods * attempt to type methods of ABC * from_array * attempt to specify types * method for checking array type * Update pyproject.toml * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed import errors * generalize .chunk method kwargs * used dask functions in dask chunkmanager * define signatures for apply_gufunc, blockwise, map_blocks * prototype function to detect which parallel backend to use * add cubed.apply_gufunc * ruffify * add rechunk and compute methods for cubed * xr.apply_ufunc now dispatches to chunkmanager.apply_gufunc * CubedManager.chunks * attempt to keep dask and cubed imports lazy * generalize idxmax * move unify_chunks import to ChunkManager * generalize Dataset.load() * check explicitly for chunks attribute instead of hard-coding cubed * better function names * add cubed version of unify_chunks * recognize wrapped duck dask arrays (e.g. pint wrapping dask) * add some tests for fetching ChunkManagers * add from_array_kwargs to open_dataset * add from_array_kwargs to open_zarr * pipe constructors through chunkmanager * generalize map_blocks inside coding * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed full_like * add from_array_kwargs to open_zarr * don't import dask.tokenize * fix bugs with passing from_array_kwargs down * generalise reductions by adding to chunkmanager * moved nanfirst/nanlast to duck_array_ops from dask_array_ops * generalize interp * generalized chunk_hint function inside indexing * DaskIndexingAdapter->ChunkedIndexingAdapter * Revert "DaskIndexingAdapter->ChunkedIndexingAdapter" This reverts commit 4ca044b. * pass cubed-related kwargs down through to_zarr by adding .store to ChunkManager * fix typing_extensions on py3.9 * fix ImportError with cubed array type * give up trying to import TypeAlias in CI * fix import of T_Chunks * fix no_implicit_optional warnings * don't define CubedManager if cubed can't be imported * fix local mypy errors * don't explicitly pass enforce_ndim into dask.array.map_blocks * fix drop_axis default * use indexing adapter on cubed arrays too * use array API-compatible version of astype function * whatsnew * document new kwargs * add chunkmanager entrypoint * move CubedManager to a separate package * guess chunkmanager based on whats available * fix bug with tokenizing * adapt tests to emulate existence of entrypoint * use fixture to setup/teardown dummy entrypoint * refactor to make DaskManager unavailable if dask not installed * typing * move whatsnew to latest xarray version * remove superfluous lines from whatsnew * fix bug where zarr backend attempted to use dask when not installed * Remove rogue print statement Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com> * Clarify what's new Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com> * use monkeypatch to mock registering of dummy chunkmanager * more tests for guessing chunkmanager correctly * raise TypeError if no chunkmanager found for array types * Correct is_chunked_array check Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com> * vendor dask.array.core.normalize_chunks * add default implementation of rechunk in ABC * remove cubed-specific type check in daskmanager * nanfirst->chunked_nanfirst * revert adding cubed to NON_NUMPY_SUPPORTED_ARRAY_TYPES * licensing to vendor functions from dask * fix bug * ignore mypy error * separate chunk_manager kwarg from from_array_kwargs dict * rename kwarg to chunked_array_type * refactor from_array_kwargs in .chunk ready for deprecation * print statements in test so I can comment on them * remove print statements now I've commented on them in PR * should fix dask naming tests * make dask-specific kwargs explicit in from_array * debugging print statements * Revert "debugging print statements" This reverts commit 7dc6581. * fix gnarly bug with auto-determining chunksizes caused by not referring to dask.config * hopefully fix broken docstring * Revert "make dask-specific kwargs explicit in from_array" This reverts commit 53d6094. * show chunksize limit used in failing tests * move lazy indexing adapter up out of chunkmanager code * try upgrading minimum version of dask * Revert "try upgrading minimum version of dask" This reverts commit 796a577. * un-vendor dask.array.core.normalize_chunks * refactored to all passing ChunkManagerEntrypoint objects directly * Remove redundant Nones from types Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com> * From future import annotations Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * From functools import annotations Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * From future import annotations Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * defined type for NormalizedChunks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * standardized capitalization of ChunkManagerEntrypoint * ensure ruff doesn't remove import * ignore remaining typing errors stemming from unclear dask typing for chunks arguments * rename store_kwargs->chunkmanager_store_kwargs * missed return value * array API fixes for astype * Revert "array API fixes for astype" This reverts commit 9cd9078. * Apply suggestions from code review * Update xarray/tests/test_parallelcompat.py * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * overridden -> subclassed Co-authored-by: Justus Magin <keewis@users.noreply.github.com> * from_array_kwargs is optional Co-authored-by: Justus Magin <keewis@users.noreply.github.com> * ensured all compute calls go through chunkmanager * Raise if multiple chunkmanagers recognize array type Co-authored-by: Justus Magin <keewis@users.noreply.github.com> * from_array_kwargs is optional Co-authored-by: Justus Magin <keewis@users.noreply.github.com> * from_array_kwargs is optional Co-authored-by: Justus Magin <keewis@users.noreply.github.com> * from_array_kwargs is optional Co-authored-by: Justus Magin <keewis@users.noreply.github.com> * from_array_kwargs is optional Co-authored-by: Justus Magin <keewis@users.noreply.github.com> * from_array_kwargs is optional Co-authored-by: Justus Magin <keewis@users.noreply.github.com> * fixes for chunk methods * correct readme to reflect fact we aren't vendoring dask in this PR any more * update whatsnew * more docstring corrections * remove comment * Raise NotImplementedErrors in all abstract methods Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com> * type hints for every arg in ChunkManagerEntryPOint methods * more explicit typing + fixes for mypy errors revealed * Keyword-only arguments in full_like etc. Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * None as default instead of {} * fix bug apparently introduced by changing default type of drop_axis kwarg to map_blocks * Removed hopefully-unnecessary mypy ignore Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unnecessary mypy ignores * change default value of drop_axis kwarg in map_blocks and catch when dask version < 2022.9.1 * fix checking of dask version in map_blocks --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com> Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com> Co-authored-by: Justus Magin <keewis@users.noreply.github.com>
- Loading branch information