Formalize contract between XArray and the dask.distributed scheduler #1644

Closed
@jhamman


From @mrocklin in pangeo-data/pangeo#5 (comment):

XArray was designed long before the dask.distributed task scheduler. As a result, newer ways of doing things, such as asynchronous computing and persist, either don't function well or were hacked on in a less-than-optimal way. We should improve this relationship so that XArray can take advantage of newer dask.distributed features today and also adhere to contracts so that it benefits from changes in the future.

There is conversation towards the end of dask/dask#1068 about what such a contract might look like. I think that @jcrist is planning to work on this on the Dask side some time in the next week or two.

There is a new "Dask Collection Interface" implemented in dask/dask#2748 (and documented in the dask docs).
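
For anyone unfamiliar with the interface, here is a rough sketch of what a collection implementing it might look like. The `__dask_*__` method names follow the documented protocol, but everything else is an assumption made for illustration: `PairCollection`, `toy_get`, and `compute` are hypothetical names, and `toy_get` is a tiny stand-in for a real scheduler so that the snippet runs without dask installed. It is not how xarray or dask implement this.

```python
from operator import add, mul


def toy_get(dsk, keys):
    """Hypothetical stand-in for a scheduler `get(dsk, keys)` function."""
    cache = {}

    def evaluate(arg):
        # A tuple whose first element is callable is treated as a task.
        if isinstance(arg, tuple) and arg and callable(arg[0]):
            func, *args = arg
            return func(*(evaluate(a) for a in args))
        # A string naming a graph key refers to another task's result.
        if isinstance(arg, str) and arg in dsk:
            if arg not in cache:
                cache[arg] = evaluate(dsk[arg])
            return cache[arg]
        # Anything else is a literal value.
        return arg

    return [evaluate(k) for k in keys]


class PairCollection:
    """Hypothetical collection exposing the dask collection protocol."""

    def __init__(self, dsk, keys):
        self._dsk = dsk
        self._keys = keys

    def __dask_graph__(self):
        # The task graph: a mapping from keys to tasks or literals.
        return self._dsk

    def __dask_keys__(self):
        # The output keys whose values make up this collection.
        return self._keys

    def __dask_postcompute__(self):
        # (finalize, extra_args): turns raw results into an in-memory object.
        return tuple, ()

    def __dask_postpersist__(self):
        # (rebuild, extra_args): builds a new collection from a persisted graph.
        return PairCollection, (self._keys,)

    # Default scheduler for this collection (the toy one, by assumption).
    __dask_scheduler__ = staticmethod(toy_get)


def compute(collection):
    """Hypothetical driver mirroring the steps a generic compute would take."""
    dsk = collection.__dask_graph__()
    keys = collection.__dask_keys__()
    results = collection.__dask_scheduler__(dsk, keys)
    finalize, extra_args = collection.__dask_postcompute__()
    return finalize(results, *extra_args)


graph = {"x": 1, "y": (add, "x", 10), "z": (mul, "y", 2)}
pair = PairCollection(graph, ["y", "z"])
result = compute(pair)
print(result)
```

The point of the contract is that generic code such as `compute` above only talks to the `__dask_*__` methods, so a scheduler does not need to know anything specific about the collection type it is handed.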

I'm creating this issue here (in addition to pangeo-data/pangeo#5) to track design considerations on the xarray side and to get input from the @pydata/xarray team.

cc @mrocklin, @shoyer, @jcrist, @rabernat
