Separate scheduling API from dask implementation #30
Merged
Commits on Jul 24, 2020
Separate chunking API from scheduling

xref pangeo-data#29

This PR moves the dask-specific scheduling logic into a separate `dask.py` file, as a first step toward supporting alternative schedulers. (I'm particularly interested in supporting Apache Beam.) The existing tests pass (with minor modifications), but the documentation still needs updating.

Notes:

- I put `staged_copy` into a single function, but perhaps there are other generic methods (`execute`?) that would justify using a class?
- `Rechunked` no longer inherits from `dask.delayed.Delayed`, and no longer has any dask-specific logic at all. I think this is important for generic scheduler support, but it does make it a little less reusable in larger pipelines. `_delayed` is currently a private attribute, but we should probably expose the scheduler equivalent of "delayed" objects in some way. I guess this is a use case for the class-based interface from the previous bullet.
- `Rechunked` now always contains zarr arrays/groups rather than dask arrays. This makes the repr a little less informative, e.g., it no longer shows chunk size. This should probably be fixed before merging.
- Will "two stage" copying always suffice? The interface I wrote for `staged_copy` supports any number of stages (in theory). That might be useful in the future, or it might be unnecessary complexity.
- To verify that adding a new scheduler is not too painful, I should probably write at least a second example. I'll start with a naive "reference" scheduler in pure Python (this could go in the docs) and think about adding a Beam implementation as well. Beam is perhaps a nice example because its execution model is so different from dask's (based on higher-level transforms like "map" rather than individual tasks).
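To make the idea of a naive pure-Python "reference" scheduler concrete, here is a minimal sketch. It is not the code from this PR: `CopyStage`, `chunk_slices`, `build_plan`, and `execute_serially` are hypothetical names chosen for illustration, and plain Python lists stand in for zarr arrays. The point is only that a multi-stage copy can be described as a list of stages, each a list of independent chunk copies, without reference to any particular scheduler.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class CopyStage:
    """One stage of a staged copy (hypothetical, for illustration only).

    Copies `source` into `target`, `chunk` elements at a time.
    Lists stand in for the zarr arrays used by the real project.
    """
    source: List[Any]
    target: List[Any]
    chunk: int


def chunk_slices(length: int, chunk: int) -> List[slice]:
    """Split range(length) into consecutive slices of at most `chunk` elements."""
    return [slice(start, min(start + chunk, length))
            for start in range(0, length, chunk)]


def build_plan(stages: List[CopyStage]) -> List[List[Tuple[CopyStage, slice]]]:
    """Scheduler-independent plan: one list of independent tasks per stage."""
    return [[(stage, key) for key in chunk_slices(len(stage.source), stage.chunk)]
            for stage in stages]


def copy_chunk(stage: CopyStage, key: slice) -> None:
    """Copy a single chunk; tasks within one stage do not depend on each other."""
    stage.target[key] = stage.source[key]


def execute_serially(plan: List[List[Tuple[CopyStage, slice]]]) -> None:
    """Reference executor: run stages in order, and tasks within a stage one by one."""
    for stage_tasks in plan:
        for stage, key in stage_tasks:
            copy_chunk(stage, key)


# Two-stage copy: source -> intermediate (chunks of 4) -> target (chunks of 5).
source = list(range(10))
intermediate = [None] * 10
target = [None] * 10

plan = build_plan([
    CopyStage(source, intermediate, chunk=4),
    CopyStage(intermediate, target, chunk=5),
])
execute_serially(plan)
```

Because each stage is just a collection of independent tasks, a different executor could presumably turn the same plan into one delayed task per chunk, or into one "map" transform per stage, with a barrier between stages. Nothing in the plan limits it to two stages.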
Commit SHAs: 7aa8308, 720fe00, 8adf7b1, 10cd406, 16d1fa8, 25167fe
Commits on Jul 25, 2020
Commit SHAs: b6b4a52, ff37794
Commits on Jul 27, 2020
Commit SHAs: 34960c1, 24e9ba5, 2c9c3bf, 6732366