Skip to content

Distribute the scheduler #165

Open
Open
@jpsamaroo

Description

@jpsamaroo

We're already pushing some extra work onto the worker nodes (mainly argument fetching and processor selection/load balancing), and it would be beneficial for large DAGs to move more work onto each worker. The main blocker is providing a way to split the DAG into multiple domains, where each domain is handled by a given thread on a given worker. With efficient Thunk serialization, we can then send a subgraph to each worker and let them process their own DAG without conflicts. We'll need to add a mechanism by which thunks automatically wait on their input thunks to complete before they attempt to download the output data; if possible, we can also have workers broadcast and shard their Chunks onto dependent workers as soon as the data is made available.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions