Support for pod schedulers other than schedulerName: default-scheduler #233

@scottyhq

Currently, Dask worker pods are spread across available nodes by the default Kubernetes scheduler:

[ec2-user@ip-192-168-60-131 ~]$ kubectl get pod -o yaml dask-cgentemann-osm2020tutorial-nqchvhmy-6e9099fc-3k2s6c -n binder-staging | grep schedule
  schedulerName: default-scheduler
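
The same check can be scripted rather than grepped. This is only a minimal sketch: it assumes the pod has been fetched as JSON (for example with `kubectl get pod <name> -o json`), and the helper name `scheduler_name` is hypothetical. The only field it reads is `spec.schedulerName`, which is part of the standard pod spec.

```python
import json


def scheduler_name(pod_json: str) -> str:
    """Return spec.schedulerName from the JSON description of a single pod."""
    pod = json.loads(pod_json)
    # Kubernetes dispatches a pod with no schedulerName to the default
    # scheduler, so fall back to that name when the field is missing.
    return pod.get("spec", {}).get("schedulerName", "default-scheduler")


# Illustrative stand-in for the JSON of a Dask worker pod (not real output).
sample = json.dumps({"kind": "Pod", "spec": {"schedulerName": "default-scheduler"}})
print(scheduler_name(sample))
```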

Because the default scheduler spreads pods across available nodes, this can cause scale-down issues when multiple users launch clusters or when pods encounter errors. For example, we recently observed an issue where many Dask pods had an Error status, which caused new nodes to be launched to meet capacity. We ended up with 17 nodes, each running two Dask pods, instead of all pods being packed onto 5 nodes.

JupyterHub deals with this same scenario by packing user-notebook pods onto nodes with a custom userScheduler:
https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/optimization.html#using-available-nodes-efficiently-the-user-scheduler

@yuvipanda suggested that a possible solution is simply to reuse the JupyterHub user scheduler in the dask-kubernetes config. Some additional relevant docs:
https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/#specify-schedulers-for-pods
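
To make the request concrete, here is a rough sketch of what "support for other schedulers" could amount to on the worker pod spec. It is not how dask-kubernetes is implemented. It only shows the pod-level field involved, `spec.schedulerName`, which is what the Kubernetes docs above describe. The helper `with_scheduler`, the template contents, and the scheduler name `jhub-user-scheduler` are all assumptions for illustration; the real user-scheduler name depends on the JupyterHub deployment.

```python
import copy


def with_scheduler(pod_template: dict, scheduler_name: str) -> dict:
    """Return a copy of a pod template with spec.schedulerName set."""
    patched = copy.deepcopy(pod_template)  # leave the caller's template untouched
    patched.setdefault("spec", {})["schedulerName"] = scheduler_name
    return patched


# Hypothetical minimal worker pod template; image and names are placeholders.
worker_template = {
    "kind": "Pod",
    "spec": {
        "restartPolicy": "Never",
        "containers": [{"name": "dask-worker", "image": "daskdev/dask:latest"}],
    },
}

# "jhub-user-scheduler" is an assumed name; check the actual deployment.
patched = with_scheduler(worker_template, "jhub-user-scheduler")
print(patched["spec"]["schedulerName"])
```

If no scheduler with the given name is running in the cluster, the pod stays Pending, so the name would need to be configurable rather than hard-coded.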
