Closed
Profiling the scheduler during a benchmark workload on a 128-worker cluster (2 CPUs each), I noticed total_occupancy taking 32% of total scheduler runtime!
Profile (go to left-heavy view):
Subjectively, the dashboard was also extremely laggy. (This is real time, not an artifact of the screen recording or network delay. Skip to the end to see the dashboard become responsive once most tasks are done.)