Description
I'm looking at a computation right now where the scheduler is engaging work stealing, moving tasks from workers that have more work to workers that have less. This makes a lot of sense, except that all of the workers have a ton of work.
We should probably care less about balancing occupancy when the ratio of occupancy to average task duration is high. In that case every worker has a long backlog, so it will be a while before load balancing matters, and things may change before then. Right now small variances in occupancy (say, 10%) are causing the cluster to move a lot of data around, which is probably unnecessary.
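
As a rough sketch of the kind of check I have in mind (this is not the scheduler's actual stealing code; the function name, its parameters, and both thresholds are hypothetical and chosen only for illustration), stealing could be skipped whenever the less-loaded worker still has a long backlog measured in average-length tasks:

```python
def should_steal(
    victim_occupancy: float,
    thief_occupancy: float,
    avg_task_duration: float,
    min_backlog_tasks: float = 20.0,  # hypothetical threshold
    min_imbalance: float = 0.5,       # hypothetical threshold
) -> bool:
    """Decide whether moving work from victim to thief looks worthwhile.

    Occupancies and avg_task_duration are in seconds. This is an
    illustrative heuristic, not the behaviour of any real scheduler.
    """
    if avg_task_duration <= 0:
        raise ValueError("avg_task_duration must be positive")
    if victim_occupancy <= 0:
        return False  # nothing to steal

    # How many average-length tasks the less-loaded worker still has queued.
    thief_backlog = thief_occupancy / avg_task_duration

    # If the thief is far from idle, balancing can wait: skip stealing.
    if thief_backlog >= min_backlog_tasks:
        return False

    # Otherwise steal only when the relative gap in occupancy is large.
    imbalance = (victim_occupancy - thief_occupancy) / victim_occupancy
    return imbalance >= min_imbalance
```

With these made-up defaults, a 10% gap between two workers that each hold hundreds of tasks' worth of occupancy would not trigger any data movement, while a nearly idle worker next to a busy one still would.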