Description
As of #7586, the sum of Worker.digests_total[("execute", *, *, "seconds")] is equal to the time spent executing tasks, multiplied by the number of threads on the worker.
There's a big chunk of extra time that is not counted, which is:
- Time the worker spent with idle threads because it was paused.
Worker.digests_total["execute", "n/a", "paused", "seconds"] = min(
number of threads - number of tasks in executing/cancelled/resumed state,
number of tasks in ready or constrained state,
) * T
where T is the time between changes in any of the variables in the formula.
Note that we would not record the task prefix here, unlike in other execute digests.
Note that a worker may be paused and not accrue any paused time, because there are tasks still running.
- Time the worker spent with idle threads because of resource limits
Worker.digests_total["execute", "n/a", "constrained", "seconds"] = max(0, min(
number of threads - number of tasks in executing/cancelled/resumed state,
number of tasks in constrained state
) * T - paused time)
- Time the worker spent with idle threads because it was fetching data. This is different from the sum of
Worker.digests_total[("gather-dep", *, "seconds")], as it should exclude the time when dependency gathering and execution were properly pipelined. In other words, this time should be defined as
Worker.digests_total["execute", "n/a", "gather-dep", "seconds"] = max(0, min(
number of threads - number of tasks in executing/cancelled/resumed state,
number of tasks in waiting state,
) * T
- paused time
- constrained time
)
- Time the worker spent with idle threads because it was waiting for more content from the scheduler.
This should be defined as
Worker.digests_total["execute", "n/a", "idle", "seconds"] = max(0,
(number of threads - number of tasks in executing/cancelled/resumed state) * T
- paused time
- constrained time
- gather time
)
With the above additions, the sum of Worker.digests_total[("execute", *, *, "seconds")] should accumulate to (number of threads * worker uptime) by construction when there are no tasks currently running.
The above formulas are a very quick draft and should be reviewed for correctness.
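To make the draft formulas easier to review, here is a minimal Python sketch of how one interval T could be split across the four proposed buckets. It is not taken from the distributed codebase: the names `IdleSeconds` and `idle_seconds` and all parameter names are hypothetical. It also assumes two things the formulas leave implicit: paused time only accrues while the worker is actually paused (the `is_paused` flag), and the count of free threads is clamped at zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleSeconds:
    """Idle thread-seconds for one interval, split by cause (hypothetical type)."""

    paused: float
    constrained: float
    gather_dep: float
    idle: float

    def total(self) -> float:
        return self.paused + self.constrained + self.gather_dep + self.idle


def idle_seconds(
    *,
    nthreads: int,
    n_running: int,  # tasks in executing/cancelled/resumed state
    n_ready: int,
    n_constrained: int,
    n_waiting: int,
    is_paused: bool,  # assumption: paused time accrues only while paused
    elapsed: float,  # T: seconds since any of the inputs last changed
) -> IdleSeconds:
    # Assumption: never report a negative number of free threads.
    free = max(0, nthreads - n_running)

    # Each bucket subtracts the buckets before it, mirroring the draft.
    paused = min(free, n_ready + n_constrained) * elapsed if is_paused else 0.0
    constrained = max(0.0, min(free, n_constrained) * elapsed - paused)
    gather_dep = max(0.0, min(free, n_waiting) * elapsed - paused - constrained)
    idle = max(0.0, free * elapsed - paused - constrained - gather_dep)
    return IdleSeconds(paused, constrained, gather_dep, idle)
```

For example, with 4 threads, 1 running task, 2 constrained tasks, 5 waiting tasks, an unpaused worker and T = 10 seconds, this sketch yields 20 constrained seconds, 10 gather-dep seconds and no paused or idle seconds. Under these assumptions the four buckets always add up to (free threads * T), which is the property needed for the overall sum to reach (number of threads * worker uptime).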