opened on Oct 27, 2022
Problem
If you have spawned a lot of tasks competing for CPU cores, it's difficult to tell how much CPU time any one of them has actually gotten over its lifetime, since @time measures wall-clock time.
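To make the gap concrete, here is a small sketch. The task spends nearly its whole lifetime blocked in `sleep`, so it uses close to no CPU, yet a wall-clock measurement still charges it the full duration. The 0.2 s figure is arbitrary and only for illustration.

```julia
# The task below is blocked for almost all of its lifetime,
# but @elapsed (like @time) reports wall-clock time, not CPU time.
wall_s = @elapsed begin
    t = @async sleep(0.2)   # blocks on a timer; does essentially no computation
    wait(t)
end

println("wall-clock time: ", round(wall_s; digits = 2), " s")
```

Nothing in this measurement says how much of that time the task actually spent running on a core.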
What we want is something equivalent to the CPU Time counter in the OS-level activity monitor, but for individual tasks.
Proposed Solution
Add two fields to the Task struct (forgive the crude pseudocode):
struct task {
…
+ last_scheduled_at timestamp;
+ cpu_time_total_ns int64; // initialize to 0
}
And then use them in the scheduler:
when taking task T off the run queue and putting it on a CPU {
…
+ T.last_scheduled_at = now()
}
…
when taking task T off the CPU due to it blocking on something {
…
+ T.cpu_time_total_ns += now() - T.last_scheduled_at
}
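The bookkeeping above can be sketched as a small runnable simulation in Julia. This is not the real scheduler: `TrackedTask`, `on_schedule!` and `on_deschedule!` are hypothetical names, and timestamps are passed in explicitly so the result is deterministic. A real implementation would presumably read a monotonic clock such as `time_ns()` at each switch.

```julia
# Hypothetical stand-in for the runtime's task struct; not Julia's real `Task`.
mutable struct TrackedTask
    last_scheduled_at::Int64    # timestamp (ns) of the most recent switch onto a CPU
    cpu_time_total_ns::Int64    # accumulated on-CPU time; starts at 0
end

TrackedTask() = TrackedTask(0, 0)

# Scheduler takes the task off the run queue and puts it on a CPU.
function on_schedule!(t::TrackedTask, now_ns::Int64)
    t.last_scheduled_at = now_ns
    return t
end

# Scheduler takes the task off the CPU because it blocked on something.
function on_deschedule!(t::TrackedTask, now_ns::Int64)
    t.cpu_time_total_ns += now_ns - t.last_scheduled_at
    return t
end

# Two on-CPU intervals: 1000..1500 ns and 4000..4250 ns.
t = TrackedTask()
on_schedule!(t, 1_000); on_deschedule!(t, 1_500)
on_schedule!(t, 4_000); on_deschedule!(t, 4_250)

println("task got $(t.cpu_time_total_ns) ns of CPU time")  # prints 750
```

The 2500 ns the task spent off the CPU between the two intervals is not counted, which is the whole point of the counter.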
The counter could then be read from Julia "userspace":
println("my task got $(t.cpu_time_total_ns) ns of CPU time")
…and then fed into other systems for logging / metrics, etc.
Alternatives
- @kpamnany's bpftrace infrastructure (see Runtime event trace visualization #36870) emits task switch events which could be used to derive this same information. It's just a lot harder to use, since you have to be on a machine that supports BPF and then set up something to consume the events. Making this info available directly in Julia userspace seems easier.
Task heritability
You might want to know how much time child tasks consumed as well. But I think this could be a problem left to Julia userspace, since it's possible to keep a tree of tasks and read these numbers off of them.
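As a rough illustration of the userspace approach, the sketch below keeps a tree of tasks and sums the counters recursively. `TaskNode` and `inclusive_cpu_ns` are hypothetical names, and the `cpu_time_total_ns` values are made-up numbers standing in for the proposed per-task counter.

```julia
# Hypothetical userspace bookkeeping: one node per task, with its children.
struct TaskNode
    name::String
    cpu_time_total_ns::Int64        # stands in for the proposed per-task counter
    children::Vector{TaskNode}
end

# Convenience constructor for a task with no children.
TaskNode(name, ns) = TaskNode(name, ns, TaskNode[])

# A task's inclusive CPU time is its own counter plus that of all descendants.
inclusive_cpu_ns(n::TaskNode) =
    n.cpu_time_total_ns + sum(inclusive_cpu_ns, n.children; init = 0)

root = TaskNode("root", 100, [
    TaskNode("a", 250),
    TaskNode("b", 50, [TaskNode("c", 600)]),
])

println("root + descendants: $(inclusive_cpu_ns(root)) ns")  # prints 1000
```

How the tree gets built (for example, by wrapping task spawning) is left open here; the runtime would only need to expose the per-task number.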