Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add per-Task cpu time metric #56320

Open
wants to merge 34 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 21 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
52b5080
basic task cpu time tracking
vilterp Jan 26, 2023
1f769ce
initialize to zero
vilterp Jan 26, 2023
46bc184
make new task field visible from Julia
vilterp Jan 26, 2023
f7dfd0f
add start time
vilterp Jan 26, 2023
78849e5
fix typo
vilterp Jan 26, 2023
70482c1
add test
vilterp Jan 26, 2023
7f59ae4
add a wait
vilterp Jan 26, 2023
f30d331
import `LinearAlgebra` for `peakflops`
vilterp Jan 27, 2023
007777b
update last scheduled on task finish as well
vilterp Jan 31, 2023
d8e829f
no need to update last scheduled in finish
vilterp Feb 3, 2023
f23145c
add test
vilterp Jun 23, 2023
4ff9e83
fix test
vilterp Jun 23, 2023
f545146
Add per-task wall-time
nickrobinson251 Oct 16, 2024
00fbb08
Test individual task cpu_time less than wall_time
nickrobinson251 Oct 17, 2024
c037506
More tests
nickrobinson251 Oct 17, 2024
4ef3023
Move task timings mostly to julia
nickrobinson251 Oct 21, 2024
c51bf70
Record cpu time on entrance to `wait/yield`
nickrobinson251 Oct 23, 2024
5a7fbfe
Rename 'scheduled_at' -> '[en/de]queued_at'
nickrobinson251 Oct 24, 2024
6af1a2a
Ability to enable/disable task timings globally
nickrobinson251 Oct 24, 2024
2077fac
fixup whitespace
nickrobinson251 Oct 24, 2024
36a5f7b
Enable task timing in test
nickrobinson251 Oct 24, 2024
242cc7f
Enable task timing for root task
nickrobinson251 Oct 28, 2024
4a678a3
Prevent `task_timing(false)` decrementing below zero
nickrobinson251 Oct 28, 2024
adeb30f
Change task-timing flag to require yes/no arg
nickrobinson251 Oct 28, 2024
0f60a27
Rename `timings`->`metrics`, `dequeued`->`started_running`
nickrobinson251 Oct 28, 2024
2ad9bf6
Update task-state-transition comments
nickrobinson251 Oct 28, 2024
1d139f0
Remove unused debug function
nickrobinson251 Oct 28, 2024
635b3c7
Add NEWS
nickrobinson251 Oct 28, 2024
337eda2
Add fallback recording of cpu time in `wait()`
nickrobinson251 Oct 28, 2024
3ac24cb
Instrument async io
nickrobinson251 Oct 28, 2024
631b7bc
Record task cpu time in `wait()`
nickrobinson251 Oct 28, 2024
89759e6
fixup! Change task-timing flag to require yes/no arg
nickrobinson251 Oct 30, 2024
ffc7679
fixup! Record task cpu time in `wait()`
nickrobinson251 Oct 30, 2024
bcd95da
Make task metrics flag const and counters atomic
nickrobinson251 Oct 30, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions base/condition.jl
Original file line number Diff line number Diff line change
Expand Up @@ -135,6 +135,7 @@ in line to wake up on `notify`. Otherwise, `wait` has first-in-first-out (FIFO)
"""
function wait(c::GenericCondition; first::Bool=false)
ct = current_task()
record_cpu_time!(ct)
_wait2(c, ct, first)
token = unlockall(c.lock)
try
Expand Down
1 change: 1 addition & 0 deletions base/options.jl
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,7 @@ struct JLOptions
heap_size_hint::UInt64
trace_compile_timing::Int8
trim::Int8
task_timing::Int8
end

# This runs early in the sysimage != is not defined yet
Expand Down
49 changes: 48 additions & 1 deletion base/task.jl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I was going to suggest that it might actually be nice to have an ascii-art state diagram for the state transitions you've added here...

... But I guess for now there'd only be two states: user_time and wait_time, with all the like 15 different arrows you've added.
... so maybe that's actually not super useful right now and it makes more sense to just grep the code base for -> .

Which actually, maybe you can just make them more searchable, by adding a keyword? Like:

# [Task timings]: user_time -task-finished-> wait_time

at the front of each arrow or something?

Original file line number Diff line number Diff line change
Expand Up @@ -810,6 +810,12 @@ end
# runtime system hook called when a task finishes
function task_done_hook(t::Task)
# `finish_task` sets `sigatomic` before entering this function
if t.is_timing_enabled
# user_time -task-finished-> wait_time
now = time_ns()
record_cpu_time!(t, now)
record_wall_time!(t, now)
end
err = istaskfailed(t)
result = task_result(t)
handled = false
Expand Down Expand Up @@ -972,7 +978,13 @@ function enq_work(t::Task)
return t
end

schedule(t::Task) = enq_work(t)
function schedule(t::Task)
# user_time -task-(re)scheduled-> wait_time
if t.is_timing_enabled && t.first_enqueued_at == 0
t.first_enqueued_at = time_ns()
end
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
enq_work(t)
end

"""
schedule(t::Task, [val]; error=false)
Expand Down Expand Up @@ -1018,6 +1030,10 @@ true
function schedule(t::Task, @nospecialize(arg); error=false)
# schedule a task to be (re)started with the given value or exception
t._state === task_state_runnable || Base.error("schedule: Task not runnable")
# user_time -task-(re)scheduled-> wait_time
if t.is_timing_enabled && t.first_enqueued_at == 0
t.first_enqueued_at = time_ns()
end
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
if error
q = t.queue; q === nothing || Base.list_deletefirst!(q::IntrusiveLinkedList{Task}, t)
setfield!(t, :result, arg)
Expand All @@ -1039,6 +1055,7 @@ tasks.
"""
function yield()
ct = current_task()
record_cpu_time!(ct)
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
enq_work(ct)
try
wait()
Expand All @@ -1060,6 +1077,7 @@ Throws a `ConcurrencyViolationError` if `t` is the currently running task.
"""
function yield(t::Task, @nospecialize(x=nothing))
current = current_task()
record_cpu_time!(current)
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
t === current && throw(ConcurrencyViolationError("Cannot yield to currently running task!"))
(t._state === task_state_runnable && t.queue === nothing) || throw(ConcurrencyViolationError("yield: Task not runnable"))
t.result = x
Expand All @@ -1084,6 +1102,9 @@ function yieldto(t::Task, @nospecialize(x=nothing))
elseif t._state === task_state_failed
throw(t.result)
end
if t.is_timing_enabled && t.first_enqueued_at == 0
t.first_enqueued_at = time_ns()
end
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
t.result = x
set_next_task(t)
return try_yieldto(identity)
Expand All @@ -1097,6 +1118,12 @@ function try_yieldto(undo)
rethrow()
end
ct = current_task()
# scheduler -task-started-> user
# scheduler -task-resumed-> user
if ct.is_timing_enabled
# @assert ct.last_dequeued_at == 0
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
ct.last_dequeued_at = time_ns()
end
if ct._isexception
exc = ct.result
ct.result = nothing
Expand All @@ -1110,6 +1137,9 @@ end

# yield to a task, throwing an exception in it
function throwto(t::Task, @nospecialize exc)
if t.is_timing_enabled && t.first_enqueued_at == 0
t.first_enqueued_at = time_ns()
end
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
t.result = exc
t._isexception = true
set_next_task(t)
Expand Down Expand Up @@ -1176,3 +1206,20 @@ if Sys.iswindows()
else
pause() = ccall(:pause, Cvoid, ())
end

function record_cpu_time!(t::Task, stopped_at::UInt64=time_ns())
if t.is_timing_enabled
@assert t.last_dequeued_at != 0
t.cpu_time_ns += stopped_at - t.last_dequeued_at
t.last_dequeued_at = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you ever check/rely anywhere that this is reset to 0?

It seems unnecessary to me (and we want to keep task-switches as cheap as we can, even though this will probably be very cheap since we'll already have the task resident in-cache, but still writes to main memory are slowish). And maybe someone would even want to look at the last-ran-time even for a sleeping task if they were trying to debug things? Not sure though.

end
return t
end

function record_wall_time!(t::Task, done_at::UInt64=time_ns())
if t.is_timing_enabled
@assert t.first_enqueued_at != 0
t.wall_time_ns = done_at - t.first_enqueued_at
end
return t
end
58 changes: 58 additions & 0 deletions base/timing.jl
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,64 @@ end
# total time spend in garbage collection, in nanoseconds
gc_time_ns() = ccall(:jl_gc_total_hrtime, UInt64, ())

"""
task_cpu_time_ns(t::Task) -> UInt64
Return the total nanoseconds that the task `t` has spent running.
This metric is only updated when the task yields or completes.
See also [`task_wall_time_ns`](@ref).
Will be `UInt64(0)` if task timings are not enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this method should throw if timings are not enabled for this task instead of returning 0, which is a valid return value. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or missing or nothing are also possible options for that case

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO you should not use unsigned integers for representing positive values - you should use Int64 for this instead.

It's better in julia than it is in C, since julia's defaults are less dangerous, but using unsigned ints for positive numeric values is a consistent source of bugs in C programs, and i think should be still frowned upon in julia. (https://jacobegner.blogspot.com/2019/11/unsigned-integers-are-dangerous.html)

IMO Unsigned ints should be used only for things like hashes, identifiers, addresses, and possibly for indexes but even for indexes i'd use signed ints.

But anyway, another benefit is that this also gives -1 as a possible return value for the unset case.

See [`Base.task_timing`](@ref).
!!! note "This metric is from the Julia scheduler"
A task may be running on an OS thread that is descheduled by the OS
scheduler, this time still counts towards the metric.
!!! compat "Julia 1.12"
This method was added in Julia 1.12.
"""
function task_cpu_time_ns(t::Task)
return t.cpu_time_ns
end

"""
task_wall_time_ns(t::Task) -> UInt64
Return the total nanoseconds that the task `t` was runnable.
This is the time since the task entered the run queue until the time at which it completed,
or until the current time if the task has not yet completed.
See also [`task_cpu_time_ns`](@ref).
Will be `UInt64(0)` if task timings are not enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto for consistency, although in this case 0 is clearly not a valid return value.

See [`Base.task_timing`](@ref).
!!! compat "Julia 1.12"
This method was added in Julia 1.12.
"""
function task_wall_time_ns(t::Task)
# TODO: report up til current time if not done? too racy?
# return istaskdone(t) ? t.wall_time_ns : time_ns() - t.first_enqueued_at
return t.wall_time_ns
end

"""
Base.task_timing(::Bool)
Enable or disable the collection of per-task timing information.
Task created when Base.task_timing(true) is in effect will have [`task_cpu_time_ns`](@ref)
and [`task_wall_time_ns`](@ref) timing information available.
"""
function task_timing(b::Bool)
if b
ccall(:jl_task_timing_enable, Cvoid, ())
else
# TODO: prevent decrementing the counter below zero
ccall(:jl_task_timing_disable, Cvoid, ())
end
return nothing
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you've put effort into making this flag reentrant, maybe you want to provide a with_task_timing_enabled(f) variant too? Not sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, for this PR, i don't think so, because we don't have a use for it right now, and because (as of this PR) the only way to write such a function would be enabling the task-timings process-wide, whereas ideally i think such a function would only enable task-timings for task created by f


"""
Base.gc_live_bytes()
Expand Down
4 changes: 4 additions & 0 deletions doc/man/julia.1
Original file line number Diff line number Diff line change
Expand Up @@ -294,6 +294,10 @@ If --trace-compile is enabled show how long each took to compile in ms
--trace-dispatch={stderr|name}
Print precompile statements for methods dispatched during execution or save to stderr or a path.

.TP
--task-timing
Enabled the collection of per-task timing metrics.
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved

.TP
-image-codegen
Force generate code in imaging mode
Expand Down
1 change: 1 addition & 0 deletions doc/src/manual/command-line-interface.md
Original file line number Diff line number Diff line change
Expand Up @@ -217,6 +217,7 @@ The following is a complete list of command-line switches available when launchi
|`--trace-compile={stderr\|name}` |Print precompile statements for methods compiled during execution or save to stderr or a path. Methods that were recompiled are printed in yellow or with a trailing comment if color is not supported|
|`--trace-compile-timing` |If --trace-compile is enabled show how long each took to compile in ms|
|`--trace-dispatch={stderr\|name}` |Print precompile statements for methods dispatched during execution or save to stderr or a path.|
|`--task-timing` |Enable tasks the collection of per-task timing metrics|
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
|`--image-codegen` |Force generate code in imaging mode|
|`--permalloc-pkgimg={yes\|no*}` |Copy the data section of package images into memory|
|`--trim={no*\|safe\|unsafe\|unsafe-warn}` |Build a sysimage including only code provably reachable from methods marked by calling `entrypoint`. The three non-default options differ in how they handle dynamic call sites. In safe mode, such sites result in compile-time errors. In unsafe mode, such sites are allowed but the resulting binary might be missing needed code and can throw runtime errors. With unsafe-warn, such sites will trigger warnings at compile-time and might error at runtime.|
Expand Down
3 changes: 3 additions & 0 deletions src/init.c
Original file line number Diff line number Diff line change
Expand Up @@ -874,6 +874,9 @@ static NOINLINE void _finish_julia_init(JL_IMAGE_SEARCH rel, jl_ptls_t ptls, jl_
post_boot_hooks();
}

if (jl_options.task_timing) {
jl_atomic_fetch_add(&jl_task_timing_enabled, 1);
}
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
if (jl_base_module == NULL) {
// nthreads > 1 requires code in Base
jl_atomic_store_relaxed(&jl_n_threads, 1);
Expand Down
27 changes: 27 additions & 0 deletions src/jlapi.c
Original file line number Diff line number Diff line change
Expand Up @@ -809,6 +809,33 @@ JL_DLLEXPORT uint64_t jl_cumulative_recompile_time_ns(void)
return jl_atomic_load_relaxed(&jl_cumulative_recompile_time);
}

/**
* @brief Enable per-task timing.
*/
JL_DLLEXPORT void jl_task_timing_enable(void)
{
// Increment the flag to allow reentrant callers.
jl_atomic_fetch_add(&jl_task_timing_enabled, 1);
}

/**
* @brief Disable per-task timing.
*/
JL_DLLEXPORT void jl_task_timing_disable(void)
{
jl_atomic_fetch_add(&jl_task_timing_enabled, -1);
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
}

// TODO: remove this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on the TODO comment, but do please remove all TODOs.

JL_DLLEXPORT void jl_is_task_timing_enabled(void)
{
jl_atomic_load_relaxed(&jl_task_timing_enabled);
}
JL_DLLEXPORT void jl_is_compile_timing_enabled(void)
{
jl_atomic_load_relaxed(&jl_measure_compile_time_enabled);
}

/**
* @brief Retrieve floating-point environment constants.
*
Expand Down
7 changes: 7 additions & 0 deletions src/jloptions.c
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,7 @@ JL_DLLEXPORT void jl_init_options(void)
0, // heap-size-hint
0, // trace_compile_timing
0, // trim
0, // task_timing
};
jl_options_initialized = 1;
}
Expand Down Expand Up @@ -265,6 +266,7 @@ static const char opts_hidden[] =
" comment if color is not supported\n"
" --trace-compile-timing If --trace-compile is enabled show how long each took to\n"
" compile in ms\n"
" --task-timing Enable collection of per-task timing data.\n"
" --image-codegen Force generate code in imaging mode\n"
" --permalloc-pkgimg={yes|no*} Copy the data section of package images into memory\n"
" --trim={no*|safe|unsafe|unsafe-warn}\n"
Expand Down Expand Up @@ -296,6 +298,7 @@ JL_DLLEXPORT void jl_parse_opts(int *argcp, char ***argvp)
opt_trace_compile,
opt_trace_compile_timing,
opt_trace_dispatch,
opt_task_timing,
opt_math_mode,
opt_worker,
opt_bind_to,
Expand Down Expand Up @@ -375,6 +378,7 @@ JL_DLLEXPORT void jl_parse_opts(int *argcp, char ***argvp)
{ "trace-compile", required_argument, 0, opt_trace_compile },
{ "trace-compile-timing", no_argument, 0, opt_trace_compile_timing },
{ "trace-dispatch", required_argument, 0, opt_trace_dispatch },
{ "task-timing", no_argument, 0, opt_task_timing },
{ "math-mode", required_argument, 0, opt_math_mode },
{ "handle-signals", required_argument, 0, opt_handle_signals },
// hidden command line options
Expand Down Expand Up @@ -964,6 +968,9 @@ JL_DLLEXPORT void jl_parse_opts(int *argcp, char ***argvp)
else
jl_errorf("julia: invalid argument to --trim={safe|no|unsafe|unsafe-warn} (%s)", optarg);
break;
case opt_task_timing:
jl_options.task_timing = 1;
break;
default:
jl_errorf("julia: unhandled option -- %c\n"
"This is a bug, please report it.", c);
Expand Down
1 change: 1 addition & 0 deletions src/jloptions.h
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,7 @@ typedef struct {
uint64_t heap_size_hint;
int8_t trace_compile_timing;
int8_t trim;
int8_t task_timing;
} jl_options_t;

#endif
18 changes: 14 additions & 4 deletions src/jltypes.c
Original file line number Diff line number Diff line change
Expand Up @@ -3744,7 +3744,7 @@ void jl_init_types(void) JL_GC_DISABLED
NULL,
jl_any_type,
jl_emptysvec,
jl_perm_symsvec(16,
jl_perm_symsvec(21,
"next",
"queue",
"storage",
Expand All @@ -3760,8 +3760,13 @@ void jl_init_types(void) JL_GC_DISABLED
"_state",
"sticky",
"_isexception",
"priority"),
jl_svec(16,
"priority",
"is_timing_enabled",
"first_enqueued_at",
"last_dequeued_at",
"cpu_time_ns",
"wall_time_ns"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these go in a separate struct called metrics or something @vtjnash? We won't be able to initialize that struct type before this, so it would have to be jl_any_type, but that would mean we'd have to alloc this in jl_new_task. I don't know how/if we can do an embedded struct here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be a separate struct inlined here, but I don't see much value to that. More valuable would probably be marking each of these as atomic though (in jltypes.c) so that observing them isn't UB. Most reads can probably be monotonic, but some writes are probably release and some reads might be acquires

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay on the separation concern. And yeah, I guess it is necessary for these to be atomic. Will discuss with @nickrobinson251.

jl_svec(21,
jl_any_type,
jl_any_type,
jl_any_type,
Expand All @@ -3777,7 +3782,12 @@ void jl_init_types(void) JL_GC_DISABLED
jl_uint8_type,
jl_bool_type,
jl_bool_type,
jl_uint16_type),
jl_uint16_type,
jl_bool_type,
jl_uint64_type,
jl_uint64_type,
jl_uint64_type,
jl_uint64_type),
jl_emptysvec,
0, 1, 6);
XX(task);
Expand Down
10 changes: 10 additions & 0 deletions src/julia.h
Original file line number Diff line number Diff line change
Expand Up @@ -2289,6 +2289,16 @@ typedef struct _jl_task_t {
// uint8_t padding1;
// multiqueue priority
uint16_t priority;
// flag indicating whether or not this to measure the wall and cpu time of this task
uint8_t is_timing_enabled;
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
// timestamp this task first entered the run queue (TODO: int32 of ms instead?)
uint64_t first_enqueued_at;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's valuable to also have created_at?

It might be interesting in some cases to see the difference between the created-at and first-scheduled-at timestamps? Maybe someone would be surprised to see that gap being large? Not sure though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, maybe, but i didn't particularly want to add things i wasn't adding a public API for... what do you think? worth adding? if so, worth adding a public API?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 that it should either have a public API or not exist at all.

🤔 My instinct was to say "just add it," but i know that we want to keep the struct smallish if possible for data-locality so I think you're right to leave it off unless we have a use for it.

// timestamp this task was most recently scheduled to run
uint64_t last_dequeued_at;
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
// time this task has spent running; updated when it yields
uint64_t cpu_time_ns;
nickrobinson251 marked this conversation as resolved.
Show resolved Hide resolved
// time between first entering the run queue and being done/failed.
uint64_t wall_time_ns;

// hidden state:
// cached floating point environment
Expand Down
3 changes: 3 additions & 0 deletions src/julia_internal.h
Original file line number Diff line number Diff line change
Expand Up @@ -287,6 +287,9 @@ extern JL_DLLEXPORT _Atomic(uint8_t) jl_measure_compile_time_enabled;
extern JL_DLLEXPORT _Atomic(uint64_t) jl_cumulative_compile_time;
extern JL_DLLEXPORT _Atomic(uint64_t) jl_cumulative_recompile_time;

// Global *atomic* integer controlling *process-wide* task timing.
extern JL_DLLEXPORT _Atomic(uint8_t) jl_task_timing_enabled;

#define jl_return_address() ((uintptr_t)__builtin_return_address(0))

STATIC_INLINE uint32_t jl_int32hash_fast(uint32_t a)
Expand Down
Loading
Loading