WIP thread cpu time #175
uint64_t scheduler_time;
uint64_t lock_spin_time;
uint64_t gc_time;
} jl_timing_tls_states_t;
TODO: should this be like `GC_Num` and have a corresponding struct on the Julia side, so that on that side we work with the struct rather than with individual numbers?
downside: any user-facing struct is impossible to expand later, so we probably just want to expose functions that return numbers... we could still pass the data from the C side to the Julia side as a struct, but idk if that gains us much tbh
base/timing.jl
Outdated
thread_up_time() = ccall(:jl_thread_up_time, UInt64, ())
thread_user_time() = ccall(:jl_thread_user_time, UInt64, ())
# thread_user_time(tid::Integer) = ccall(:jl_thread_user_time, UInt64, (Cint,), Cint(tid))
# function thread_user_time(pool::Symbol=:all)
i'm thinking an interface like this, where you can optionally return stats by threadpool, makes sense?
we also need to change this to be able to return specific stats (like `sleep_time`, `gc_time`, etc.) rather than just `user_time`, and i'm planning to do the aggregation on the Julia side, as you can see
@@ -523,6 +527,7 @@ JL_DLLEXPORT jl_task_t *jl_task_get_next(jl_value_t *trypoptask, jl_value_t *q,
assert(jl_atomic_load_relaxed(&ptls->sleep_check_state) == not_sleeping);
uv_mutex_unlock(&ptls->sleep_lock);
JULIA_DEBUG_SLEEPWAKE( ptls->sleep_leave = cycleclock() );
ptls->timing_tls.sleep_time += jl_hrtime() - tsleep0;
currently `sleep_time` is a subset of `scheduler_time` (so we might want to rename `scheduler_time` to make that clear, or do the extra accounting so that we stop accumulating `scheduler_time` when we start accumulating `sleep_time`?)
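The two options above can be sketched roughly as follows. This is only an illustration under assumptions: `sched_timing_t`, `scheduler_busy_time` and `record_scheduler_pass_exclusive` are hypothetical names, not part of this PR, and the timestamps are plain nanosecond values passed in by the caller rather than reads of `jl_hrtime()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two counters discussed above (nanoseconds). */
typedef struct {
    uint64_t sleep_time;
    uint64_t scheduler_time;
} sched_timing_t;

/* Option 1: keep sleep_time nested inside scheduler_time and derive the
 * non-sleeping part only when the counters are read. */
uint64_t scheduler_busy_time(const sched_timing_t *t)
{
    assert(t->scheduler_time >= t->sleep_time);
    return t->scheduler_time - t->sleep_time;
}

/* Option 2: do the extra accounting at record time, so the two counters
 * never overlap. One scheduler pass is described by four timestamps:
 * entering the scheduler, going to sleep, waking up, leaving the scheduler. */
void record_scheduler_pass_exclusive(sched_timing_t *t,
                                     uint64_t enter, uint64_t sleep_start,
                                     uint64_t sleep_end, uint64_t leave)
{
    assert(enter <= sleep_start && sleep_start <= sleep_end && sleep_end <= leave);
    t->sleep_time += sleep_end - sleep_start;
    /* Only the time before sleeping and after waking counts as scheduler time. */
    t->scheduler_time += (sleep_start - enter) + (leave - sleep_end);
}
```

With option 1 nothing changes in the hot path and only the name is misleading; option 2 costs some extra arithmetic per pass but lets the counters be summed without double counting.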
src/threading.c
Outdated
{
    jl_ptls_t ptls = jl_current_task->ptls;
    jl_timing_tls_states_t *timing = &ptls->timing_tls;
    return jl_thread_up_time() - timing->gc_time - timing->lock_spin_time;
should be
- return jl_thread_up_time() - timing->gc_time - timing->lock_spin_time;
+ return jl_thread_up_time() - timing->gc_time - timing->lock_spin_time - timing->scheduler_time;
but also maybe this isn't the right API, and we should instead have a `jl_thread_timing_stats(int tid)` that populates a struct, and do all the arithmetic on the Julia side
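A minimal sketch of that alternative shape, assuming a fixed-size table of per-thread counters. Everything here is hypothetical: `thread_timing_stats_t`, `all_stats`, `MAX_THREADS`, `thread_timing_stats` and `user_time_from_stats` are illustration names, not the runtime's actual API. In the proposal the final subtraction would live on the Julia side; it is written in C here only so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of one thread's counters, all in nanoseconds. */
typedef struct {
    uint64_t up_time;
    uint64_t sleep_time;
    uint64_t scheduler_time; /* assumed to include sleep_time */
    uint64_t lock_spin_time;
    uint64_t gc_time;
} thread_timing_stats_t;

/* Stand-in for the runtime's per-thread state (hypothetical). */
#define MAX_THREADS 4
static thread_timing_stats_t all_stats[MAX_THREADS];

/* Copy the counters for thread `tid` into `out`.
 * Returns 0 on success, -1 if `tid` is out of range or `out` is NULL. */
int thread_timing_stats(int tid, thread_timing_stats_t *out)
{
    if (tid < 0 || tid >= MAX_THREADS || out == NULL)
        return -1;
    *out = all_stats[tid];
    return 0;
}

/* The arithmetic the caller would do on the snapshot: up time minus GC,
 * lock-spin and scheduler time, clamped at zero. */
uint64_t user_time_from_stats(const thread_timing_stats_t *s)
{
    assert(s != NULL);
    uint64_t overhead = s->gc_time + s->lock_spin_time + s->scheduler_time;
    return s->up_time > overhead ? s->up_time - overhead : 0;
}
```

Because the caller only receives a copy of raw counters, new derived stats can be added later without touching the C entry point.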
src/threading.c
Outdated
while (1) {
    if (owner == NULL && jl_atomic_cmpswap(&lock->owner, &owner, self)) {
        lock->count = 1;
        jl_profile_lock_acquired(lock);
        jl_record_lock_spin_time(jl_hrtime() - t0);
currently time spent in runtime-internal spin locks and in Julia-side `SpinLock`s accumulates into the same field... idk if we want to separate those (i guess some uses of Julia-side `SpinLock`s are "internal", not just in user code, so i'm leaning towards keeping them both accumulating into the same field)
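If the split ever turns out to be useful, one cheap middle ground is to keep separate buckets internally and still report a single total. A rough sketch, with hypothetical names (`spin_kind_t`, `spin_timing_t`, `record_spin_time`, `total_spin_time`) that are not in this PR:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lock categories; the PR currently uses a single field. */
typedef enum { SPIN_RUNTIME = 0, SPIN_JULIA = 1, SPIN_KIND_COUNT = 2 } spin_kind_t;

typedef struct {
    uint64_t spin_time[SPIN_KIND_COUNT]; /* nanoseconds per category */
} spin_timing_t;

/* Add `dt` nanoseconds of spinning to the bucket for `kind`. */
void record_spin_time(spin_timing_t *t, spin_kind_t kind, uint64_t dt)
{
    assert((int)kind >= 0 && (int)kind < SPIN_KIND_COUNT);
    t->spin_time[kind] += dt;
}

/* Combined value, equivalent to what a single shared field would hold. */
uint64_t total_spin_time(const spin_timing_t *t)
{
    uint64_t total = 0;
    for (int k = 0; k < SPIN_KIND_COUNT; k++)
        total += t->spin_time[k];
    return total;
}
```

The combined total stays the only thing exposed, so dropping the split later would not change what callers see.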
static uint64_t jl_thread_start_time;
void jl_set_thread_start_time(void)
{
    jl_thread_start_time = jl_hrtime();
this is a global shared by all threads, which technically isn't correct since threads will start at very slightly different times, but i think this is fine at least for a first pass?
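For comparison, a per-thread version could look something like the sketch below. It is a standalone illustration, not the runtime's code: it uses C11 `_Thread_local` and POSIX `clock_gettime(CLOCK_MONOTONIC)` as stand-ins for the per-thread state and `jl_hrtime()`, and the function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds (stand-in for jl_hrtime()). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* One start timestamp per thread instead of one shared global. */
static _Thread_local uint64_t thread_start_time;
static _Thread_local bool thread_start_recorded;

/* Call once at the top of each thread's entry function. */
void set_thread_start_time(void)
{
    thread_start_time = now_ns();
    thread_start_recorded = true;
}

/* Nanoseconds since the calling thread recorded its start time. */
uint64_t thread_up_time(void)
{
    assert(thread_start_recorded);
    return now_ns() - thread_start_time;
}
```

The cost over the global is one thread-local slot per thread, which is why a field in the existing per-thread state would be the natural home for it.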
return task;
}
this is the fast path for task switches (i think?), but i think we've concluded this shouldn't add too much overhead (given `jl_hrtime` is a vDSO call)... still need to verify that experimentally though
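A quick way to get a first number is a tight loop around the clock read. The sketch below is a standalone micro-benchmark, assuming `jl_hrtime` bottoms out in something like `clock_gettime(CLOCK_MONOTONIC)` on Linux; `now_ns` and `clock_read_cost_ns` are hypothetical helper names. It measures only the clock read itself, not its effect on a real task switch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds (stand-in for jl_hrtime()). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average cost of one clock read, in nanoseconds, over `iters` reads. */
double clock_read_cost_ns(uint64_t iters)
{
    assert(iters > 0);
    volatile uint64_t sink = 0; /* keeps each result observably used */
    uint64_t t0 = now_ns();
    for (uint64_t i = 0; i < iters; i++)
        sink = now_ns();
    uint64_t t1 = now_ns();
    (void)sink;
    return (double)(t1 - t0) / (double)iters;
}
```

Calling `clock_read_cost_ns(1000000)` on the target machine gives a per-call figure to compare against the measured cost of a task switch; the result depends on hardware, kernel and clock source, so no expected value is given here.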
@@ -3798,6 +3798,8 @@ JL_DLLEXPORT void jl_gc_collect(jl_gc_collection_t collection)
jl_safepoint_end_gc();
jl_gc_state_set(ptls, old_state, JL_GC_STATE_WAITING);
JL_PROBE_GC_END();
// Time how long GC took.
ptls->timing_tls.gc_time += jl_hrtime() - t1;
GC time for the thread coordinating the GC
@@ -173,6 +176,7 @@ void jl_safepoint_wait_gc(void)
uv_cond_wait(&safepoint_cond, &safepoint_lock);
uv_mutex_unlock(&safepoint_lock);
}
ptls->timing_tls.gc_time = jl_hrtime() - t0;
GC time for the other threads
src/julia_threads.h
Outdated
uint64_t start_time;
uint64_t sleep_time;
uint64_t scheduler_time;
uint64_t lock_spin_time;
to do: add `compile_time` (which could be a subset of `lock_spin_time` i guess, or we could stop accumulating `lock_spin_time` when `compile_time` starts?)
in future maybe we could split `lock_spin_time` to have timing for a few important internal locks (like the `codegen_lock`), but i think that can be follow-up work?
Stdlib: Tar
URL: https://github.com/JuliaIO/Tar.jl.git
Stdlib branch: master
Julia branch: master
Old commit: 81888a3
New commit: 1114260
Julia version: 1.12.0-DEV
Tar version: 1.10.0 (Does not match)
Bump invoked by: @StefanKarpinski
Powered by: [BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff: JuliaIO/Tar.jl@81888a3...1114260

```
$ git log --oneline 81888a3..1114260
1114260 Accept other string types for all string arguments (fix #179) (#180)
a2e39d6 Bump julia-actions/cache from 1 to 2 (#178)
152d12e Bump julia-actions/setup-julia from 1 to 2 (#177)
5012536 Fix Codecov (#176)
9b5460b Add `public` declarations using `eval` (#175)
4e9d73a Add docstring for Tar module (#173)
38a4bf4 Bump codecov/codecov-action from 3 to 4 (#172)
166deb3 [CI] Switch to `julia-actions/cache` (#171)
d0085d8 Hardcode doc edit backlink (#164)
7e83ed7 [NFC] fix some wonky formatting (#168)
6269b5b Bump actions/checkout from 3 to 4 (#163)
```

Co-authored-by: Dilum Aluthge <dilum@aluthge.com>
PR Description
What does this PR do?
WIP on https://relationalai.atlassian.net/browse/RAI-29088
Checklist
Requirements for merging:
`port-to-*` labels that don't apply.