-
Notifications
You must be signed in to change notification settings - Fork 3k
Make trace:info(S, MFA, call_time | call_memory) not block all schedulers #10207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
CT Test Results 3 files 135 suites 49m 37s ⏱️ Results for commit 357949a. ♻️ This comment has been updated with latest results. To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally. Artifacts// Erlang/OTP Github Action Bot |
Don't spend time and heap memory on items not asked for.
in preparation for block-less trace:info.
Save memory in most cases, but impose extra cpu (time) when tracing.
It should always be erts_active_bp_index ^ 1. Avoid tiny race when we had to change two atomics. We only call erts_staging_bp_ix() when we change trace settings.
84b9e53
to
5fee623
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the implementation of trace:info/3
and erlang:trace_info/2
to make them non-blocking when collecting call_time
and call_memory
counters, addressing the issue of these functions blocking all scheduler threads.
- Refactored breakpoint data structures to eliminate the
erts_staging_bp_index
variable - Implemented non-blocking trace info collection using code barriers and process suspension/resumption
- Reorganized hash table structures and functions for call time/memory tracing
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
erts/emulator/test/trace_call_time_SUITE.erl | Added new test cases to verify non-blocking behavior and robustness against process termination |
erts/emulator/nifs/common/prim_tty_nif.c | Fixed assertion to include zero in valid character range |
erts/emulator/beam/erl_trace.h | Removed staging breakpoint index variable |
erts/emulator/beam/erl_trace.c | Removed staging breakpoint index variable |
erts/emulator/beam/erl_init.c | Moved trace initialization after emulator initialization |
erts/emulator/beam/erl_bif_trace.c | Major refactor to implement non-blocking trace info collection with trapping mechanisms |
erts/emulator/beam/beam_bp.h | Refactored breakpoint data structures and function signatures |
erts/emulator/beam/beam_bp.c | Implemented new hash table management and non-blocking collection algorithms |
erts/emulator/beam/atom.names | Added new atom for trace info finish export |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Instead use this strategy 0. Seize code mod permission (as before) 1. Allocate temporary zeroed hashes for any traced calls that may happen during the call to trace:info. 2. Thread progress 3. Switch bp index to make the temp hashes active. 4. Thread progress. 5. Collect stats from the real hashes that are now unused and stable. 6. Switch back bp index to make the real hashes active again. 7. Thread progress. 8. Consolidate by collecting stats from the temp hashes into the active generation. 9. Deallocate the temp hashes and make the two halves of the breakpoint identical again using the same real hashes. 10. Build result from stats collected in step 5 11. Release code mod permission
baded75
to
f41cc8d
Compare
c9cfc5e
to
c8cb8dd
Compare
with some talk about sessions and call_time and call_memory.
c8cb8dd
to
357949a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
default: | ||
goto error; | ||
erts_exit(ERTS_ABORT_EXIT, "Invalid key\n"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message 'Invalid key' is not helpful for debugging. It should specify what constitutes a valid key or include the actual invalid key value that was encountered.
Copilot uses AI. Check for mistakes.
erts_refc_init(&bdt->refc, 1); | ||
for (Uint i = 0; i < n; i++) { | ||
bp_hash_init(&(bdt->hash[i]), 32); | ||
bdt->threads[i] = NULL; // allocate on demand |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C++ style comment should be replaced with C style comment /* allocate on demand */ to maintain consistency with the C codebase.
bdt->threads[i] = NULL; // allocate on demand | |
bdt->threads[i] = NULL; /* allocate on demand */ |
Copilot uses AI. Check for mistakes.
sys_memzero(hash, size); | ||
hash->n = n; | ||
hash->used = 0; | ||
|
||
hash->item = (bp_data_trace_item_t *)Alloc(size); | ||
sys_memzero(hash->item, size); | ||
|
||
for(i = 0; i < n; ++i) { | ||
hash->item[i].pid = NIL; | ||
for(Uint i = 0; i < n; ++i) { | ||
hash->buckets[i].pid = NIL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hash table allocation uses sys_memzero
to zero the entire structure including buckets, but then immediately sets hash->buckets[i].pid = NIL
in a loop. This double initialization is inefficient - either use calloc-equivalent allocation or skip the memzero and only initialize necessary fields.
Copilot uses AI. Check for mistakes.
Another step in the quest of purging ERTS from BIFs that block all scheduler threads.
This affects both the new
trace:info/3
and the olderlang:trace_info/2
when they are called to collectcall_time
and/orcall_memory
counters.