Add profiling of already running tasks via SIGINFO/SIGUSR1 #43179
What do you think of using different keys for this, e.g. […]
Edit: if there is nothing running at the moment (i.e. the REPL is just displaying its prompt), then […]
Emacs reacts to SIGUSR2 by dropping into a debugger, which lets you interrupt a stuck instance. Maybe Julia could also use this signal to trigger the profile (e.g. to interrupt a non-interactive session).
@jonas-schulze what you're suggesting does suit the […]
Regarding […]
I'd be keen to see what other people think after trying this out. It seems to work pretty well, and I just added the report to the sysimage REPL precompiler so that the report is prepared quickly the first time.
Also, with the last commit, if the code has no yield point then the profile is still taken, but the report won't show until the process finishes or is interrupted.
Another option would be to add the profiler output after the standard output of […]
Awesome!
ba4a902 to 17a383d
6a38e9c to 7ffae07
dada8ab to 7901e4c
I think the linux64 failure is real, but I don't understand what's triggering it, or why it only affects that platform.
It doesn't happen locally on linux64, either with or without rr. It's repeatable.
7901e4c to de15e16
What happens if there is a current @profile running and I trigger the 1-second profile?
On BSDs only the stack trace will be printed, like it currently is for […]. Basically, it won't trigger a profile if it's already profiling.
b279cd8 to 3326631
I think I figured out the linux64 problem.
42b89da to 30b1109
Bump @vtjnash 🙏
I'm adding this to the 1.8 milestone, as I think it would be good to get in if approved.
db3fa73 to 0580c3a
Triage was supportive of this, but made the point that the print trigger mechanism should live in Profile rather than Base, so that is now the case.
@JeffBezanson @vtjnash, given the discussion in triage, it would be great if this could get into 1.8 🙏🏻. Please let me know if there are any blocking issues.
660aa19 to 5a6455a
Great, thanks @vchuravy |
…uliaLang#43179)" This reverts commit a9aad97.
…uliaLang#43179)" (JuliaLang#44184) This reverts commit a9aad97.
Updates:
The profile is now triggered by SIGINFO on BSD platforms (via ctrl-t) and by SIGUSR1 on other platforms. Unfortunately, ctrl-z was already used by some to force-terminate julia, which is a more natural use of SIGTSTP, and there's no other cross-platform keymap available. So on MacOS & FreeBSD you can press ctrl-t. On Linux it's triggered by SIGUSR1, e.g. kill -USR1 $julia_pid
[…] SIGINFO/SIGUSR1 already prints.

Say you're in interactive mode, you've started running some mixed async/threaded code, and it's taking too long or something odd is happening. Currently, one option for debugging is to press ctrl-c to interrupt the code and get a stacktrace, but that captures a single point in time, can obscure things happening in different threads/tasks, and terminates the code.

Profiling is useful, but again it would require you to stop the code and re-run it under @profile. Often you want to see what's happening this time around.

This PR is a demo/RFC to introduce ctrl-z to trigger an async 1-second Profile and print the report without stopping the code. The profile report shows the new per-thread breakdown.

Because it doesn't stop the code, it can be run multiple times at the user's discretion, perhaps to profile the different stages of a long-running process.
Some thoughts:
[…] signals-unix.c.
[…] Profile.set_peek_duration(2.5)
Here, the code is run, then a few seconds later I press ctrl-z. The report is printed within a few seconds in this case.
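The platform split described in the updates above (SIGINFO on MacOS & FreeBSD, SIGUSR1 elsewhere) could be wrapped in a small shell helper for scripts that need to trigger the profile from outside the process. This is only a sketch: the function name profile_signal_for is hypothetical and not part of this PR or of Julia, and the list of kernel names is limited to the two BSD-like platforms mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Julia or this PR): given a kernel name as
# printed by `uname -s`, print the name of the signal that triggers the profile.
profile_signal_for() {
    case "$1" in
        # MacOS and FreeBSD: SIGINFO, which is also what ctrl-t sends in a terminal.
        Darwin|FreeBSD) echo INFO ;;
        # Everything else, e.g. Linux: SIGUSR1.
        *) echo USR1 ;;
    esac
}

# Usage, assuming $julia_pid holds the PID of a running julia process:
#   kill -s "$(profile_signal_for "$(uname -s)")" "$julia_pid"
```

Taking the kernel name as an argument, rather than calling uname inside the function, keeps the helper easy to check on any machine.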