Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reinstall GAP SIGINT handler when Julia is started #3256

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

sebasguts
Copy link
Member

No description provided.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.0003%) to 85.001% when pulling 3a2534e on sebasguts:sg/reenable_signal_handling_julia_gc into 3327450 on gap-system:master.

@fingolfin fingolfin added topic: kernel release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes backport-to-4.10 topic: julia Julia GC integration and related matters labels Feb 1, 2019
@@ -659,6 +660,7 @@ void InitBags(UInt initial_size, Bag * stack_bottom, UInt stack_align)
max_pool_obj_size = jl_gc_max_internal_obj_size();
jl_gc_enable_conservative_gc_support();
jl_init();
SyInstallAnswerIntr();
Copy link
Member

@fingolfin fingolfin Feb 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so let's see, this is the code for SyInstallAnswerIntr:

void SyInstallAnswerIntr ( void )
{
    if ( signal( SIGINT, SIG_IGN ) != SIG_IGN )
    {
        signal( SIGINT, syAnswerIntr );
        siginterrupt( SIGINT, 0 );
    }
}

This doesn't do anything if there already is a SIGINT handler installed, does it? But doesn't Julia install a SIGINT handler? Since I assume you tested this, the answer seems to be no; yet Julia's doc/src/devdocs/init.md says this:

Finally sigint_handler() is hooked up to SIGINT and calls jl_throw(jl_interrupt_exception).

Digging some more into signal-unix.c also reveals this:

        if (sig == SIGINT) {
            if (jl_ignore_sigint()) {
                continue;
            }
            else if (exit_on_sigint) {
                critical = 1;
            }
            else {
                jl_try_deliver_sigint();
                continue;
            }
        }

Which suggests to me that it does handle SIGINT, just not via signal() but rather via sigprocmask...
(Funny story on the side: On macOS, the relevant C file is signals-mach.c, which also defines e.g. static void jl_try_deliver_sigint(), but never calls it, and seems to contain no provisions at all to handle SIGINT ?!? And finally signals-win.c does have some, but written in a completely different style?!?). Of course libuv might also be involved somehow?

All of this is to say: I don't quite understand why the above works; and I am slightly worried that it might cause unexpected problems... Like, what happens if our custom SIGINT handler is triggered while we are inside Julia code? Could this leave Julia in an undefined state?

So I guess what I'd like to know is what kind of tests you performed with this modified code... Like, I assume if one starts GAP and loads JuliaInterface in it, but does not use it, everything works fine (Ctrl-C in GAP code enters a break loop). Correct? And what happens if one presses Ctrl-C while Julia code is running? And what happens when one starts Julia and loads GAPJulia, and then presses Ctrl-C, either while Julia code is running, or while GAP code is running? (And I guess ultimately, we also need to think about more deeply nested callchains with ...->GAP code->Julia code->GAP->Julia->...)

None of this is meant to kill of this PR, I simply would like to understand better what it does and why it works, and then perhaps we can put some of it in a comment and/or the commit message?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem that you may be running into is that on macOS, Julia does not install a signal handler, but goes directly for the Mach system calls. When they do that, normal signal handling is subsequently ignored. There are ways to insert yourself in the signal handling process in Julia, but I haven't delved too deep into those.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that signals-mach.c is included by signals-posix.c, and now feel dumb for not considering this possibility earlier 🤣

Copy link
Member

@frankluebeck frankluebeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This is a very useful first step to improve the SIGINT behaviour.
Works for me runnig GAP code in a GAP that uses julia-gc, in both cases, with JuliaInterface loaded or not. Ctrl-C brings me in a brk-loop as usual in GAP.

The TODO becomes obvious when I load JuliaInterface and start a long running julia-function, then Ctrl-C does either nothing, or if I type many of them quickly in a row, I get a segfault.

@sebasguts
Copy link
Member Author

@fingolfin Indeed, I tested it, although, since this was the first quick fix that came to my mind, I did not dig too deeply into it.

As @frankluebeck said, yes, it works, and you already pointed out the reason for it. But it seems like it more or less works by accident, so this maybe needs work. I think it would be still a good idea to reinstall the GAP handler. Using sigprocmask seems a bit of an overkill here, but I have to admit, this is the first time I even hear about this command, and I do not quite understand it yet. It seems to set all signal handling for a process at once, is that right?

When running Julia code, pressing ctrl+c does not do anything, it waits for returning to GAP. So a longer Julia loop would continue, just like GAP C-Code.

I am planning to replace the SIGINT handler in JuliaInterface, by something that both calls InterruptExecStat() and throws InterruptExeption. Both is critical here, though, since it will need cleanup afterwards. But that should be solved in JuliaInterface, as it is the place that calls Julia code, not GAP itself. For this PR, I would not worry what happens when Julia code is called.

@sebasguts
Copy link
Member Author

Quick comment (regarding my plan): Julia handles SIGINT by setting jl_sigint_passed = 1, GAP by STATE(CurrExecStatFuncs) = IntrExecStatFuncs. A common handler should probably set both, and make sure that it is cleaned up afterwards (however that should work). But again, that should be part of JuliaInterface.

@ThomasBreuer
Copy link
Contributor

@sebasguts I do not understand your statement

When running Julia code, pressing ctrl+c does not do anything, it waits for returning to GAP. So a longer Julia loop would continue, just like GAP C-Code.

Do you mean the current/intended behavior when Julia is called from GAP?

Julia without GAP behaves differently, I get an InterruptException when I enter Ctrl-C while some Julia function is running.
If this is not intended when the Julia function was called from GAP,
would it make sense to provide a Julia function that checks whether the user has entered Ctrl-C,
such that one can raise an exception in this situation, and eventually return to the GAP (or break) prompt?

@sebasguts
Copy link
Member Author

Do you mean the current/intended behavior when Julia is called from GAP?

This is the intended behaviour after this PR. Currently, the behaviour is as you describe it.

After this PR, the behaviour will change to not returning until the execution goes back into GAP code (on whatever level of nesting).

I intent to restore the behaviour of having the InterruptExecption thrown with code in JuliaInterface (which has yet to be written).

@fingolfin
Copy link
Member

@sebasguts some remarks:

  • I did not mean to suggest that you or GAP should use sigprocmask, I only mentioned it up as part of me trying to understand the code on the Julia side, and I thought it best to document this here, for future reference (mine at least, though perhaps its useful to others).
  • as to the new "intended" behaviour: why exactly? Wouldn't it be better to let Julia throw that exception as before, but then we catch the exception, and handle it by invoking ErrorMayQuit or ErrorQuit or something like that?

@gap-system gap-system deleted a comment from coveralls Feb 6, 2019
@sebasguts
Copy link
Member Author

sebasguts commented Feb 6, 2019

I did not mean to suggest that you or GAP should use sigprocmask, I only mentioned it up as part of me trying to understand the code on the Julia side, and I thought it best to document this here, for future reference (mine at least, though perhaps its useful to others).

With here you mean as a comment to the line I added?

as to the new "intended" behaviour: why exactly? Wouldn't it be better to let Julia throw that exception as before, but then we catch the exception, and handle it by invoking ErrorMayQuit or ErrorQuit or something like that?

I wanted to keep the changes to GAP as minimal as possible. Without JuliaInterface, GAP does not run Julia code atm, so I thought restoring the previous behaviour would be the easiest way.
I am currently trying to get the common signal handler in JuliaInterface working, and it might be a better solution here to make GAP catch the InterruptException. But that would mean we have to integrate more parts of Julia into GAP. Would that be okay?

@olexandr-konovalov olexandr-konovalov added this to the GAP 4.10.2 milestone Jun 12, 2019
@olexandr-konovalov
Copy link
Member

Any updates @sebasguts ?

@olexandr-konovalov olexandr-konovalov modified the milestones: GAP 4.10.2, GAP 4.10.3 Jun 16, 2019
@ThomasBreuer
Copy link
Contributor

@fingolfin @frankluebeck Is there a chance to have this feature (avoid killing GAP on hitting Ctrl-C) in the next released version of GAP? This would make it easier to recommend the Julia integration to GAP users.

@fingolfin
Copy link
Member

@ThomasBreuer I'll look into getting this into shape when I am back from vacation. We could also work on it in Lambrecht.

@fingolfin fingolfin force-pushed the sg/reenable_signal_handling_julia_gc branch from 3a2534e to b5541bd Compare September 5, 2019 12:08
@codecov
Copy link

codecov bot commented Sep 5, 2019

Codecov Report

Merging #3256 into master will increase coverage by 0.49%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3256      +/-   ##
==========================================
+ Coverage   84.66%   85.16%   +0.49%     
==========================================
  Files         698      696       -2     
  Lines      345819   344234    -1585     
==========================================
+ Hits       292791   293159     +368     
+ Misses      53028    51075    -1953
Impacted Files Coverage Δ
src/julia_gc.c 87.23% <100%> (+4.79%) ⬆️
src/compiled.h 0% <0%> (-100%) ⬇️
lib/csetpc.gi 42.69% <0%> (-45.51%) ⬇️
lib/gprdpc.gi 45.74% <0%> (-42.2%) ⬇️
src/libgap-api.c 25.87% <0%> (-39.77%) ⬇️
lib/autsr.gi 35.35% <0%> (-33.5%) ⬇️
src/sctable.c 70.14% <0%> (-17.46%) ⬇️
lib/gpfpiso.gi 64.14% <0%> (-14.08%) ⬇️
src/hpc/aobjects.h 46.66% <0%> (-13.34%) ⬇️
lib/ctblsolv.gi 59.62% <0%> (-12.59%) ⬇️
... and 364 more

@olexandr-konovalov olexandr-konovalov modified the milestones: GAP 4.10.3, GAP 4.11.1 Apr 28, 2020
@fingolfin
Copy link
Member

Some remarks and update:

  1. In GAP.jl 0.4.0 and later, the "recommended" way to start GAP is via a gap.sh script that actually starts Julia, and in there opens GAP prompt
  2. one can also open a GAP prompt from any Julia session which has loaded GAP.jl, via GAP.prompt()
  3. in that function, I simply ccall into GAP's SyInstallAnswerIntr, so this PR here is not needed
  4. however, this still doesn't work quite right when you have Julia code running ; or more complex calling stacks, like GAP -> Julia -> GAP -> Julia -> GAP. The way I would handle this (and meant to handle it in SingularInterface) back in the day is a bit more complex but also safer than what @sebasguts suggested above: namely, at every "barrier" between the two systems, we'll have to insert code which "catches and rethrows" errors from the other system. That is: the code in GAP which calls into Julia should catch Julia InterruptException, and if it gets one, trigger a GAP "interrupt"; this will either trigger a break loop, or percolate one level up, to the previous "boundary": this time, it'll be Julia code calling to GAP which needs to "catch" the error thrown by GAP, and turn it in to an InterruptException.
  5. For this, we ought to adjust the GAP kernel to make it easy to distinguish between a user interrupt and a genuine error (that shouldn't be hard, just some work).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes topic: julia Julia GC integration and related matters topic: kernel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants