-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Abuse of TRY_READ for general error handling can lead to crashes and other problems #2487
Comments
Long rant follows, sadly not yet helpful -- but writing it helped me clear my mind in parts, and perhaps some of it is helpful for some people anyway. I'd like to try and write some documentation based on what I learned one of these days, and then hopefully we can together refine it and learn from each other's partial knowledge of the system to piece it all together. And perhaps e.g. @stevelinton knows / remembers some more things and can add them. First off, I named Back when I turned I believe the original purpose of
Originally (as in: back in 1997), with just two exceptions, And that's essentially how it was and where it stopped; the only uses of As for Finally, back then, But then And so, I think today, the error handling system of GAP is a rather complicated mystery. I recently started to understand it better, which lead to some fixes where I put I'll think about the issue and write some more later, but for now, have to go. |
This imitates a change already in the master branch.
This imitates a change already in the master branch.
A workaround to prevent this crash (which imitates a change already in the master branch) has been added to |
OutputLog
points to a stream which uses a stream
I changed the title. To elaborate a bit further on the problem here: But "rethrowing" only makes sense if But if the call to the REPL was The crash does not normally happen in master because the "rethrow" part does not get activated; but it would be, if the recursive call to But here it gets activated, because So this mess has to be cleaned up. While looking at that code, I noticed another problem: More generally: handling errors while currently handling an error is a rather delicate matter... |
If one replaces the last line of the code at the top by just |
While working on Sagemath interface to libgap, I got bitten by infinite recursion that happens if Sagemath's (based on GAP 4.8.6) Not sure whether this needs another issue. |
I've spent quite a bit of time understanding GAP's error handling and have found many of the same problems already discussed on this issue. I think overall the situation is not so bad, there are just a few inconsistencies and areas where a bit more care is needed. I have some ideas for how to improve and clean up the situation, but it will be a non-trivial pile of changes (albeit, I think, not overly complex either). I could take a stab at it and post a PR but it will take me a bit of time to get right, and I don't want to step on any toes or work at cross-purposes with any existing plans you might have for this. |
The main problem (from experience) with trying to clean up GAP's error handling is that it's quite easy to end up breaking how GAP's break loop currently works. A giant pile of changes is extremely likely to break something important for some user or package, and we can't do a "big bang" change to packages. Of course, I'm also happy to see work in this area, but I'd definately write out a high-level plan of how you think the behaviour, both internally and externally, would change, before you starting writing code. |
That's not a big problem because for library use the break loop is effectively irrelevant--undesired even. Most of said giant pile of changes would actually be somewhat limited in scope I think, but would just require changes in a few dozen places in the kernel code, but would barely touch package code. The only two packages I found that really muck with error handling insofar as they mess with the I think that what I have in mind is a still just a tad too fuzzy to give a precise high-level plan. I think what I might do if nobody minds is to hack on it a bit and work out the details a bit better, and then write up a "high-level plan" retroactively once I think I have something that will work. But to be clear, I don't think I will be proposing very vast, sweeping changes: This will mostly be in service just to the libgap API level of things. |
To give a very fuzzy overview:
That's a very high-level, very fuzzy overview of what I have in mind to do. The details will emerge better once I start tooling around in there... |
Just wanted to mention that things got quite a bit better recently, with the introduction of GAP_TRY/GAP_CATCH and a lot of refactoring. Not saying it is perfect now, but some weird corner cases in code were resolved, and some semantics and assumptions etc. cleaned up |
Thanks. Perhaps we can use this in Sage's libgap interface. |
@dimpase you shouldn't have to if you are using |
I consider this resolved for the time being, in so far as that the code is now (IMHO) much cleaner and clearer and saner. That is, I don't claim it's bug free (I am sure it is not); however, in order to improve it, we need more concrete information on actual issues, as I already fixed the hypothetical ones I could think of ;-). |
Consider the following code:
with
gap-4.9.1
:with
master
I have traced this down to the following cause:
Ultimately
PRINT_OR_APPEND_TO_STREAM
is called when outputting the error message to the output log stream, and it contains aTRY_READ
wrapper. In 4.9.1STATE(NrError)
is>0
at this point, soTRY_READ
immediately bails. Inmaster
@fingolfin moved the increment ofSTATE(NrError)
to after the error message is printed, so the problem doesn't occur.I suspect this change was not made intentionally to prevent this bug (after all GAP should not be crashing). I still don't understand the
TRY_READ
mechanism well enough to know what the correct fix is, probably @fingolfin knows better.Incidentally
PRINT_OR_APPEND
does not wrap the same code intoTRY_READ
as the stream variant...The text was updated successfully, but these errors were encountered: