Make the TrioInternalError experience less confusing #1056

njsmith · 2019-05-17T04:06:28Z

We've had several bug reports recently where the filers basically said "I got this TrioInternalError, I know it's not a bug in Trio but rather 100% my fault for breaking things, but it said to file a bug so I'm doing that":

At one level this is fine – a very common mistake that programmers make is to assume that when something goes wrong, it's their fault, when it's actually a fixable bug in the software they're using. I'd rather get some ambiguous cases filed in our tracker than miss out on real bugs. #882 is a good example. In that case it turned out that the reporter was implementing the context manager protocol by hand and their implementation had a bug in it, so the immediate problem was a bug in their code, not in Trio – but this wasn't obvious at first.

Still, it seems like the error message could be clearer and more accurate.

Cases where you get TrioInternalError:

When there's a bug in Trio
When you raise an exception from a place where the docs explicitly say you aren't allowed to raise an exception (especially spawn_system_task, TrioToken.run_sync_soon, but also potentially from weird places like the abort callback in wait_task_rescheduled)
When you do something that violates Trio's assumptions, like calling __enter__/__exit__ methods in the wrong order.

Ideally, when people misuse the API, they should get an error telling them that, like a RuntimeError or TypeError or something, not a TrioInternalError. (For example, I guess #882 is now basically "can we give a more specific error message when people do weird things with cancel scope/nursery entry/exit".) But we can't do that 100% of the time. Sometimes we can detect that something weird happened that we didn't anticipate, but by definition this means we don't know why it happened – could be a bug, could be someone messing with us, there's no general way to tell. #1055 is an example – @oremanj diagnosed that it has mis-nested cancel scopes, but in general async_fn().send(None) is going to create very weird internal corruption (consider trio.hazmat.wait_readable(fd).send(None) – Trio will think that the task has suspended itself and should be resumed when the fd becomes readable, but in fact the task isn't suspended at all and is going to sleep somewhere else instead...).

The spawn_system_task and run_sync_soon and abort callbacks cases are slightly different: in all of these cases, an exception means that there's a bug in the code that used those APIs. And since these are basically internal APIs that are also public, then sometimes that user is Trio itself, and sometimes it's a user. Of course when they were created it was 100% internal users, because Trio had no users, so that was the case I optimized for :-).

So here's some possible action items:

Make the generic TrioInternalError message a little more accurate/informative. Maybe: "Trio encountered an unexpected error. This could be because someone did something unsupported, or it could be because of a bug in Trio. If you're not sure, please file a bug!"
Try to be more systematic about distinguishing between internal bugs in Trio versus API misuse. For example:
- If we could catch bad nesting of cancel scopes/nurseries, that would be great, since it seems to come up a lot. (And tell people to use @asynccontextmanager!)
- We could tweak the user-visible versions of spawn_system_task and run_sync_soon so that if user code crashes when it's not supposed to, you get a different error (not TrioInternalError). I'm not sure if this comes up often enough in practice to be worth worrying about though.

The text was updated successfully, but these errors were encountered:

Zac-HD · 2019-05-20T01:55:04Z

We could tweak the user-visible versions of spawn_system_task and run_sync_soon so that if user code crashes when it's not supposed to, you get a different error.

I'd be strongly in favor of this one! Getting comprehensible and actionable error messages is one of my favorite things about Trio, and "I just violated an invariant I didn't know about" is one of the times when it's most important. That this is rare IMO makes it more important, because it's so much easier to get stuck on without informative feedback.

My enthusiasm for detecting and warning (or erroring) on API misuse is much more general, and detecting nesting issues would be great, but this one seems like an easy win 😄

When exiting an outer cancel scope before some inner cancel scope nested inside it, Trio would previously raise a confusing chain of internal errors. One common source of this misbehavior was an async generator that yielded within a cancel scope. Now the mis-nesting condition will be noticed and in many cases the user will receive a single RuntimeError with a useful traceback. This is a best-effort feature, and it's still possible to fool Trio if you try hard enough, such as by closing a nursery's cancel scope from within some child task of the nursery. Fixes python-trio#882. Relevant to python-trio#1056, python-trio#264.

oremanj added debugging docs user happiness labels May 17, 2019

oremanj mentioned this issue May 21, 2019

Attempt to recover from cancel scope nesting violations #1061

Merged

njsmith mentioned this issue Jun 10, 2019

Test suite explodes if a nursery's __aexit__ doesn't get called python-trio/pytest-trio#82

Closed

njsmith mentioned this issue Jun 22, 2019

Proposal: make checkpoint_if_cancelled() sync-colored and rename it accordingly #961

Open

jtrakk mentioned this issue Nov 15, 2019

AssertionError when I exit() semi-inside a nursery #1305

Closed

This was referenced May 12, 2020

Two TrioInternalErrors #1132

Closed

TrioInternalError in quickstart echo_server_handler test #1324

Closed

njsmith mentioned this issue May 27, 2020

Maybe use a more efficient wakeup mechanism #1554

Open

njsmith mentioned this issue Jun 10, 2020

We need a mechanism to crash Trio with a given exception; how should it work? #1607

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make the TrioInternalError experience less confusing #1056

Make the TrioInternalError experience less confusing #1056

njsmith commented May 17, 2019

Zac-HD commented May 20, 2019

Make the TrioInternalError experience less confusing #1056

Make the TrioInternalError experience less confusing #1056

Comments

njsmith commented May 17, 2019

Zac-HD commented May 20, 2019