Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the TrioInternalError experience less confusing #1056

Open
njsmith opened this issue May 17, 2019 · 1 comment
Open

Make the TrioInternalError experience less confusing #1056

njsmith opened this issue May 17, 2019 · 1 comment

Comments

@njsmith
Copy link
Member

njsmith commented May 17, 2019

We've had several bug reports recently where the filers basically said "I got this TrioInternalError, I know it's not a bug in Trio but rather 100% my fault for breaking things, but it said to file a bug so I'm doing that":

At one level this is fine – a very common mistake that programmers make is to assume that when something goes wrong, it's their fault, when it's actually a fixable bug in the software they're using. I'd rather get some ambiguous cases filed in our tracker than miss out on real bugs. #882 is a good example. In that case it turned out that the reporter was implementing the context manager protocol by hand and their implementation had a bug in it, so the immediate problem was a bug in their code, not in Trio – but this wasn't obvious at first.

Still, it seems like the error message could be clearer and more accurate.

Cases where you get TrioInternalError:

  • When there's a bug in Trio
  • When you raise an exception from a place where the docs explicitly say you aren't allowed to raise an exception (especially spawn_system_task, TrioToken.run_sync_soon, but also potentially from weird places like the abort callback in wait_task_rescheduled)
  • When you do something that violates Trio's assumptions, like calling __enter__/__exit__ methods in the wrong order.

Ideally, when people misuse the API, they should get an error telling them that, like a RuntimeError or TypeError or something, not a TrioInternalError. (For example, I guess #882 is now basically "can we give a more specific error message when people do weird things with cancel scope/nursery entry/exit".) But we can't do that 100% of the time. Sometimes we can detect that something weird happened that we didn't anticipate, but by definition this means we don't know why it happened – could be a bug, could be someone messing with us, there's no general way to tell. #1055 is an example – @oremanj diagnosed that it has mis-nested cancel scopes, but in general async_fn().send(None) is going to create very weird internal corruption (consider trio.hazmat.wait_readable(fd).send(None) – Trio will think that the task has suspended itself and should be resumed when the fd becomes readable, but in fact the task isn't suspended at all and is going to sleep somewhere else instead...).

The spawn_system_task and run_sync_soon and abort callbacks cases are slightly different: in all of these cases, an exception means that there's a bug in the code that used those APIs. And since these are basically internal APIs that are also public, then sometimes that user is Trio itself, and sometimes it's a user. Of course when they were created it was 100% internal users, because Trio had no users, so that was the case I optimized for :-).

So here's some possible action items:

  • Make the generic TrioInternalError message a little more accurate/informative. Maybe: "Trio encountered an unexpected error. This could be because someone did something unsupported, or it could be because of a bug in Trio. If you're not sure, please file a bug!"

  • Try to be more systematic about distinguishing between internal bugs in Trio versus API misuse. For example:

    • If we could catch bad nesting of cancel scopes/nurseries, that would be great, since it seems to come up a lot. (And tell people to use @asynccontextmanager!)
    • We could tweak the user-visible versions of spawn_system_task and run_sync_soon so that if user code crashes when it's not supposed to, you get a different error (not TrioInternalError). I'm not sure if this comes up often enough in practice to be worth worrying about though.
@Zac-HD
Copy link
Member

Zac-HD commented May 20, 2019

We could tweak the user-visible versions of spawn_system_task and run_sync_soon so that if user code crashes when it's not supposed to, you get a different error.

I'd be strongly in favor of this one! Getting comprehensible and actionable error messages is one of my favorite things about Trio, and "I just violated an invariant I didn't know about" is one of the times when it's most important. That this is rare IMO makes it more important, because it's so much easier to get stuck on without informative feedback.

My enthusiasm for detecting and warning (or erroring) on API misuse is much more general, and detecting nesting issues would be great, but this one seems like an easy win 😄

oremanj added a commit to oremanj/trio that referenced this issue May 21, 2019
When exiting an outer cancel scope before some inner cancel scope nested inside it, Trio would previously raise a confusing chain of internal errors.
One common source of this misbehavior was an async generator that yielded within a cancel scope.
Now the mis-nesting condition will be noticed and in many cases the user will receive a single RuntimeError with a useful traceback.
This is a best-effort feature, and it's still possible to fool Trio if you try hard enough, such as by closing a nursery's cancel scope
from within some child task of the nursery.

Fixes python-trio#882. Relevant to python-trio#1056, python-trio#264.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants