Fix `handled` data and differentiate between unhandled errors and crashes

### Problem Statement

_TL;DR: We send wrong data in the `handled` field, we conflate unhandled errors and crashes, and even if we fixed the former we'd still need to find effective ways to differentiate the latter from one another._

Stolen from my comment on https://github.com/getsentry/rfcs/pull/10:

> _Note (just so the second part doesn't get missed): Frustratingly, the GH UI gives no indication of this, but this quoted comment "continues below the fold" (i.e., is scrollable)._
>
> This is decidedly broken in JS, in a number of different ways.
> 
> On the browser side of things:
> - True crashes (the kind that land you here: chrome://crash/) are pretty rare. That's fortunate, because at that point the JS engine has gone down in flames (taking Sentry along with it) so we have no good way to record what's happened. (There's no publicly-available native browser API to capture such an event the way you can with, say, a minidump.)
> - Nonetheless, we mark some errors as unhandled, specifically those which bubble up to the [global handlers](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/browser/src/integrations/globalhandlers.ts#L245). While it's true they haven't been handled (unless the user has add a global `onerror` handler themselves, which is not something we currently try to detect), they're also not crashes. (They show up in release health metrics that way, though.)
> - Errors which are caught by our auto-instrumentation ([1](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/browser/src/integrations/trycatch.ts#L117), [2](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/browser/src/integrations/trycatch.ts#L138), [3](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/browser/src/integrations/trycatch.ts#L164), [4](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/browser/src/integrations/trycatch.ts#L224), [5](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/browser/src/integrations/trycatch.ts#L243)) are marked as handled, even though they have not, in fact, been handled by the user.
> - Events which _are_ handled by the user are correctly [marked as being handled](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/browser/src/eventbuilder.ts#L157).
> 
> So the data is wrong in many cases, but it's also kind of arbitrary and divorced from reality, in both directions: It's totally possible for some browser extension the user has installed to be blowing up the console with unhandled errors, causing all of the sessions in the main web app to be marked as crashed, even though the user is blissfully unaware that anything's wrong (and the erroring code has nothing to do with the app integrating sentry). It's also possible for, say, a click handler to error, the site to therefore become unresponsive to at least some user interactions, and for our auto-instrumentation to catch it and mark it handled (and therefore a healthy session), even though the user's experience is of an app which has frozen up.
> 
> On the node side, we have both the same (too much `handled`) and the opposite (too much `unhandled`) problem:
> - True crashes (anything which causes the main process to exit) do happen, but [only sometimes trigger the `beforeExit` event](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/node/src/sdk.ts#L262-L274). We also sometimes [purposefully exit the process](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/node/src/integrations/utils/errorhandling.ts#L11-L37).
> - Similar to the browser SDK, the node SDK marks [unhandled errors](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/node/src/integrations/onuncaughtexception.ts#L83) and [rejections](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/node/src/integrations/onunhandledrejection.ts#L54) which make it to the global handler as unhandled, but depending on `Sentry.init` settings, this [may](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/node/src/integrations/onuncaughtexception.ts#L58-L67) or [may not ](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/node/src/integrations/onunhandledrejection.ts#L74-L79) result in the process being force-exited. Also similar to browser-land, all other errors [are marked handled](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/node/src/eventbuilder.ts#L54-L57), whether they were caught by the user or by auto-instrumentation.
> - If the node SDK is being run inside the serverless SDK, though, [all errors are considered unhandled](https://github.com/getsentry/sentry-javascript/blob/584c4cbebde327123e9d97728b696a4872a6d248/packages/serverless/src/utils.ts#L12-L18), regardless of whether or not the user catches the error.
> 
> Ultimately, there are really three problems/challenges/considerations/whatever that we're dealing with:
> 
> - We'd like some way for the data model to distinguish between true crashes and unhandled errors (the main subject of this RFC).
> - As has been alluded to in other comments, making changes here has consequences for both the UI (the red 💀 badge) and downstream data (session statuses and the resulting crash-free rate for a given release).
> - We should be emitting values which better reflect reality (by itself not a hard change, but one which needs the above two issues to get figured out first).
> 
> As a result, anything we do is going to have to involve not only SDK folks but also product folks (together making a decision) and then design/UI folks, and possibly ingest folks as well (in order to fully implement any changes).
> 
> Relevant issues:
> https://github.com/getsentry/sentry-javascript/issues/5408
> https://github.com/getsentry/sentry-javascript/issues/5375

Other relevant bits of that thread:
https://github.com/getsentry/rfcs/pull/10#issuecomment-1267718900
https://github.com/getsentry/rfcs/pull/10#issuecomment-1289888297

### Solution Brainstorm

There are a few separate but intertwined threads of what has to happen to really fix this:

- We need the sentry server and UI to be able to handle three options rather than two.
  - Takes resources outside of our team and therefore commitment to actually make this better from a Decider™.
  - In the meantime, we agreed to start sending the correct data somewhere else in the issue, so we can run analytics on just how wrong things are, and so whatever future team picks up that side of things will have ready-made data to play with.

- We need to correct our blatantly incorrect booleans
  - [ ] Browser
  - [ ] Node
  - [ ] Other frameworks TBA

- We need to find effective ways to differentiate between unhandled errors which don't really block users and ones which actually break stuff (the closest we're going to get to actual "crashes" in browser-land, at least until a real minidump-like API is introduced for browser tabs - for obvious reasons [this](https://github.com/getsentry/rfcs/pull/10#issuecomment-1267718900) doesn't count).
  - Interesting ideas about midway through this deck on how to tell which errors negatively affect users: https://speakerdeck.com/bluesmoon/1000-javascript-errors-are-as-good-as-none
  - Is this a place where session replay could help?
  
Note: As per https://github.com/getsentry/sentry-dart/issues/1116#issuecomment-1308340171, we should coordinate with the mobile folks when we actually start working on this. EDIT: Turns out they can already do the data gathering for the dart SDK. Might nice to try sending correct data from RN, though.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Fix `handled` data and differentiate between unhandled errors and crashes #6073

Problem Statement

Solution Brainstorm

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Fix handled data and differentiate between unhandled errors and crashes #6073

Description

Problem Statement

Solution Brainstorm

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Fix `handled` data and differentiate between unhandled errors and crashes #6073