Sentry produces incorrect data when async code is involved

- [x] Review the documentation: https://docs.sentry.io/
- [x] Search for existing issues: https://github.com/getsentry/sentry-javascript/issues
- [x] Use the latest release: https://github.com/getsentry/sentry-javascript/releases

## Package + Version

- [x] `@sentry/node`

### Version:

```
6.13.3
```

## Description

When dealing with `async` code sentry fails to properly record data. Not only does it fail to record data, it also constantly adds bad data. 

**Exception 1**: The problem is somewhat hidden when you deal with a context that sits in a sandbox environment which only exists for the duration of ONE AND ONLY ONE transaction. The most popular example would be a browser session, but since we're speaking node scenarios the other example would be a "cloud function" (ie. pops into existence per request, and only ever handles hopefully the one transactions); there's also the scenarios that use _actual threads_ (which are rare in the current ecosystem). Needless to say, those 3 I just mentioned are also technically only fine so long as they use only one transactions; if more then one is used for any reason they're suddenly going to have bad data.

**Exception 2**: The problem is also hidden when transactions are completely dependent on user input (eg. user has to click some navigation item) and also happen faster then user input. This creates an artificial serial nature to them, hiding the fact that if more then one run at once they botch things since no more then one would ever run at once. _However, the more complex the system, the less likely this exception will save you._

The problem is most evident and most aggravating in the scenario of a http server (which arguably covers a lot of the use cases for sentry with nodejs). Since a http server will pickup requests and work on them as-they-come, regardless of if the previous one is finished, sentry has to work perfectly with 2, 3, 1000 transactions running simultaneously. 

In general, with your average http server (and similar use cases), it's your job to keep your stuff in one request out of the others. This isn't particularly hard normally since all it means is: don't use globals. _Unfortunately sentry seems to do just that, for everything._

I've read though maybe 50 issues, and this problem has been brought before under various forms, some more direct then others. The solution has also been mentioned several times. And I've also seen the reply of "works for us" with the not-very obvious subtext "in this environment where only one transaction would ever exist before the process exists".

So, I'll make this very very simple, by giving you some code you can easily run (and is in accordence to what your documentation, responses to issues, etc are saying is the "correct way") along with screenshots of your dashboards with datapoints that are as obvious as I can possibly make them for you.

---

There will be only one file, so just copy all the blocks in a `main.mjs` file; `.mjs` is the current nodejs of saying the file uses modern imports.

First some basic boilerplate.
```javascript
import Sentry from '@sentry/node';

const SENTRY_DSN = ""; // add your test project DSN here

Sentry.init({

	dsn: SENTRY_DSN,

	release: 'v1', // increment to get clean test sample

	// Set tracesSampleRate to 1.0 to capture 100%
	// of transactions for performance monitoring.
	// We recommend adjusting this value in production
	tracesSampleRate: 1.0

});
```

Next we'll add a function to help simulate "real world long async operations"

```js
async function delay(time, fn) {
	return new Promise((resolve) => {
		setTimeout(async () => {
			resolve(await fn());
		}, time);
	});
}
```

Next we'll add a function that simulates all the breadcrumbs sentry adds whenever some random 3rd party library is used god knows where in dependencies. The parameter `theRealTransaction` is only used to make it obvious what the breadcrumbs and spans _actually originate from_.
```js
async function randomThirdPartyIntegration(theRealTransaction) {

	// eg. "Got" library or whatever you guys are doing that's capturing HTTP

	// This context intentionally doesn't have access to objects in parameters
	// or other methods since that's how it typically happens in the real world.

	let hub = Sentry.getCurrentHub();
	let scope = hub.getScope();
	let transaction = scope.getTransaction();

	let span;
	if (transaction) {
		span = transaction.startChild({
			op: 'http_req',
			description: `http://google.com (real transaction: ${theRealTransaction})`
		});
	}

	scope.addBreadcrumb({
		category: 'http',
		message: `REQ http://google.com [Start] (real transaction: ${theRealTransaction})`
	});

	return await delay(300, () => {
		scope.addBreadcrumb({
			category: 'http',
			message: `REQ http://google.com [End] (real transaction: ${theRealTransaction})`
		});
		if (span) {
			span.finish();
		}
	});

}
```

Next we'll create a function that simulates what may happen in typical http server (or whatever else) request handle; we'll call it `exampleInput` since it will be the only function that takes input in theory (our `randomThirdPartyIntegration` is also a dependency of it).

_Edit 1: Original function had incorrect tag code. (As such tags errors were ignored, does not affect rest of issue) Scope has been added to exception handler, to be even more correct. (Still produces same problems)_
```js
async function exampleInput ({
	name, // the name of this transaction
	user, // the user for the transaction
	throwsError,
	timers: { // delays between actions
		action1,
		action2,
		action3
	}
}) {

	const tagName = name.replace('#', '');

	const hub = Sentry.getCurrentHub();
	const scope = hub.pushScope();

	scope.setUser({ username: user });

	const transaction = hub.startTransaction({
		op: "test",
		name: `Test Transaction ${name}`,
	});

	scope.setSpan(transaction);

	try {

		// Tag

		scope.setTag(`T${tagName} Tag 1`, 'yes');

		// Action 1

		let span1 = transaction.startChild({
			op: 'action',
			description: `Transaction ${name} Action #1`
		})

		try {
			await delay(action1, () => {
				scope.addBreadcrumb({
					category: 'step',
					message: `Transaction ${name} Action #1`
				});
			});
		}
		finally {
			span1.finish();
		}

		// Tag

		scope.setTag(`T${tagName} Tag 2`, 'yes')

		// Action 2

		let span2 = transaction.startChild({
			op: 'action',
			description: `Transaction ${name} Action #2`
		})

		try {
			await delay(action2, () => {
				scope.addBreadcrumb({
					category: 'step',
					message: `Transaction ${name} Action #2`
				});
			});
		}
		finally {
			span2.finish();
		}

		// Random Third Party Dependency

		let span3 = transaction.startChild({
			op: 'third_party',
			description: `Transaction ${name} Third-Party Thing`
		});

		try {
			await randomThirdPartyIntegration(name);
		}
		finally {
			span3.finish();
		}

		// Tag

		scope.setTag(`T${tagName} Tag 3`, 'yes')

		if (throwsError) {
			throw new Error("Test Error: Something went wrong");
		}

		// Action 3

		let span4 = transaction.startChild({
			op: 'action',
			description: `Transaction ${name} Action #3`
		});

		try {
			await delay(action3, () => {
				scope.addBreadcrumb({
					category: 'step',
					message: `Transaction ${name} Action #3`
				});
			});
		}
		finally {
			span4.finish();
		}

		// Tag

		scope.setTag(`T${tagName} Tag 4`, 'yes');

	}
	catch (err) {
		Sentry.captureException(err, scope)
	}
	finally {
		hub.popScope();
		transaction.finish();
	}

}
```
And, finally we'll simulate some transactions. I've included a Timeline, but be aware that the timeline isn't something that will necessarily happen guaranteed as it suggests, its simply where the current settings in the example actually align if plotted on a timeline.
```js

// The following simulates what a basic http server would do
// We don't wait for one transaction to finish before other starts because
// no sane server in the world does that sh*t

// a1 = action1, a2 = action2, a3 = action3
//
// Timeline (ms)      0        300  400  500  600  700  800  900  1000  1100
// -----------------------------------------------------------------------------
// Transaction 1      start              a1             a2              a3
// Transaction 2      start    a1             a2             a3
// Transaction 3      start         a1             a2             a3
// Transaction 3      start                   a1   a2 -> err

exampleInput ({
	name: '#1',
	user: 'transaction1_user',
	throwsError: false,
	timers: {
		action1: 500,
		action2: 300,
		action3: 300
	}
})

exampleInput ({
	name: '#2',
	user: 'transaction2_user',
	throwsError: false,
	timers: {
		action1: 300,
		action2: 300,
		action3: 300
	}
})

exampleInput ({
	name: '#3',
	user: 'transaction3_user',
	throwsError: false,
	timers: {
		action1: 400,
		action2: 300,
		action3: 300
	}
})

exampleInput ({
	name: '#4',
	user: 'transaction4_user',
	throwsError: true,
	timers: {
		action1: 600,
		action2: 100,
		action3: 100
	}
})
```
To run this just do. (after you provide the SENTRY_DSN) 
```
node ./main.mjs
```
You can also change the release to have a clean copy you can filter to in transactions.

---

Now lets look at what sentry has to say after running it once. I'll highlight all the mistakes I can spot with red arrows (there may be more).

![transactions](https://user-images.githubusercontent.com/338910/137601443-c46dd15a-abb0-412b-b94b-87dd12f36329.jpg)

In the above transaction 1 doesn't have a user, even though we set user on every transactions. 

![transaction1](https://user-images.githubusercontent.com/338910/137602126-5fa297ce-c84d-4165-a07b-ad516b027522.jpg)

Above is Transaction 1. We're missing the spans generated by the `randomThirdPartyIntegration` function itself, and we're missing all breadcrumbs.

![transaction2](https://user-images.githubusercontent.com/338910/137602483-f1851546-e6ca-4af1-b989-e57b695c3688.jpg)

Second transaction is also incorrect.

![transaction3](https://user-images.githubusercontent.com/338910/137602485-b3c08f00-02d2-4606-a3f2-a6e59bb83af8.jpg)

Third transaction somehow got wrong user and for some reason managed to miss the 3rd action breadcrumb. 

![transaction4](https://user-images.githubusercontent.com/338910/137602487-2ba8da42-de40-4120-a558-d0fc51079fb8.jpg)

Fourth transaction also got wrong user and also has 3rd party spans from transaction 2 and 3.

![error](https://user-images.githubusercontent.com/338910/137602490-07ea8fd9-4496-4f90-ab0d-368fdc5d9a5c.jpg)

The 4th transaction was set to also throw a error. Above we can see what we get in the error. Why are we getting in the error breadcrumbs that never appear in the transactions themselves?

---

Anyway, so how is this fixed? 

Technically speaking there are three approaches. 

The first is to use is to use [Async Context Tracking](https://nodejs.org/api/async_context.html#async_context_class_asynclocalstorage) that is made available as stable api from nodejs 16 onwards. This doesn't solve all the problems with how (from a quick glance at the code) sentry seems to work, but it does mean sentry is able to consistently get the real context with out the use of globals. Specifically the only parts it doesnt' solve are how sentry currently works with scope; which is not even clear to me how exactly this "ever merging" scope is suppose to work with what is essentially "isolated contexts" (eg. http requests, work queues, etc)

The second method will work in all versions of node, including prior versions (and people paranoid about perf impact of async context tracking), and the method is to just REQUIRE every operation pass the transaction object. So things like `Sentry.captureException(err)` should really be `Sentry.captureException(err, transaction)` where the transaction is ignored if null (and I mean IGNORED, no guessing of what it might be, there is none). Scope should just be `transaction.getScope()`.

The third method, and this is purely only for these special cases that sentry devs claim "works for them" is to have a method `Sentry.setMasterTransaction` that works as follows: if no transaction is set, set the transaction as the master; if a master is already set and not finished then error out and invalidate the master and the current transaction; if the master is set but finished (not counting transport costs), replace the current master with the new transaction. Essentially, you can have your "global transaction" but sentry guarantees you won't shoot yourself in the foot with it.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Sentry produces incorrect data when async code is involved #4071

Package + Version

Version:

Description

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Sentry produces incorrect data when async code is involved #4071

Description

Package + Version

Version:

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions