Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge Hub & Scope in SDKs #122

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
305 changes: 305 additions & 0 deletions text/0122-sdk-hub-scope-merge.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,305 @@
- Start Date: 2023-11-07
- RFC Type: feature
- RFC PR: https://github.com/getsentry/rfcs/pull/122
- RFC Status: draft

# Summary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this RFC is no longer being updated, we're writing down how this works and collecting learnings etc. here: https://develop.sentry.dev/sdk/hub_and_scope_refactoring/


We want to streamline Hub & Scope usage for the SDK.

# Motivation

Generally, Hubs and Scopes are a very Sentry-specific concept, and can be a bit hard to explain.
Also, how Hub & Scope forking behaves is currently a bit tricky, and does not play well e.g. with how OpenTelemetry contexts work.

This RFC aims to streamline this by merging the concepts of Hub and Scope into a singular Scope concept.

It also proposes the new concepts of global & isolated scopes.

# Background

TODO

# Proposed Solution

Hubs will be removed as a concept for SDK users.
Instead, from a users perspective only Scopes remain, which will become the singular entity to hold context data etc.

Scopes will be _similar_ to how they work today, but not entirely the same.
Scopes can have data (e.g. tags, user, ...) added to them the same way you can do today.
This RFC _does not_ aim to change any of the data that is kept on the scope and is applied to events.

The following APIs will be removed/deprecated:

* `getCurrentHub()`
* `configureScope()` (instead just get the scope and set on it directly)
* Any APIs currently on the Hub only:
* `hub.pushScope()`
* `hub.popScope()`
* `hub.isOlderThan()`
* `hub.bindClient()`
* `hub.getStack()`
* `hub.getStackTop()`
* `hub.run()` (use `withScope()` instead)
mydea marked this conversation as resolved.
Show resolved Hide resolved

Instead, we will introduce some new APIs:

```ts
// get the currently active scope. replacement for `getCurrentHub().getScope()`
export function getScope(): Scope;

// get the currently active client. May return a NOOP client. Replacement for `getCurrentHub().getClient()`.
export function getClient(): Client;

// make a scope the current scope. Replacement for `makeMain(hub)`
export function setCurrentScope(scope: Scope): void;

// get the currently active global scope
export function getGlobalScope(): Scope;

// get the currently active isolation scope
export function getIsolationScope(): Scope;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some SDKs may also need a setter for isolation scope. In Java some frameworks do not play nicely with the callback based approach and instead we're just being told that something's about to happend and something just happened (e.g. changing threads).


// similar to `withScope`, but defines an isolation scope
export function withIsolationScope(callback: (scope: Scope) => unknown): unknown;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what those mean to do tbh. Sounds like it'd be the same as getScope and withScope.

```

The following APIs already exist but will behave differently:

* `withScope()` will still work, but it will actually fork an execution context. So this will roughly do the same as doing `hub.run()` today in languages that have that, which forks an execution context.

APIs that are currently on the hub should instead be called directly on the scope (e.g. `scope.captureException()` etc.), or via a global method (e.g. `Sentry.captureException()`).

The current scope may be kept similar to how we currently keep the current hub, but this is SDK specific and not part of this RFC.

## Clients

Instead of a client being optional, there will now _always_ be a client. It may be a Noop Client that does nothing, if `init()` has not been called yet.

A scope has a reference to a client. By default it will reference a noop client. You can bind a client to a scope via `scope.setClient()`.
The client is inherited by forked scopes.
mydea marked this conversation as resolved.
Show resolved Hide resolved
mydea marked this conversation as resolved.
Show resolved Hide resolved

```js
const client1 = new Client();
const scope = new Scope();

scope.getClient(); // <-- returns a noop client by default

scope.setClient(client1);
mydea marked this conversation as resolved.
Show resolved Hide resolved
scope.getClient(); // <-- returns client1
```

The current scope may be kept similar to how we currently keep the current hub, but this is SDK specific and not part of this RFC.
mydea marked this conversation as resolved.
Show resolved Hide resolved

When calling `getScope()` before a scope was made the current one (=before init was called), we will return a scope for a noop client.
mydea marked this conversation as resolved.
Show resolved Hide resolved
A noop client is a regular client that simply does not send anything.

This way, the API for `getClient()` can always return a client, and users do not have to guard against this being undefined all the time.
We may also expose a util like `sentryIsInitialized()` that checks if the current client is a Noop client (which currently you could have checked as `getCurrentHub().getClient() === undefined`).

If you want to have multiple isolated clients, you can achieve this easily with this new setup:

```js
const client1 = new Client();
mydea marked this conversation as resolved.
Show resolved Hide resolved
const client2 = new Client();

const scope1 = new Scope();
const scope2 = new Scope();

scope1.setClient(client1);
scope2.setClient(client2);

scope1.captureException(); // <-- isolated from scope2
```

## Scopes

Scopes behave similar to how they behave today.
When a scope is forked via `withScope()`, a new scope is created that inherits all data currently set on the parent scope.

The main change to Scopes is that they do not push/pop anymore, but instead fork an execution context (in languages where this makes sense/is possible).
Basically, `withScope()` should behave like `hub.run()` does today in languages that have execution context forking.

`client.getScope()` should return the current scope of this client in the current execution context.
mydea marked this conversation as resolved.
Show resolved Hide resolved

From a users perspective, this should mostly not be noticeable - they can always run `getScope()` to get the current scope, or `withScope(callback)` to fork a new scope off the current scope.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been a major pain point in the past: withScope() was used in places where we actually wanted to change the current scope, but instead created a fork and then people were wondering why the changes were not visible on the original scope. Not sure if feasible within this RFC, but renaming withScope() to forkScope() would help here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no strong feelings about withScope, interested to hear what others think on this! I'd be fine with either.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find forkScope better, also considering this:

You can still clone scopes manually the same way as before, e.g. via Scope.clone(oldScope) or a similar API. In contrast to withScope(), this will not fork an execution context.

Both essentially create a new scope off of an existing one, but in different ways. It'd be good to have the name communicate what the difference is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main reason to keep withScope is to avoid breaking as much as possible. We could also consider keeping withScope (deprecated?) and adding forkScope as an alias.

For reference only, OTEL uses context.with(callback) to do the same, so it seems we are not the only ones using this wording. Doesn't mean we have to keep it, just pointing this out :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't withScope() mainly used to create a temporary scope which is only applied to the callback?
I don't see a good reason to rename/remove this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO generally we should strive to replace this usage of withScope - only to add a temporary event processor - with different pattern(s). (not possible for all cases, but most of them). So generally there should be little (not none) need to use it manually!


You can make a scope the current one via `setCurrentScope(scope)`, which should bind the scope to the current execution context (or a global, in SDKs without execution context). This is a replacement for the current APIs like `makeMain(hub)` or `setCurrentHub(hub)`.

You can still clone scopes manually the same way as before, e.g. via `Scope.clone(oldScope)` or a similar API. In contrast to `withScope()`, this will _not_ fork an execution context.

You can update the client of a scope via `scope.setClient(newClient)`. This will not affect any scope that has already been forked off this scope, but any scope forked off _after_ the client was updated will also receive the updated client.

Every scope is always tied to a client.
mydea marked this conversation as resolved.
Show resolved Hide resolved

## Global Scope

In addition to the currently active scope, there will also be a new special scope, the **Global Scope**.
The global scope is _not_ the initial scope, but a special scope that belongs to a client and is applied to any event that belongs to this client.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would make the client stateful all of a sudden. Not the end of the world, but feels odd.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily. This does not mean that the global scope is defined on the client, but that the global scope has a reference to the client :) the mechanism to look up the global scope can/should probably be more in parallel to getting the active scope. Think of it like this:

const executionContext1; // this is one execution context somewhere
const executionContext2; // this is some other execution context somewhere else

executionContext1.globalScope --> Scope
executionContext2.activeScope --> Scope
executionContext3.isolationScope --> Scope

So all three of these scopes can be put on the execution context (or wherever the active scope is put). How exactly this is done would be up to the SDK/language, but the idea is that the client remains stateless.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also initially confused by this, but I think it's just a matter of wording this a bit differently. It says:

but a special scope that belongs to a client and is applied to any event that belongs to this client

This makes it sound like the client owns the scope, even though technically it's the other way around. Maybe we can write something like

Suggested change
The global scope is _not_ the initial scope, but a special scope that belongs to a client and is applied to any event that belongs to this client.
The global scope is _not_ the initial scope, but a special scope that has a client and is applied to any event that belongs to its client.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tricky thing to convey (and actually, we need to specifically spec this out), is that each client has a separete global scope.

At least, I think this is what makes most sense 🤔 When talking with @mitsuhiko both options are possible (global scope is truly global, or global scope is basically a "client scope").

Personally I tend to the idea that each client should have it's own global scope - simply because thinking about e.g. MFE and similar, we probably don't want to "pollute" other clients when doing getGlobalScope().setTag().

Wording wise, we could say:

The global scope is not the initial scope, but a separate, special scope. The global scope is different for each client, and is applied to every event.

Or something like this?

mydea marked this conversation as resolved.
Show resolved Hide resolved

You can get the current global scope via `getGlobalScope()`. There _may_ be a function `setGlobalScope(scope)` to update the global scope - or SDKs can decide that there is no need to update the global scope, you can only mutate it.
mydea marked this conversation as resolved.
Show resolved Hide resolved

If you call `getGlobalScope()` before a client is initialized, we should still get a global scope back (tied to a Noop client). Once an actual client is initialized, the global scope of the noop client should be merged into the new global scope for the new client. This should ensure that even if you call `getGlobalScope().setTag(...)` before the SDK is initialized, no data is lost.
mydea marked this conversation as resolved.
Show resolved Hide resolved

The reason that the global scope is not the same as the initial scope of a client, is that you cannot accidentally mutate it - nothing ever inherits off the global scope.

## Isolation Scopes

Furthermore, there can also be **Isolation Scopes**.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is really not clear from reading the whole RFC to me why we have both

  • active scope
  • isolation scope

please make it clearer when to use what

Similar to the global scope, these are also applied to events. However, isolation scopes can be created, either by us internally (the most common scenario), or also by users. The new APIs for this are:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client already holds the global scope, and now it also holds an isolation scope. But we do not expect to create a new client for each request, so the client should hold a stack of isolation scopes IMO, which will be tricky. Can we somehow clarify how this all should work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment above hopefully explaining the idea here - was probably not fully clear from how I wrote this down in the RFC. TLDR: active scope, global scope, isolation scope should be stored "next to each other" on each execution context. nothing should be stored directly on the client, which should remain stateless.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Similar to the global scope, these are also applied to events. However, isolation scopes can be created, either by us internally (the most common scenario), or also by users. The new APIs for this are:
Similar to the global scope, these are also applied to events. However, isolation scopes can be created, either by us internally (the most common scenario), or also by users. Please note, creating too many isolation scopes would render the concept obsolete, so try to fork isolation scopes sparingly. The new APIs for this are:


```js
// Returns the currently active isolation scope.
export function getIsolationScope(): Scope;

// Create a new isolation scope for this scope
// This will NOT make this scope the isolation scope, but will create a new isolation scope (based on the currently active isolation scope, if one exists)
scope.isolate();

// Similar to `withScope`, but it forks a new scope AND sets a new isolation scope for this context
export function withIsolationScope(callback: (scope) => void): void;
```

You can fetch the currently active isolation scope via `getIsolationScope()`. You can define a new isolation scope via `scope.isolate()`, which will define a new isolation scope for this scope, and for all scopes that will be forked off this scope. When a client is created & bound, an initial isolation scope will immediately be created, similar to the global scope for a client.

An isolation scope is attached to the current execution context, similar to the active scope. There is always exactly one active isolation scope. If you call `getIsolationScope()` before a client has been created, a noop isolation scope is returned, which should be merged in once a client is actually created (same as with the global scope).
mydea marked this conversation as resolved.
Show resolved Hide resolved

Similar to the global scope, an isolation scope is always a separate scope, so nothing will inherit off it - except for a potential superseding isolation scope.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand this. Can you explain?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the Similar to the global scope, an isolation scope is always a separate scope, so nothing will inherit off it - except for a potential superseding isolation scope. part.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a graphic to hopefully explain this better :D

If an isolation scope is created, and there is already an isolation scope in the current execution context, then the new isolation scope should be forked off the previous one (with copy-on-write).

### When to create an isolation scope

For most server-side SDKs, an isolation scope will be created for each request being processed.
Roughly, it will equate to each time we currently fork a hub.
Copy link

@sentrivana sentrivana Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just fork the current active scope? What does the isolation scope offer compared to just setting the active scope? A clarification here would be good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming the answer here is that adding tags, breadcrumbs, setting user etc. might get lost due to the user setting them on a scope that'll be lost and thus won't apply to sibling scopes.

// ... request incoming
const requestScope = getCurrentScope().fork() // whatever the correct syntax here would be

// some part of handling the request
val somePartScope = requestScope.fork()
setCurrentScope(somePartScope);

// somewhere her the user sets tags, breadcrumbs and user which is applied to events happening within the same scope or child scopes

// a different part of handling the request
val otherPartScope = requestScope.fork()
setCurrentScope(otherPartScope);

// tags, breadcrumbs and user are not applied for things happening here

The same example using isolation scope would allow the user to set tags, breadcrumbs, user etc. for the whole request without having to be in the right place to do so (i.e. gain access to the requestScope instead of a child scope). In other words isolation scope just seems to be an explicit concept for adding things to a scope specific for a request or whatever other things we wanna use it for (e.g. cron execution).

This behaviour would happen e.g. when we switch threads where today we clone the hub onto the other thread but when execution comes back the hub from that thread doesn't affect the rest of the request handling part. In the future instead of cloning the hub we'd be forking execution context + scope I presume.


### Examples for isolation scopes

Example for instrumentation that we would write:

```ts
function wrapHttpServerRequest(original: Function): Function {
// Fork an execution context for this server request, that is isolated
return Sentry.withIsolatedScope((scope) => {
// anything in here will have the same isolated scope!
return original();
})
}
```

Example for hooking into external auto-instrumentation (e.g. OpenTelemetry):

```ts
let onRequestHook: (span: Span) => void;

// This method is not defined by us, but is some external code
// Here just for demonstration purposes of how that may be implemented
function otelWrapHttpServerRequest(original: Function): Function {
// Fork an execution context for this server request,
Copy link
Member

@sl0thentr0py sl0thentr0py Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the semantic we are trying to achieve with isolate? is it per execution context? then why are we using 'fork an execution context' here? sorry but it really isn't clear to me what we're trying to achieve with this example. What kind of data lives on the withScope and what happens after you do scope.isolate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't really understand this example here. So is otelWrapHttpServerRequest some generic method, that's not defined by us? Why is that able to run withScope then if it has nothing to do with Sentry? Is this a generic OTEL hook that always triggers when a new span is created and automatically forks the current scope via withScope?

I'm unable to find scope.isolate() in JS implementation (https://github.com/search?q=repo%3Agetsentry%2Fsentry-javascript+%22isolate%22&type=code). I have another comment further up about it as well where I'm unsure what it should do.

It looks like this has been implemented using getIsolationScope.clone() instead of scope.isolate() here https://github.com/getsentry/sentry-javascript/blob/b0752b5d3e0840f3359688c58619fb0f638470bc/packages/opentelemetry/src/contextManager.ts#L53

Should we remove it if it's not implemented and unclear?

// but without isolating this!
return Sentry.withScope((scope) => {
onRequestHook(trace.getActiveSpan());
return original();
});
}

// This would be our custom sentry configuration
onRequestHook = () => {
const scope = getScope();
scope.isolate(); // Add an isolation scope to the already forked scope
}
```

## Applying scopes

Scopes are applied in this order to events:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the example on how scopes are merged I'm wondering what happens the other way round when scopes are read? E.g. Would Sentry.getCurrentScope().getRelease() return a value when the release was set on the global scope ~ Sentry.getGlobalScope().setRelease("1.0.0")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good question! I would say it should only return the value on the actual scope being accessed. If you want to have some cascading behavior it should be a top-level method like Sentry.getRelease() or something like this?


```ts
class Scope {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, clients will not be aware that there is a scope. The Scope class merges data into the event before passing it to the client. Did I get this correctly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's the idea, this is also how it works today (at least in JS and I think more or less everywhere?) - this RFC does not propose to change this from it is today, except that the special scope data is also applied to the events!

public captureEvent(event: Event, additionalScope?: Scope) {
// Global scope is always applied first
const scopeData = getGlobalScope().getScopeData();

// Apply isolations cope next
const isolationScope = getIsolationScope();
merge(scopeData, isolationScope.getScopeData());

// Now the scope data itself is added
merge(scopeData, scope.getScopeData());
mydea marked this conversation as resolved.
Show resolved Hide resolved

// If defined, add the captureContext/scope
// This is e.g. what you pass to Sentry.captureException(error, { tags: [] })
if (additionalScope) {
merge(scopeData, additionalScope.getScopeData());
}

// Finally, this is merged with event data, where event data takes precedence!
mergeIntoEvent(event, scopeData);
}
}
```

Note that there is _always_ exactly one global & one isolation scope active.

## What about environments that do not have isolation of execution contexts (e.g. mobile, browser)?

Where not useful, you simply don't have to use the isolation scope. But it's always there, if the need arises.
While it is empty it does nothing anyhow.

## What should be called from top level methods?

Top level APIs should generally interactive with the current active scope:

```js
Sentry.setTag();
Sentry.setUser();
Sentry.captureException();
// ...
```

The only exception is `addBreadcrumb()`. This should generally add breadcrumbs to the currently active isolation scope.
SDKs _may_ also add an option to the client to opt-in to put breadcrumbs on the global scope instead (e.g. for mobile or scenarios where you always want breadcrumbs to be global).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't isolation scope and global scope be the same usually on mobile anyways?


## Should users care about Clients?
mydea marked this conversation as resolved.
Show resolved Hide resolved

Generally speaking, for most regular use cases the client should be mostly hidden away from users.
Users should just call `Sentry.init()`, which will setup a client under the hood. Users should generally only interact with scopes, and we should keep clients out of most public facing APIs.

The client is only there to allow an escape hatch when users need to do more complex/special things, like isolating Sentry instances or multiplexing. So client APIs should be designed to _allow_ to do things that cannot be done via `Sentry.init()`, but our main focus should be on making the default experience easy to understand, which includes that users should not have to care about the concept of clients by default.

## What about other Hub references?

While the Hub is mainly exposed via `getCurrentHub()`, it is also used as argument or similar in many places.
These occurences should be updated to instead take a scope or a client.

## What about backwards compatibility?

We should strive to provide a wrapper/proxy `getCurrentHub()` method that still exposes the key functionality to ease upgrading. E.g.:

```js
import { getScope, getClient, captureException, withScope } from '../internals';

function getCurrentHub() {
return {
getClient,
getScope,
captureException,
withScope,
// ...
}
}
```

Based on the SDK, we can decide to keep _everything_ in this proxy (then we can do this even in a minor release),
or keep _most of it_ (if we do a major) - to break as little things in user land as possible.

## What about globals?

This RFC does not propose any concrete way to store the current scope. This is up to the concrete SDK and may behave the same way as it currently does for the hub, or differently if that makes more sense in a given scenario.

# Drawbacks

* This changes _a lot_ of public APIs and behavior.

# Unresolved questions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to figure out how scope sync across multiple layers works. For example, on RN we sync the scope to Cocoa, which syncs the scope to SentryCrash. I guess the SDKs should only sync the global scopes. We don't sync the scope from a lower layer, for example, Cocoa to RN.


TODO
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TODO
- What is the user experience like if there is no instrumentation to set isolation scopes?