🎨 [PANA-4372] Explicitly scope serialization state #3887

sethfowler-datadog · 2025-10-01T10:20:12Z

Motivation

I'm preparing to implement a prototype of a new DOM serialization algorithm behind a feature flag. To cleanly integrate this new algorithm into the existing code, we need to add a small amount of abstraction.

This PR focuses on adding an abstraction layer around node ids. Currently, the association between DOM nodes and their ids is tracked using global variables which persist across views; however, with the new algorithm, each view needs its own node id namespace. To abstract over this difference cleanly, it's helpful to get rid of the global variables and create an object to provide an explicit lifetime to this state.

Changes

This PR adds a new SerializationScope type which tracks state associated with a specific stream of serialized events. It also removes the global variables we use right now to track node ids and moves them onto a new NodeIds type which is exposed via SerializationScope. The new SerializationScope type is plumbed to the various places in the code which need it.

There is not yet a mechanism for resetting the state associated with the SerializationScope when the view changes; in this PR, the focus is on maintaining existing behavior as much as possible, and in any case today's algorithm would not use this functionality. This mechanism will be added in a future PR.

Future PRs will also add more state to SerializationScope. Some of this will be associated with the new DOM serialization algorithm, but there is also existing state with similar lifetime requirements (e.g. ElementScrollPositions) that would probably make sense to fold into SerializationScope.

Note that while this PR is intended to maintain existing behavior, it does actually change things slightly, because the lifetime of node ids is now tied to a particular invocation of record(). This means that stopping and starting recording will reset node ids. I think this is fine, and indeed desirable, but it's something to be aware of.

Checklist

Tested locally
Tested on staging
Added unit tests for this change.
Added e2e/integration tests for this change.

cit-pr-commenter · 2025-10-01T10:21:36Z

Bundles Sizes Evolution

📦 Bundle Name	Base Size	Local Size	𝚫	𝚫%	Status
Rum	162.41 KiB	162.41 KiB	0 B	0.00%	✅
Rum Recorder	19.34 KiB	19.92 KiB	+593 B	+2.99%	✅
Rum Profiler	4.89 KiB	4.89 KiB	0 B	0.00%	✅
Logs	56.02 KiB	56.02 KiB	0 B	0.00%	✅
Flagging	944 B	944 B	0 B	0.00%	✅
Rum Slim	119.37 KiB	119.37 KiB	0 B	0.00%	✅
Worker	23.60 KiB	23.60 KiB	0 B	0.00%	✅

🚀 CPU Performance

Pending...

🧠 Memory Performance

Pending...

🔗 RealWorld

datadog-datadog-prod-us1 · 2025-10-01T10:23:52Z

⚠️ Tests

⚠️ Warnings

❄️ 2 New flaky tests detected

cookie getCurrentSite caches the result from Safari 12.1.2 (Mac OS 10.14.6) (Datadog) (✨ Fix with BitsAI)
Expected spy cookie to have been called 2 times. It was called 0 times.
<Jasmine>
webpack:///packages/core/src/browser/cookie.spec.ts:50:43 <- /tmp/_karma_webpack_760665/commons.js:58713:49
<Jasmine>
cookie getCurrentSite returns the eTLD+1 for foo.bar.baz.example.com from Safari 12.1.2 (Mac OS 10.14.6) (Datadog) (✨ Fix with BitsAI)
Expected 'foo.bar.baz.example.com' to be 'example.com'.
<Jasmine>
webpack:///packages/core/src/browser/cookie.spec.ts:24:61 <- /tmp/_karma_webpack_760665/commons.js:58693:108
<Jasmine>

ℹ️ Info

🧪 All tests passed

🎯 Code Coverage
• Patch Coverage: 89.39%
• Total Coverage: 92.68% (-0.06%)

View detailed report

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 3d17171 | Docs | Was this helpful? Give us feedback!}

sethfowler-datadog · 2025-10-01T10:28:05Z

packages/rum/src/domain/record/serialization/serializationScope.ts

+
+export interface SerializationScope {
+  assignSerializedNodeId(node: Node): NodeId
+  getSerializedNodeId(node: Node): NodeId | undefined


There is no longer an overload of getSerializedNodeId() for NodeWithSerializedNode, and hasSerializedNode() has been removed. I find it cleaner to just call getSerializedNodeId() and check the result.

sethfowler-datadog · 2025-10-01T10:28:45Z

packages/rum/src/domain/record/serialization/serializationScope.ts

+export type NodeId = number & { __brand: 'NodeId' }
+
+export interface SerializationScope {
+  assignSerializedNodeId(node: Node): NodeId


SerializationScope now encapsulates the logic for assigning ids to nodes, so setSerializedNodeId() has been removed from the interface.

sethfowler-datadog · 2025-10-01T10:48:36Z

/to-staging

dd-devflow-routing-codex · 2025-10-01T10:48:39Z

View all feedbacks in Devflow UI.

2025-10-01 10:48:39 UTC ℹ️ Start processing command /to-staging

2025-10-01 10:48:47 UTC ℹ️ Branch Integration: starting soon, merge expected in approximately 11m (p90)

Commit 582515231d will soon be integrated into staging-40.

2025-10-01 11:02:37 UTC ℹ️ Branch Integration: this commit was successfully integrated

Commit 582515231d has been merged into staging-40 in merge commit 3d3b542260.

Check out the triggered pipeline on Gitlab 🦊

If you need to revert this integration, you can use the following command: /code revert-integration -b staging-40

…-state (#3887) into staging-40 Integrated commit sha: 5825152 Co-authored-by: sethfowler-datadog <seth.fowler@datadoghq.com>

packages/rum/src/domain/record/serialization/serializationScope.ts

yannickadam · 2025-10-01T13:54:09Z

packages/rum/src/domain/record/serialization/serialization.types.ts

  parentNodePrivacyLevel: ParentNodePrivacyLevel
  serializationContext: SerializationContext
  configuration: RumConfiguration
+  scope: SerializationScope


:nit: scope feels more fundamental than an option. Maybe it could be passed as a first level param? I would also argue that mandatory options are not options anymore :)

I feel the same way! In the version of this code for the new data format, I've totally eliminated this options object and just converted everything to actual function arguments. However, I did that in part by adding one more abstraction which isn't yet present in this PR. I can make the same changes for the old serialization algorithm, but if it's OK with you, let's defer that until the PR that adds that other abstraction. (It's an object representing a mutation "transaction", which is what a lot of these "options" are really associated with.)

Apropos of this discussion, this PR removes one of these options totally: #3889

2 cents: Options is used to designate named parameters here. I agree it might not be the best naming.

Yeah, that's totally fair; naming is always the hard part. 😄

I just pushed #3893 (based on this one, to avoid further rebasing) that pushes further forward on restructuring these options by converting parentNodePrivacyLevel into a normal function parameter. I'll open more PRs to try to further improve the organization of these options.

yannickadam · 2025-10-01T13:56:58Z

packages/rum/src/domain/record/serialization/serializeNode.ts

-  const id = getSerializedNodeId(node) || generateNextId()
+  const id = options.scope.assignSerializedNodeId(node)
  const serializedNodeWithId = serializedNode as SerializedNodeWithId
  serializedNodeWithId.id = id


We should change the type of SerializedNodeWithId.id from number to NodeID? Else the branded type loses some value.

I'd love to! It's a multistep process, though, since SerializedNodeWithId is a type generated from JSON Schema which lives in rum-events-format. We can probably use the tsType feature of json-schema-to-typescript to apply the branding there. I'll give it a try in a future PR in rum-events-format, and if everything looks good there, I'll update the code here to take advantage of it.

yannickadam · 2025-10-01T14:08:39Z

packages/rum/src/domain/record/trackers/trackMediaInteraction.ts

      mediaInteractionCb(
        assembleIncrementalSnapshot<MediaInteractionData>(IncrementalSource.MediaInteraction, {
-          id: getSerializedNodeId(target),
+          id,


Can assembleIncrementalSnapshot accept only an id of type NodeID, or does our link with rum events format prevents this?

As you say, the difficulty is that types from rum-events-format are used here. It should be possible using tsType, but it'll require some work in the context of that repo.

yannickadam · 2025-10-01T14:20:54Z

packages/rum/src/domain/record/trackers/trackStyleSheet.ts

+      return
    }
+    const id = scope.getSerializedNodeId(styleSheet.ownerNode)
+    if (id === undefined) {


Looks like we had a risk of executing the callback with undefined before. I've seen a couple places where the new code looks more defensive. I wonder => Are we just being extra cautious or do we believe these cases could happen?

You're right; it's not super likely for us to encounter an undefined node id in these situations today because node ids have an unbounded lifetime. As we transition to a world where node ids are a bit more scoped, and may get periodically reset, a slight increase in caution is probably warranted.

BenoitZugmeyer · 2025-10-02T13:45:26Z

packages/rum/src/domain/record/serialization/serialization.types.ts

  parentNodePrivacyLevel: ParentNodePrivacyLevel
  serializationContext: SerializationContext
  configuration: RumConfiguration
+  scope: SerializationScope


2 cents: Options is used to designate named parameters here. I agree it might not be the best naming.

BenoitZugmeyer · 2025-10-02T13:56:53Z

packages/rum/src/domain/record/record.ts

  const elementsScrollPositions = createElementsScrollPositions()
-
-  const shadowRootsController = initShadowRootsController(configuration, emitAndComputeStats, elementsScrollPositions)
+  const scope = createSerializationScope()


Question: are you going to reset the scope on each full snapshot?

Yes, that's right. Today, and in the near term future, that also means that we reset the scope on each view. (Longer term, I'm hoping that we end up being able to separate these two concepts.)

BenoitZugmeyer · 2025-10-02T13:59:19Z

packages/rum/src/domain/record/serialization/serialization.types.ts

  serializationContext: SerializationContext
  configuration: RumConfiguration
+  scope: SerializationScope


Suggestion: One might wonder what's the distinction between the serialization context and scope. Maybe rename scope to something more explicit like nodeRegistry? Or rename context to "state"? Or maybe put the scope in the context?

I want to do this kind of simplification, but I'm hoping to introduce one more concept first: an object representing a serialization "transaction". The general structure I'm thinking of this:

SerializationScope contains stuff which has a lifetime tied to a particular stream of serialized events. For today's serialization algorithm, that's node ids and statistics. The new algorithm also has the string table at this level.

SerializationScope has a method with a signature like captureMutations((transaction: SerializationTransaction) => BrowserRecord). We capture full snapshots, incremental mutations, and the like inside the callback to this method. It handles some generic things, like collecting timing statistics for telemetry, initializing the state of the SerializationTransaction, and passing the resulting record to a callback so it can be included in the segment. The new algorithm will do a bit of post-processing at this level, too.

SerializationTransaction contains stuff which has a lifetime tied to a particular invocation of the serialization code. Think of things like SerializeOptions#serializedNodeIds, SerializationContext#status, etc. It will also have a reference to the SerializationScope that generated it, so we don't have to thread both objects through the call tree everywhere. The new algorithm will have a version of SerializationTransaction with builder-like methods that add new content to the transaction (e.g. new serialized nodes, new serialized attribute mutations, etc etc), so that we will have a single place where we handle the "smart" aspects of the encoding, like converting literal strings into references to the string table.

I'm planning to introduce the skeleton of this structure in the next PR in the series, which I'll push today. Once we have this in place, we can start moving things from SerializationContext and SerializationOptions to new homes based on their lifetime: some will end up on SerializationScope, some on SerializationTransaction. SerializeOptions#parentNodePrivacyLevel will get converted to a normal function parameter, since it's an exceptional case that doesn't really belong on either of these objects; its natural scope is much more local.

I'm hoping this structure will simplify things a little bit overall, while also providing a foundation that both the old and new algorithms can sit on top of cleanly, despite their differences.

That sounds great, and I'm looking forward to that next PR. Clarifying the different concepts with some comments could be nice!

Yeah, totally agreed! I'll make sure that PR includes some solid comments.

BenoitZugmeyer · 2025-10-03T08:15:08Z

packages/rum/src/domain/record/serialization/serializationScope.ts

+export interface SerializationScope {
+  nodeIds: NodeIds
+}
+


Question: just for my understanding, what are you planning to add in the scope?

sethfowler-datadog requested a review from a team as a code owner October 1, 2025 10:20

sethfowler-datadog commented Oct 1, 2025

View reviewed changes

dd-mergequeue bot added a commit that referenced this pull request Oct 1, 2025

Integrate seth.fowler/PANA-4372-add-explicit-scoping-of-serialization…

3d3b542

…-state (#3887) into staging-40 Integrated commit sha: 5825152 Co-authored-by: sethfowler-datadog <seth.fowler@datadoghq.com>

dd-devflow bot added the staging-40 label Oct 1, 2025

yannickadam approved these changes Oct 1, 2025

View reviewed changes

sethfowler-datadog force-pushed the seth.fowler/PANA-4372-add-explicit-scoping-of-serialization-state branch from 8631c9d to 1d793ce Compare October 2, 2025 12:05

🎨 [PANA-4372] Explicitly scope serialization state

3d17171

sethfowler-datadog force-pushed the seth.fowler/PANA-4372-add-explicit-scoping-of-serialization-state branch from 1d793ce to 3d17171 Compare October 2, 2025 14:51

BenoitZugmeyer reviewed Oct 3, 2025

View reviewed changes

BenoitZugmeyer approved these changes Oct 3, 2025

View reviewed changes

sethfowler-datadog merged commit 3ab990f into main Oct 3, 2025
20 checks passed

sethfowler-datadog deleted the seth.fowler/PANA-4372-add-explicit-scoping-of-serialization-state branch October 3, 2025 11:20

sethfowler-datadog mentioned this pull request Nov 24, 2025

🎨 [PANA-4929] Add the concept of serialization transactions and reorganize state #3982

Merged

4 tasks

🎨 [PANA-4372] Explicitly scope serialization state #3887

🎨 [PANA-4372] Explicitly scope serialization state #3887

Uh oh!

Conversation

sethfowler-datadog commented Oct 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Changes

Checklist

Uh oh!

cit-pr-commenter bot commented Oct 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Bundles Sizes Evolution

Uh oh!

datadog-datadog-prod-us1 bot commented Oct 1, 2025 • edited by datadog-official bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Warnings

ℹ️ Info

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sethfowler-datadog commented Oct 1, 2025

Uh oh!

dd-devflow-routing-codex bot commented Oct 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sethfowler-datadog Oct 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

sethfowler-datadog commented Oct 1, 2025 •

edited

Loading

cit-pr-commenter bot commented Oct 1, 2025 •

edited

Loading

datadog-datadog-prod-us1 bot commented Oct 1, 2025 •

edited by datadog-official bot

Loading

dd-devflow-routing-codex bot commented Oct 1, 2025 •

edited

Loading

sethfowler-datadog Oct 3, 2025 •

edited

Loading