DeepCloneNode allocation runaway (44 GB / 308 s) for recursive tagged-tuple type alias #4014

@ebramanti

Summary

tsgo enters an allocation runaway through ast.NodeFactory.DeepCloneNode and the LinkStore behind printer.EmitContext.SetOriginalEx when type-checking a package whose type-reference graph hinges on a self-recursive, multi-variadic-tuple type alias and whose .tsbuildinfo is on disk. In a single tsgo invocation we observed 43.59 GB of cumulative allocations over 308.76 s of wall-clock time (3139.73 s of sampled CPU time across ~10 cores, ~40 GB RSS) before the process was killed externally. With the .tsbuildinfo deleted, the same source compiles in 1.8 s with normal allocation. This is the same family as the open issue #3378 (Crash, assigned to @jakebailey) and the closed #2917 / #2987 (template-literal recursion, fixed), but it involves a different recursive-type family (tuple/union with variadic tails) and a different surface (declaration-emit path, not LSP crash).

Environment

  • tsgo version: 7.0.0-dev.20260509.2
  • Platform: Darwin 25.5.0, arm64 (M-series Mac), 10 hardware threads
  • Codebase: Bun workspace monorepo. The affected package has ~50 source files referencing the recursive type alias; ~110 additional files across the monorepo import it transitively. tsgo resolves ~8.4k files into its compilation scope (workspace deps + their transitive node_modules).
  • tsconfig.json: incremental: true, noEmit: true, declaration: false, composite: false, isolatedDeclarations: false (no declaration emit requested anywhere in the monorepo).

Profiles

Captured with tsgo --pprofDir ./profile. Attaching four .pb.gz files:

File                        Description
teammate-memprofile.pb.gz   The leak. 43.59 GB allocated, DeepCloneNode 74.7% cumulative.
teammate-cpuprofile.pb.gz   The leak. 308.76 s @ ~10 cores; 90%+ in runtime.gcDrain / runtime.scanObject; GC-thrashed, not compute-bound.
control-memprofile.pb.gz    Same source, .tsbuildinfo deleted before invocation. 3.22 GB total allocs, zero DeepCloneNode samples.
control-cpuprofile.pb.gz    Same source, .tsbuildinfo deleted. 1.77 s total.

Memprofile — top cumulative (teammate, leak)

File: tsgo                          Type: alloc_space
Total: 43.59 GB                     Showing top of 151 nodes

      flat  flat%   sum%        cum   cum%
         0     0%     0%    32.56GB 74.70%  ast.(*NodeVisitor).VisitNodes
    1.94GB  4.45%  4.45%    32.56GB 74.70%  ast.(*NodeVisitor).VisitSlice
         0     0%  4.45%    32.55GB 74.68%  ast.(*Node).VisitEachChild (inline)
         0     0%  4.45%    32.55GB 74.68%  ast.(*NodeFactory).DeepCloneNode.getDeepCloneVisitor.func{1,2}
         0     0%  4.45%    32.55GB 74.68%  ast.(*TupleTypeNode).VisitEachChild
         0     0%  4.45%    32.55GB 74.68%  ast.(*UnionTypeNode).VisitEachChild
         0     0%  4.45%    32.31GB 74.12%  ast.(*NodeVisitor).VisitNode
         0     0%  4.45%    32.23GB 73.94%  ast.(*ArrayTypeNode).VisitEachChild
         0     0%  4.45%    32.20GB 73.88%  ast.(*RestTypeNode).VisitEachChild
         0     0%  4.45%    22.49GB 51.59%  ast.(*Node).Clone
         0     0%  4.45%    22.23GB 51.01%  ast.updateNode
         0     0%  4.45%    22.23GB 51.01%  printer.(*EmitContext).SetOriginal (inline)
    9.08GB 20.83% 25.28%    22.23GB 51.01%  printer.(*EmitContext).SetOriginalEx
         0     0% 25.28%    20.31GB 46.59%  ast.(*KeywordTypeNode).Clone
         0     0% 25.28%    18.02GB 41.33%  ast.cloneNode

Read of the breakdown:

  • 74.7% of allocations are inside DeepCloneNode's recursive visitor, walking a TupleType → UnionType → ArrayType → RestType → LiteralType chain. That is the exact AST shape of the recursive type alias described below.
  • 51% is EmitContext.SetOriginalEx and its backing LinkStore, the data structure that records original → clone pointers for every node a deep clone produces. Growing its backing slice via slices.Grow accounts for roughly 9 GB.
  • Flat allocation top hitter is SetOriginalEx at 9.08 GB. The leak is EmitContext metadata, not the AST nodes themselves.
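
To illustrate why link metadata can dominate, here is a toy TypeScript model of a deep clone that records one link entry per cloned node. It is a sketch only: ToyNode, ToyLinkStore, and deepClone are hypothetical names, not tsgo's types, and the assumption that the store keeps every clone reachable is ours rather than something the profile proves.

```typescript
// Toy model only. None of these names exist in tsgo.
interface ToyNode {
  kind: string;
  children: ToyNode[];
}

class ToyLinkStore {
  // One entry per clone, pointing back at the node it was cloned from.
  private readonly originals = new Map<ToyNode, ToyNode>();

  setOriginal(clone: ToyNode, original: ToyNode): void {
    this.originals.set(clone, original);
  }

  get size(): number {
    return this.originals.size;
  }
}

// Recursively copies a node and registers a link for every copy made.
function deepClone(node: ToyNode, links: ToyLinkStore): ToyNode {
  const clone: ToyNode = {
    kind: node.kind,
    children: node.children.map((child) => deepClone(child, links)),
  };
  links.setOriginal(clone, node);
  return clone;
}

function countNodes(node: ToyNode): number {
  return 1 + node.children.reduce((sum, child) => sum + countNodes(child), 0);
}

const keyword = (): ToyNode => ({ kind: "Keyword", children: [] });
const tuple = (...children: ToyNode[]): ToyNode => ({ kind: "Tuple", children });

// A small Union -> Tuple -> Rest -> Array shape, loosely echoing the profile.
const alias: ToyNode = {
  kind: "Union",
  children: [
    keyword(),
    tuple(keyword(), keyword()),
    tuple(keyword(), { kind: "Rest", children: [{ kind: "Array", children: [keyword()] }] }),
  ],
};

const links = new ToyLinkStore();
const sites = 1000; // hypothetical number of places the type is serialized
for (let i = 0; i < sites; i++) {
  deepClone(alias, links);
}
console.log(countNodes(alias), links.size);
```

In this model the store grows as (nodes in the type) × (clone sites), and because the map holds each clone, none of the clones become collectable while the store is alive.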

CPU profile — top cumulative (teammate, leak)

Duration: 308.76s, Total samples = 3139.73s (1016.88% — 10-core saturation)

      flat  flat%   sum%        cum   cum%
     0.11s 0.0035% 0.0035%  2975.16s 94.76%  runtime.systemstack
     0.12s 0.0038% 0.0073%  2852.47s 90.85%  runtime.gcBgMarkWorker.func2
     5.84s  0.19%  0.19%    2852.13s 90.84%  runtime.gcDrain
     0.83s 0.026%  0.22%    2841.11s 90.49%  runtime.gcBgMarkWorker
   644.89s 20.54% 20.76%    2603.14s 82.91%  runtime.scanObject
   603.13s 19.21% 39.97%     860.52s 27.41%  runtime.tryDeferToSpanScan
   440.26s 14.02% 53.99%     623.90s 19.87%  runtime.findObject

90% of CPU is in the GC's mark/scan path. The process is heap-thrashed; allocation rate exceeds GC throughput, so GOMEMLIMIT doesn't bound it.
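
The saturation and GC-share figures can be rechecked from the numbers in the profile header and table. The snippet below is just that arithmetic; the variable names are ours.

```typescript
// Values copied from the CPU profile above.
const wallSeconds = 308.76; // "Duration"
const sampledCpuSeconds = 3139.73; // "Total samples"
const gcDrainCumSeconds = 2852.13; // cumulative time under runtime.gcDrain

// Sampled CPU time divided by wall time gives average core utilization.
const utilizationPct = (sampledCpuSeconds / wallSeconds) * 100;

// Share of all sampled CPU time spent under the GC drain loop.
const gcSharePct = (gcDrainCumSeconds / sampledCpuSeconds) * 100;

console.log(`utilization: ${utilizationPct.toFixed(2)}%`);
console.log(`GC share: ${gcSharePct.toFixed(2)}%`);
```

Both values agree with the profile: roughly ten cores busy on average, with about nine tenths of that time under runtime.gcDrain.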

Control profile (same code, no .tsbuildinfo)

File: tsgo                          Type: alloc_space
Total: 3.22 GB                      Total time: 1.77 s

Top of cumulative is normal type-checker work:
- checker.(*Relater).isRelatedToEx                36.75%
- checker.(*Checker).checkTypeRelatedToEx         36.36%
- checker.(*Relater).recursiveTypeRelatedTo       35.21%
- checker.(*Relater).structuredTypeRelatedTo      35.07%

No samples in DeepCloneNode (go tool pprof -focus DeepCloneNode returns "focus expression matched no samples"). The same source, with .tsbuildinfo removed, finishes in 1.77 seconds and never enters the deep-clone path.
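
For scale, the leak and control runs can be compared directly. This is plain arithmetic on the totals reported above; nothing here is measured separately.

```typescript
// Totals copied from the two profile pairs above.
const leak = { allocatedGB: 43.59, wallSeconds: 308.76 };
const control = { allocatedGB: 3.22, wallSeconds: 1.77 };

// How many times more the leaking run allocated, and how much longer it ran
// before it was killed (so the time ratio is a lower bound).
const allocationRatio = leak.allocatedGB / control.allocatedGB;
const timeRatio = leak.wallSeconds / control.wallSeconds;

console.log(`allocations: ~${allocationRatio.toFixed(1)}x`);
console.log(`wall time: ~${timeRatio.toFixed(0)}x (lower bound)`);
```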

The type pattern

The package defines (anonymized; original identifiers replaced with neutral tokens; structure unchanged):

type Doc =
  | string
  | { [k: string]: Doc }
  | readonly ["array", Doc]
  | readonly ["array", Doc, { length: number }]
  | readonly ["array", Doc, { min?: number; max?: number }]
  | readonly ["union", Doc, ...Doc[]];

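A six-alternate self-referential union with two tagged-tuple families: three ["array", Doc, …] alternates (no options, { length }, or { min, max }) and one ["union", Doc, ...Doc[]] alternate, which is the only one that uses a rest element in the anonymized declaration. The AST shape Tuple→Union→Array→Rest→Literal reported in the pprof maps 1:1 onto this declaration.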

This is the type pattern of a tagged-tuple JSON DSL — same family as Schema.declare(...)'s recursive shapes, structural-editor ASTs, blockchain ABI encodings, ts-json-schema generators, etc. ~50 files within the affected package reference Doc directly; ~110 across the wider monorepo do.
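
For readers unfamiliar with the pattern, the sketch below shows values that conform to the Doc alias and a small untyped helper that measures nesting. The values and the depth helper are illustrative only and do not come from the affected codebase.

```typescript
type Doc =
  | string
  | { [k: string]: Doc }
  | readonly ["array", Doc]
  | readonly ["array", Doc, { length: number }]
  | readonly ["array", Doc, { min?: number; max?: number }]
  | readonly ["union", Doc, ...Doc[]];

// Hypothetical example values, one per style of alternate.
const leaf: Doc = "string";
const listOfStrings: Doc = ["array", "string"];
const fixedList: Doc = ["array", "string", { length: 3 }];
const boundedList: Doc = ["array", listOfStrings, { min: 1, max: 8 }];
const either: Doc = ["union", "string", listOfStrings, boundedList];
const record: Doc = { name: leaf, tags: either };

// Nesting depth of any JSON-like value: arrays and objects add one level.
function depth(value: unknown): number {
  if (Array.isArray(value)) {
    return 1 + Math.max(0, ...value.map((item) => depth(item)));
  }
  if (typeof value === "object" && value !== null) {
    return 1 + Math.max(0, ...Object.values(value).map((item) => depth(item)));
  }
  return 0;
}

console.log(JSON.stringify(fixedList));
console.log(depth(either), depth(record));
```

Because every alternate except string contains Doc again, even small schemas nest several levels deep, which is the property the deep-clone visitor has to walk.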

What appears to trigger it

Three conditions seem necessary together. We could not produce the leak with any two of the three:

  1. A self-recursive type alias with multiple variadic-tuple alternates (above).
  2. A non-trivial set of internal consumers + cross-package imports, such that tsgo materializes the type as an AST and deep-clones it across many sites (e.g. for declaration diagnostics, which run even when declaration: false / noEmit: true; see the stack in #3378, "LSP request failure for diagnostics during type serialization").
  3. A pre-existing .tsbuildinfo whose cached symbol/type signatures have desynchronized from the current source. Anything that produces this works: adding a new union alternate to the recursive type; bulk-renaming a heavily-referenced identifier; upgrading @typescript/native-preview without clearing the buildinfo.

When all three are present, tsgo's NodeBuilderImpl.serializeTypeForDeclaration → typeToTypeNode → DeepCloneNode path runs across the dependency graph and EmitContext's LinkStore grows linearly with allocations.

The same code path appears in the stack for #3378 (serializeTypeForDeclaration → typeToTypeNode → DeepCloneNode), which was reported as an LSP crash. Our reproduction is non-LSP — tsgo invoked directly with --pprofDir.
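
Since removing the .tsbuildinfo avoided the runaway in our control run, a possible stopgap is to clear stale build-info files before invoking tsgo. The Node script below is a minimal sketch under that assumption: findStaleBuildInfo is a hypothetical helper of ours, it matches files purely by the .tsbuildinfo suffix, and it defaults to a dry run.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Walks `root`, skipping node_modules and .git, and collects *.tsbuildinfo files.
// With dryRun = false the files are also deleted.
function findStaleBuildInfo(root: string, dryRun = true): string[] {
  const found: string[] = [];
  const pending: string[] = [root];

  while (pending.length > 0) {
    const dir = pending.pop() as string;
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        if (entry.name !== "node_modules" && entry.name !== ".git") {
          pending.push(fullPath);
        }
      } else if (entry.name.endsWith(".tsbuildinfo")) {
        found.push(fullPath);
        if (!dryRun) {
          fs.rmSync(fullPath);
        }
      }
    }
  }
  return found;
}
```

Calling findStaleBuildInfo(".") lists candidates without touching them; passing false as the second argument deletes them. This only sidesteps condition 3 and does not address the underlying deep-clone path.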

Cross-references

  • microsoft/typescript-go#3378 — same serializeTypeForDeclaration → typeToTypeNode → DeepCloneNode stack; reported as LSP diagnostics crash.
  • microsoft/typescript-go#2917 / #2987 — recursive template-literal memory leak (closed, fixed via truncation check in conditionalTypeToTypeNode). Different recursive-type family from ours.
  • microsoft/typescript-go#1622 — large-monorepo extreme memory; mitigations --singleThreaded, --builders 1, GOMEMLIMIT.

Disclosure

This issue body and the supporting analysis were drafted with assistance from Claude (Anthropic), per CONTRIBUTING.md's AI-disclosure requirement. The pprofs were captured against a real codebase; the type-alias shape shown above is anonymized but structurally identical to the original. I have read the analysis, understand it, and will iterate on any follow-up the maintainers raise.
