tcp: migrate SACKScoreboard from btree.BTree to btree.BTreeG to eliminate heap allocations #12596

@quetz

Problem

pkg/tcpip/transport/tcp/sack_scoreboard.go uses the old interface-based btree.BTree with btree.Item. Every call to IsSACKED, IsRangeLost, Insert, and Delete boxes header.SACKBlock values into the btree.Item interface, causing heap allocations on every operation.

Under packet loss with active SACK processing, this creates heavy allocation pressure. In our production workload (the gVisor TCP stack used as a userspace network stack for WireGuard tunnels), pprof heap profiling shows:

  • SACKScoreboard.IsSACKED: 1055 MB allocated in 30 seconds (35% of all allocations)
  • SACKScoreboard.IsRangeLost: 343 MB allocated in 30 seconds (12% of all allocations)

These are the top two allocators in the entire process and directly led to OOM kills under sustained traffic with packet loss.

Proposed Fix

The google/btree library (currently at v1.1.2 in gvisor's go.mod) has shipped btree.BTreeG[T], a generic version that avoids interface boxing, since v1.1.0 (released in 2022, requires Go 1.18+).

The migration in sack_scoreboard.go is mechanical:

// Before (current)
ranges    *btree.BTree
ranges:    btree.New(defaultBtreeDegree)
s.ranges.DescendLessOrEqual(r, func(i btree.Item) bool {
    sacked := i.(header.SACKBlock)

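// After (proposed)
ranges    *btree.BTreeG[header.SACKBlock]
// NewG also requires a less function; ordering by Start is assumed here,
// matching the existing SACKBlock.Less.
ranges:    btree.NewG[header.SACKBlock](defaultBtreeDegree, func(a, b header.SACKBlock) bool {
    return a.Start.LessThan(b.Start)
})
s.ranges.DescendLessOrEqual(r, func(sacked header.SACKBlock) bool {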

This would eliminate all interface boxing/unboxing allocations in the SACK scoreboard hot path, turning the top two allocators into zero-allocation operations.
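
The boxing cost can be reproduced outside gVisor with a small stdlib-only sketch. Block, Item, floorBoxed and floorGeneric below are hypothetical stand-ins (not gVisor or google/btree code) that mimic a descending lookup with an interface-typed pivot versus a generic one, and testing.AllocsPerRun counts heap allocations per call. The go:noinline directive is an assumption made to keep the interface call opaque, roughly as it is across the package boundary into btree.

```go
package main

import (
	"fmt"
	"testing"
)

// Block is a hypothetical stand-in for header.SACKBlock.
type Block struct {
	Start, End uint32
}

// Item mirrors the shape of btree.Item: ordering goes through an interface.
type Item interface {
	Less(than Item) bool
}

// Less orders blocks by Start; the argument has to be unboxed first.
func (b Block) Less(than Item) bool {
	return b.Start < than.(Block).Start
}

// floorBoxed returns the last item that is not greater than pivot.
// The caller must convert its Block to Item, which is where boxing happens.
//
//go:noinline
func floorBoxed(items []Item, pivot Item) (Item, bool) {
	for i := len(items) - 1; i >= 0; i-- {
		if !pivot.Less(items[i]) {
			return items[i], true
		}
	}
	return nil, false
}

// floorGeneric is the same lookup with a type parameter and a less function,
// so the pivot is passed by value and never converted to an interface.
func floorGeneric[T any](items []T, pivot T, less func(a, b T) bool) (T, bool) {
	for i := len(items) - 1; i >= 0; i-- {
		if !less(pivot, items[i]) {
			return items[i], true
		}
	}
	var zero T
	return zero, false
}

// measure reports average heap allocations per lookup for both variants.
func measure() (boxed, generic float64) {
	boxedItems := []Item{Block{0, 10}, Block{20, 30}, Block{40, 50}}
	plainItems := []Block{{0, 10}, {20, 30}, {40, 50}}
	less := func(a, b Block) bool { return a.Start < b.Start }

	// seq makes the pivot a runtime value so the compiler cannot place
	// the boxed struct in static read-only data.
	var seq uint32
	boxed = testing.AllocsPerRun(1000, func() {
		seq++
		floorBoxed(boxedItems, Block{Start: seq % 64, End: seq%64 + 1})
	})
	generic = testing.AllocsPerRun(1000, func() {
		seq++
		floorGeneric(plainItems, Block{Start: seq % 64, End: seq%64 + 1}, less)
	})
	return boxed, generic
}

func main() {
	boxed, generic := measure()
	fmt.Printf("allocs per lookup: boxed=%.0f generic=%.0f\n", boxed, generic)
}
```

With a typical toolchain the boxed variant should report about one allocation per lookup and the generic variant none, though exact counts depend on the compiler's escape analysis and inlining decisions.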

pprof Evidence

Allocation diff over 30 seconds (two heap samples):

     flat        cum
1055.02MB  1055.02MB  gvisor.dev/gvisor/pkg/tcpip/transport/tcp.(*SACKScoreboard).IsSACKED
 342.51MB   342.51MB  gvisor.dev/gvisor/pkg/tcpip/transport/tcp.(*SACKScoreboard).IsRangeLost
  58.50MB   800.52MB  gvisor.dev/gvisor/pkg/tcpip/transport/tcp.(*sender).SetPipe

All of this flat allocation is from btree.Item interface boxing inside the btree traversal callbacks.

Environment

  • gvisor version: v0.0.0-20260122175437-89a5d21be8f0
  • google/btree version: v1.1.2
  • Go version: 1.25.6
  • Workload: userspace TCP/IP stack for WireGuard tunnel (xray-core)
