Description
Problem
pkg/tcpip/transport/tcp/sack_scoreboard.go uses the old interface-based btree.BTree with btree.Item. Every call to IsSACKED, IsRangeLost, Insert, and Delete boxes header.SACKBlock values into the btree.Item interface, causing heap allocations on every operation.
Under packet loss with active SACK processing, this creates massive allocation pressure. In our production workload (gvisor TCP stack used as a userspace network stack for WireGuard tunnels), pprof heap profiling shows:
- SACKScoreboard.IsSACKED: 1055 MB allocated in 30 seconds (35% of all allocations)
- SACKScoreboard.IsRangeLost: 343 MB allocated in 30 seconds (12% of all allocations)
These are the top two allocators in the entire process and directly led to OOM kills under sustained traffic with packet loss.
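The boxing cost described above can be reproduced in isolation. The following is a minimal sketch using only the standard library: `block` and `item` are hypothetical stand-ins for header.SACKBlock and btree.Item, and the package-level sinks exist only to force the boxed value to escape. Whether a particular conversion allocates in real code depends on escape analysis, so treat this as an illustration of the mechanism rather than a measurement of gvisor itself.

```go
package main

import (
	"fmt"
	"testing"
)

// block is a hypothetical stand-in for header.SACKBlock.
type block struct {
	Start, End uint32
}

// item mimics an interface-based element type such as btree.Item.
type item interface {
	Less(than item) bool
}

// Less implements item; the type assertion is the "unboxing" step.
func (b block) Less(than item) bool { return b.Start < than.(block).Start }

// Package-level sinks keep results alive so the compiler cannot drop the work.
var (
	sinkItem item
	sinkBool bool
)

// lessG compares two values through a typed less function, with no interface involved.
func lessG[T any](a, b T, less func(a, b T) bool) bool { return less(a, b) }

// boxedAllocs reports the average allocations per conversion of a block to item.
func boxedAllocs() float64 {
	n := uint32(1 << 20)
	return testing.AllocsPerRun(1000, func() {
		n++
		sinkItem = block{n, n + 1} // value is copied to the heap to fit the interface
	})
}

// genericAllocs reports the average allocations per comparison on the typed path.
func genericAllocs() float64 {
	n := uint32(1 << 20)
	less := func(a, b block) bool { return a.Start < b.Start }
	return testing.AllocsPerRun(1000, func() {
		n++
		sinkBool = lessG(block{n, n + 1}, block{n + 1, n + 2}, less)
	})
}

func main() {
	fmt.Println("allocs per op, interface path:", boxedAllocs())
	fmt.Println("allocs per op, generic path:  ", genericAllocs())
}
```

The interface path is expected to report at least one allocation per operation and the generic path none, which is the difference the migration is meant to capture.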
Proposed Fix
The google/btree library (currently at v1.1.2 in gvisor's go.mod) has shipped btree.BTreeG[T] — a generic version that eliminates interface boxing — since v1.1.0 (Go 1.18+, 2022).
The migration in sack_scoreboard.go is mechanical:
// Before (current)
ranges *btree.BTree
ranges: btree.New(defaultBtreeDegree)
s.ranges.DescendLessOrEqual(r, func(i btree.Item) bool {
sacked := i.(header.SACKBlock)
// After (proposed)
ranges *btree.BTreeG[header.SACKBlock]
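ranges: btree.NewG[header.SACKBlock](defaultBtreeDegree, func(a, b header.SACKBlock) bool { return a.Start.LessThan(b.Start) }) // NewG also requires a less function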
s.ranges.DescendLessOrEqual(r, func(sacked header.SACKBlock) bool {
This would eliminate all interface boxing/unboxing allocations in the SACK scoreboard hot path, turning the top two allocators into zero-allocation operations.
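To show what the typed callback looks like end to end, here is a self-contained sketch of an IsSACKED-style lookup. It is not the gvisor implementation: `sackBlock` and `scoreboard` are hypothetical names, a sorted slice stands in for btree.BTreeG[header.SACKBlock] so the example needs no third-party module, and plain uint32 comparison is used, so sequence-number wraparound is ignored.

```go
package main

import (
	"fmt"
	"sort"
)

// sackBlock is a hypothetical stand-in for header.SACKBlock covering [Start, End).
type sackBlock struct {
	Start, End uint32
}

// contains reports whether b fully covers r.
func (b sackBlock) contains(r sackBlock) bool {
	return b.Start <= r.Start && r.End <= b.End
}

// scoreboard keeps non-overlapping blocks sorted by Start.
// The sorted slice is a stand-in for the generic b-tree.
type scoreboard struct {
	ranges []sackBlock
}

// insert places b at its sorted position (adjacent blocks are not merged).
func (s *scoreboard) insert(b sackBlock) {
	i := sort.Search(len(s.ranges), func(i int) bool { return s.ranges[i].Start >= b.Start })
	s.ranges = append(s.ranges, sackBlock{})
	copy(s.ranges[i+1:], s.ranges[i:])
	s.ranges[i] = b
}

// descendLessOrEqual visits blocks with Start <= pivot.Start in descending
// order until fn returns false. The callback receives a typed value, so no
// interface conversion or type assertion is needed.
func (s *scoreboard) descendLessOrEqual(pivot sackBlock, fn func(sackBlock) bool) {
	n := sort.Search(len(s.ranges), func(i int) bool { return s.ranges[i].Start > pivot.Start })
	for i := n - 1; i >= 0; i-- {
		if !fn(s.ranges[i]) {
			return
		}
	}
}

// isSACKED reports whether r is fully covered by a single stored block.
func (s *scoreboard) isSACKED(r sackBlock) bool {
	found := false
	s.descendLessOrEqual(r, func(sacked sackBlock) bool {
		if sacked.End <= r.Start {
			return false // this block and everything before it ends too early
		}
		if sacked.contains(r) {
			found = true
			return false
		}
		return true
	})
	return found
}

func main() {
	s := &scoreboard{}
	s.insert(sackBlock{Start: 300, End: 400})
	s.insert(sackBlock{Start: 100, End: 200})
	fmt.Println(s.isSACKED(sackBlock{Start: 120, End: 180})) // covered by [100, 200)
	fmt.Println(s.isSACKED(sackBlock{Start: 180, End: 220})) // extends past 200
}
```

With the real library the traversal body would stay the same; only the container changes from the slice to the generic tree.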
pprof Evidence
Allocation diff over 30 seconds (two heap samples):
flat        cum
1055.02MB   1055.02MB   gvisor.dev/gvisor/pkg/tcpip/transport/tcp.(*SACKScoreboard).IsSACKED
 342.51MB    342.51MB   gvisor.dev/gvisor/pkg/tcpip/transport/tcp.(*SACKScoreboard).IsRangeLost
  58.50MB    800.52MB   gvisor.dev/gvisor/pkg/tcpip/transport/tcp.(*sender).SetPipe
All of this flat allocation is from btree.Item interface boxing inside the btree traversal callbacks.
Environment
- gvisor version: v0.0.0-20260122175437-89a5d21be8f0
- google/btree version: v1.1.2
- Go version: 1.25.6
- Workload: userspace TCP/IP stack for WireGuard tunnel (xray-core)