[EBPF] gpu: replace map with fixed-size array for memory usage tracking#46799

Open
pjo256 wants to merge 3 commits into DataDog:main from pjo256:gpu-perf-map-to-array

Conversation


@pjo256 pjo256 commented Feb 23, 2026

What does this PR do?

Replaces map[memAllocType]uint64 with [memAllocTypeCount]uint64 fixed-size arrays in the GPU monitoring hot path. This affects two locations:

  • kernelSpan.avgMemoryUsage in stream.go — accessed on every CUDA synchronization event
  • memTsBuilders in aggregator.go — used when computing per-process GPU memory stats

If new memory types are added in the future, adding a new constant before memAllocTypeCount in the memAllocType enum is all that's needed.
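The enum-sentinel pattern described above can be sketched as follows (a minimal sketch: the constant names follow the PR description and test code, but the agent's actual definitions may differ):

```go
package main

import "fmt"

// memAllocType enumerates GPU memory allocation kinds. The four values
// correspond to kernel binary, global, shared, and constant memory.
type memAllocType int

const (
	kernelMemAlloc memAllocType = iota
	globalMemAlloc
	sharedMemAlloc
	constantMemAlloc
	memAllocTypeCount // sentinel: always last, sizes the arrays below
)

// kernelSpan sketch: avgMemoryUsage is a fixed-size array instead of
// map[memAllocType]uint64, so every access is direct indexing with no
// hashing and no heap allocation.
type kernelSpan struct {
	avgMemoryUsage [memAllocTypeCount]uint64
}

func main() {
	var span kernelSpan
	span.avgMemoryUsage[sharedMemAlloc] += 1024 // same syntax as the map version
	fmt.Println(span.avgMemoryUsage[sharedMemAlloc])
}
```

Because a new constant added before `memAllocTypeCount` automatically grows the array, no call sites need to change when a memory type is added.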

Motivation

memAllocType is a 4-value enum (kernel binary, global, shared, constant memory). Using a map here adds overhead: hash computation, bucket lookups, and heap allocations on every access. A fixed-size array replaces this with direct indexing.

A benchmarking script shows a 9-23x speedup depending on workload (number of kernel launches between syncs), with allocations dropping from 240 B/op to 64 B/op.

Describe how you validated your changes

Existing unit tests in stream_test.go and stats_test.go already index avgMemoryUsage by memAllocType constants (e.g., span.avgMemoryUsage[sharedMemAlloc]), which works for both arrays and maps.

Wrote standalone microbenchmarks reproducing the exact access patterns of getCurrentKernelSpan and getRawStats, comparing map vs array across varying kernel launch counts.

| Launches | Map | Array | Speedup (per sync call) |
|---|---|---|---|
| 1 | 260 ns | 29 ns | 9.0x |
| 10 | 555 ns | 42 ns | 13.2x |
| 100 | 3.4 μs | 164 ns | 20.5x |
| 500 | 15.9 μs | 700 ns | 22.6x |
| 1000 | 31.5 μs | 1.4 μs | 23.0x |
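A standalone sketch of the access pattern being compared (a hypothetical reproduction, not the author's actual benchmark script): both helpers accumulate per-type memory usage across a number of kernel launches; the array version replaces hash lookups with direct indexed stores.

```go
package main

import "fmt"

const memAllocTypeCount = 4

// accumulateMap mirrors the old map-based hot path: every access hashes
// the key, probes a bucket, and the map itself is heap-allocated.
func accumulateMap(launches int) map[int]uint64 {
	usage := make(map[int]uint64, memAllocTypeCount)
	for i := 0; i < launches; i++ {
		usage[i%memAllocTypeCount] += 64
	}
	return usage
}

// accumulateArray mirrors the new array-based path: a stack-allocated
// fixed-size array with plain indexed stores.
func accumulateArray(launches int) [memAllocTypeCount]uint64 {
	var usage [memAllocTypeCount]uint64
	for i := 0; i < launches; i++ {
		usage[i%memAllocTypeCount] += 64
	}
	return usage
}

func main() {
	m, a := accumulateMap(100), accumulateArray(100)
	for t := 0; t < memAllocTypeCount; t++ {
		fmt.Println(t, m[t], a[t]) // both paths produce identical totals
	}
}
```

Wrapping each helper in a `testing.B` loop and running `go test -bench .` reproduces the shape of the comparison in the table above.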

In a production setting serving an LLM like Qwen3.5-397B-A17B, we might expect 1500-3000+ kernel launches per forward pass and 10+ forward passes per second, so this can cumulatively save ~20+ ms of agent CPU time per second.

Signed-off-by: Philip Ottesen <phiott256@gmail.com>
@pjo256 pjo256 requested a review from a team as a code owner February 23, 2026 13:13

github-actions bot commented Feb 23, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.


pjo256 commented Feb 23, 2026

I have read the CLA Document and I hereby sign the CLA

@gjulianm
Hi @pjo256. At first look this seems good; I'll run the full test pipeline and do a full review later.

Thanks a lot!

Signed-off-by: Philip Ottesen <phiott256@gmail.com>

pjo256 commented Feb 23, 2026

@gjulianm Thanks! I've added a reno release note in the latest commit, let me know if anything else needs adjusting.

@gjulianm gjulianm added qa/done QA done before merge and regressions are covered by tests changelog/no-changelog labels Feb 24, 2026 — with Graphite App
@gjulianm
Hi @pjo256, CI is green. Just one minor change: could you remove the changelog note? We don't usually write them for changes like these to avoid having a massive changelog :D

Signed-off-by: Philip Ottesen <phiott256@gmail.com>

pjo256 commented Feb 24, 2026

@gjulianm Done! I had seen a failing release-notes check earlier and wasn't sure from the Reno docs whether a note was required 👍


Labels

changelog/no-changelog · community · qa/done (QA done before merge and regressions are covered by tests) · team/ebpf-platform
