feat(benchmark): add fgprof, block/mutex profiling and improve profile docs by pdrobnjak · Pull Request #2886 · sei-protocol/sei-chain

pdrobnjak · 2026-02-13T13:17:59Z

Summary

Add fgprof (wall-clock profiler) to capture off-CPU time (I/O, blocking, GC pauses) invisible to Go's standard CPU profiler. Registered on DefaultServeMux behind benchmark build tag only — no production impact.
Enable block and mutex contention profiling via runtime.SetBlockProfileRate and runtime.SetMutexProfileFraction, also gated behind benchmark build tag. Uses conservative sampling rates (1us block threshold, 1/5 mutex fraction) to minimize TPS measurement overhead.
Update benchmark-compare.sh to auto-capture all 6 profile types (CPU, fgprof, heap, goroutine, block, mutex) midway through runs instead of just CPU + heap.
Add DURATION env var to benchmark.sh (default 120s, 0 = run forever). When set, runs seid in the background, auto-captures all 6 profiles midway, extracts TPS stats (median/avg/min/max), and exits cleanly — enabling fully automated single-scenario profiling.
Fix CPU/fgprof profile corruption: Switch from parallel background captures to sequential execution. Go's CPU profiler (SIGPROF) and fgprof (runtime.GoroutineProfile) conflict when running concurrently on the same process, producing empty or corrupted profiles. Also measure actual capture duration for accurate remaining-time calculation, and make BASE_DIR overridable so benchmark-compare.sh routes profiles to per-label directories correctly.
Expand benchmark/CLAUDE.md with profile type reference table, CPU-vs-fgprof guidance, heap metric selection guide, interactive flamegraph docs, and a full optimization loop workflow (profile → analyze → discuss → implement → compare → validate → PR).
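As a rough illustration of the block and mutex settings described above, the sketch below applies the two runtime calls with the stated rates. It assumes the "1us block threshold" corresponds to a rate of 1000, since `runtime.SetBlockProfileRate` takes nanoseconds. The constant and function names are hypothetical, and the `benchmark` build constraint used in the PR is omitted so the snippet compiles on its own.

```go
package main

import (
	"fmt"
	"runtime"
)

const (
	// Assumed mapping of the "1us block threshold": the rate is in nanoseconds,
	// and the profiler samples on average one blocking event per rate ns blocked.
	blockProfileRateNs = 1000
	// With a fraction of 5, on average 1 in 5 mutex contention events is reported.
	mutexProfileFraction = 5
)

// enableContentionProfiling turns on block and mutex profiling and returns
// the mutex fraction that was in effect before the call.
func enableContentionProfiling() int {
	runtime.SetBlockProfileRate(blockProfileRateNs)
	return runtime.SetMutexProfileFraction(mutexProfileFraction)
}

// currentMutexFraction reads the active mutex fraction without changing it
// (a negative argument only queries the current value).
func currentMutexFraction() int {
	return runtime.SetMutexProfileFraction(-1)
}

func main() {
	prev := enableContentionProfiling()
	fmt.Printf("mutex fraction: %d -> %d\n", prev, currentMutexFraction())
}
```

In a real build, a file like this would carry a `//go:build benchmark` constraint so the calls never run in production binaries.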

Test plan

Verify go build ./sei-tendermint/node/ succeeds (no fgprof in non-benchmark build)
Verify go build -tags benchmark ./sei-tendermint/node/ succeeds (fgprof registered)
Verify go build -tags benchmark ./app/ succeeds (block/mutex profiling enabled)
Run DURATION=90 benchmark/benchmark.sh and confirm auto-stop, profile capture, and TPS extraction
Verify all 6 profile files in /tmp/sei-bench/pprof/ with non-zero sizes
Verify go tool pprof -top works on each captured profile type
Verify sequential capture produces valid CPU and fgprof profiles (previously corrupted when captured in parallel)

🤖 Generated with Claude Code

…e docs

Add wall-clock profiling (fgprof) alongside standard CPU profiling to capture off-CPU time (I/O, blocking, GC pauses). Register the fgprof handler behind the benchmark build tag so production binaries are unaffected.

Enable block and mutex contention profiling via runtime calls, also gated behind the benchmark build tag. Use conservative sampling rates (1us block threshold, 1/5 mutex fraction) to minimize overhead on TPS.

Update benchmark-compare.sh to capture all 6 profile types (CPU, fgprof, heap, goroutine, block, mutex) and report sizes for each.

Expand benchmark/CLAUDE.md with:
- Profile type reference table with when-to-use guidance
- CPU vs fgprof explanation
- Heap metric selection guide (inuse_space vs alloc_objects etc)
- Interactive flamegraph and drill-down commands
- Single-scenario manual capture examples
- Source-mapping tip for pprof

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-02-13T13:18:58Z

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`✅ passed`	`✅ passed`	`✅ passed`	Feb 17, 2026, 3:03 PM

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

codecov · 2026-02-13T13:34:12Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.17%. Comparing base (127ed99) to head (7854f49).

Additional details and impacted files

@@                   Coverage Diff                    @@
##           pd/benchmark-compare    #2886      +/-   ##
========================================================
- Coverage                 57.18%   57.17%   -0.01%     
========================================================
  Files                      2091     2091              
  Lines                    171179   171173       -6     
========================================================
- Hits                      97891    97872      -19     
- Misses                    64578    64593      +15     
+ Partials                   8710     8708       -2

Flag	Coverage Δ
sei-chain	`52.63% <ø> (-0.02%)`	⬇️
sei-cosmos	`48.16% <ø> (+0.02%)`	⬆️
sei-db	`68.72% <ø> (ø)`

Flags with carried forward coverage won't be shown.
see 34 files with indirect coverage changes

…extraction benchmark.sh now runs for DURATION seconds (default 120), auto-captures all 6 profile types midway, extracts TPS stats, and exits cleanly. DURATION=0 preserves the original run-forever behavior. Also documents the full optimization loop workflow in CLAUDE.md. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
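The TPS extraction step mentioned above reports median, average, minimum, and maximum. A minimal sketch of that arithmetic follows. How the script itself computes these values is not shown in the PR, so the even-length median rule (mean of the two middle readings) and all names here are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// tpsStats holds the four summary values named in the PR description.
type tpsStats struct {
	Median, Avg, Min, Max float64
}

// summarizeTPS computes summary statistics over TPS readings.
// The second return value is false when there are no readings.
func summarizeTPS(readings []float64) (tpsStats, bool) {
	if len(readings) == 0 {
		return tpsStats{}, false
	}
	// Sort a copy so the caller's slice keeps its original order.
	sorted := append([]float64(nil), readings...)
	sort.Float64s(sorted)

	sum := 0.0
	for _, v := range sorted {
		sum += v
	}

	mid := len(sorted) / 2
	median := sorted[mid]
	if len(sorted)%2 == 0 {
		// Assumed convention: average the two middle readings.
		median = (sorted[mid-1] + sorted[mid]) / 2
	}

	return tpsStats{
		Median: median,
		Avg:    sum / float64(len(sorted)),
		Min:    sorted[0],
		Max:    sorted[len(sorted)-1],
	}, true
}

func main() {
	stats, ok := summarizeTPS([]float64{300, 100, 200, 400})
	fmt.Println(stats, ok)
}
```

The median is usually the more robust figure for comparing runs, since a few warm-up or stalled readings can pull the average noticeably.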

… corruption Go's CPU profiler and fgprof conflict when running concurrently on the same process, producing empty or corrupted profiles. Switch from parallel background captures to sequential execution (CPU first, then fgprof), measure actual capture duration for accurate remaining-time calculation, and make BASE_DIR overridable so benchmark-compare.sh can route profiles to per-label directories. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
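The sequential-capture fix can be sketched as follows, assuming each capture is a blocking call that returns when its profile is complete. The actual scripts are shell; this Go version only illustrates the control flow and the remaining-time arithmetic, and every name in it is hypothetical.

```go
package main

import "fmt"

// captureFunc fetches one named profile and blocks until it is complete.
type captureFunc func(name string) error

// captureSequentially runs captures one at a time, in order, so that two
// profilers (for example CPU and fgprof) never sample the process concurrently.
func captureSequentially(names []string, capture captureFunc) error {
	for _, name := range names {
		if err := capture(name); err != nil {
			return fmt.Errorf("capture %s: %w", name, err)
		}
	}
	return nil
}

// remainingSeconds returns how long the run should continue after profiling,
// based on the measured capture time rather than an assumed one.
func remainingSeconds(total, elapsedBeforeCapture, captureElapsed int) int {
	left := total - elapsedBeforeCapture - captureElapsed
	if left < 0 {
		return 0
	}
	return left
}

func main() {
	order := []string{"cpu", "fgprof", "heap", "goroutine", "block", "mutex"}
	err := captureSequentially(order, func(name string) error {
		fmt.Println("capturing", name)
		return nil
	})
	fmt.Println("error:", err, "remaining:", remainingSeconds(120, 60, 35))
}
```

Running captures back to back makes the capture phase longer than a parallel one, which is why the remaining time is derived from the measured elapsed time.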

## Summary

- Namespace `BASE_DIR` per run via `RUN_ID` (defaults to PID): `/tmp/sei-bench-${RUN_ID}/`
- Auto-claim port offset slots via atomic `mkdir` (supports 30 concurrent runs, zero coordination)
- Replace `git checkout` with git worktrees for isolated builds (no working tree collisions)
- Replace `~/go/bin/seid` with `GOBIN`-based builds per label (no binary collisions)
- Replace `~/.sei` staging with `mktemp` + `--home` on all `seid` commands (no init collisions)
- Pass `SEI_HOME_DIR`/`SEID_BIN` env vars to `populate_genesis_accounts.py` (backward-compatible defaults)
- Fix pre-existing double lifecycle bug by passing `DURATION=0` to child start-phase

## Test plan

- [x] Syntax check: `bash -n` on both shell scripts, `py_compile` on Python
- [x] Two concurrent `benchmark-compare.sh` runs with `DURATION=120` — both completed, separate `BASE_DIR`s, no port conflicts
- [x] All 6 profile types captured for all 4 nodes (CPU ~145KB, fgprof ~115KB, heap ~248KB, etc.)
- [x] TPS data collected (36-37 readings per node)
- [x] `pprof -diff_base` produces valid analyzable output from both runs
- [x] Port slot locks cleaned up after exit
- [x] Git worktrees cleaned up after exit
- [x] Backward compatible — no env vars needed for single-instance usage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
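The atomic `mkdir` slot claim works because directory creation either succeeds for exactly one caller or fails with "already exists", so no separate lock is needed. The scripts do this in shell; the Go sketch below shows the same idea under assumed names (the `slot-N` directory layout and all function names are hypothetical, while the limit of 30 slots comes from the summary above).

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// maxSlots mirrors the "30 concurrent runs" limit stated in the summary.
const maxSlots = 30

// slotDir returns the (hypothetical) lock directory path for a slot.
func slotDir(lockRoot string, slot int) string {
	return filepath.Join(lockRoot, fmt.Sprintf("slot-%d", slot))
}

// claimSlot tries each slot in turn. os.Mkdir fails if the directory already
// exists, so only one process can win any given slot.
func claimSlot(lockRoot string) (int, error) {
	for slot := 0; slot < maxSlots; slot++ {
		err := os.Mkdir(slotDir(lockRoot, slot), 0o755)
		if err == nil {
			return slot, nil
		}
		if !errors.Is(err, fs.ErrExist) {
			return -1, err // unexpected failure, e.g. permissions
		}
	}
	return -1, errors.New("no free slot")
}

// releaseSlot removes the lock directory so another run can claim the slot.
func releaseSlot(lockRoot string, slot int) error {
	return os.Remove(slotDir(lockRoot, slot))
}

// newLockRoot creates a fresh temporary directory to hold slot locks.
func newLockRoot() (string, error) {
	return os.MkdirTemp("", "bench-slots-")
}

func main() {
	root, err := newLockRoot()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	first, _ := claimSlot(root)
	second, _ := claimSlot(root)
	fmt.Println("claimed slots:", first, second)
}
```

A crashed run that never releases its directory leaves the slot occupied, which is why the release has to happen in every exit path.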

## Summary

- `benchmark.sh` now auto-claims a port offset slot (same atomic `mkdir` mechanism as `benchmark-compare.sh`) when `PORT_OFFSET` is not explicitly set
- Prevents port collisions between concurrent standalone `benchmark.sh` runs and stale `seid` processes from crashed runs
- When auto-claiming, `SEI_HOME` is also isolated to `$HOME/.sei-bench-<offset>` to avoid data directory collisions
- Port slot is released in all exit paths (staging cleanup, seid cleanup trap, and normal exit)
- When `PORT_OFFSET` is explicitly set by the caller (e.g., from `benchmark-compare.sh`), behavior is unchanged

## Test plan

- [x] Run two concurrent `benchmark.sh` invocations — both should auto-claim different port slots and run without collisions
- [x] Run `benchmark-compare.sh` (which passes explicit `PORT_OFFSET`) — should still work as before
- [x] Kill a `benchmark.sh` mid-run — port slot should be released by the trap handler
- [x] Syntax check: `bash -n benchmark/benchmark.sh`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

## Summary

- `benchmark-compare.sh` slot 0 previously mapped to `RUN_PORT_OFFSET=0`, which uses the same default ports as standalone `benchmark.sh` (offset 0)
- If a stale `seid` from a previous standalone run is holding those ports, the baseline node in a compare run panics with `bind: address already in use`
- Fix: change slot-to-offset mapping from `slot * 1000` to `1000 + slot * 1000`, so compare runs start at offset 1000+ and never overlap with standalone benchmark default ports

Complements #2900 which added auto-claim port offsets to `benchmark.sh` itself.

## Test plan

- [x] Run `benchmark-compare.sh` — first slot should claim offset 1000, not 0
- [x] Run standalone `benchmark.sh` concurrently with `benchmark-compare.sh` — no port collisions
- [x] Multiple concurrent `benchmark-compare.sh` invocations still auto-claim non-overlapping slots

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
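The change in slot-to-offset mapping is simple arithmetic, worked through below for the first few slots. The function names are hypothetical; the two formulas are the ones quoted in the summary.

```go
package main

import "fmt"

// slotStride is the spacing between port offsets, per the formulas above.
const slotStride = 1000

// oldOffset is the previous mapping: slot 0 collides with the standalone
// benchmark's default offset of 0.
func oldOffset(slot int) int {
	return slot * slotStride
}

// newOffset is the fixed mapping: compare runs start at offset 1000.
func newOffset(slot int) int {
	return slotStride + slot*slotStride
}

func main() {
	for slot := 0; slot < 3; slot++ {
		fmt.Printf("slot %d: old offset %d, new offset %d\n",
			slot, oldOffset(slot), newOffset(slot))
	}
}
```

With the new mapping, offset 0 is left to standalone `benchmark.sh` runs that use the default ports.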

chore(benchmark): reduce default compare duration from 600s to 120s

069d6a6

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

pdrobnjak self-assigned this Feb 13, 2026

pdrobnjak and others added 2 commits February 13, 2026 14:50

docs(benchmark): open cpu, fgprof, and heap flamegraphs in optimizati…

dcbfd5a

…on loop Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(benchmark): add /optimize command for profiling-driven optimizat…

d22930c

…ion loop Adds a Claude Code command that runs a structured optimization workflow: profile -> analyze -> discuss -> implement -> compare -> validate. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

pdrobnjak added the non-app-hash-breaking label Feb 13, 2026

pdrobnjak requested review from arajasek, codchen and stevenlanders and removed request for stevenlanders February 13, 2026 14:20

pdrobnjak and others added 4 commits February 13, 2026 18:01

arajasek approved these changes Feb 18, 2026

pdrobnjak wants to merge 9 commits into pd/benchmark-compare from pd/benchmark-profiling-improvements
