perf(list): cache ahead/behind counts SHA-keyed; skip the for-each-ref walk on warm runs by max-sixty · Pull Request #2704 · max-sixty/worktrunk

max-sixty · 2026-05-11T09:15:54Z

Summary

wt list --branches computed branch ahead/behind counts with one git for-each-ref --format='%(ahead-behind:BASE)' walk, on every invocation, in the post-skeleton setup scope. That walk is O(commits) — ~1.5s on rust-lang/rust — and it runs serially before the parallel task pool opens, because the per-branch tasks consume its result. So on a large real repo roughly 40% of wt list's wall time was a single-threaded git graph walk that thread count couldn't help with (a follow-up from the picker thread-count investigation in #2683/#2685).

Ahead/behind for (base, branch) is a pure function of the two commit SHAs — content-addressed, never stale — exactly the shape of the existing .git/wt/cache/ SHA-keyed caches. This adds an ahead-behind/ cache kind and rewires the population path:

- New ahead-behind/{base_sha}-{head_sha}.json kind in sha_cache.rs, identical machinery to is-ancestor / diff-stats (LRU-bounded, cleared by wt config state clear, wiped by the bench cache-invalidator).
- Repository::ahead_behind_by_sha is now cache-backed: check → compute_ahead_behind → write.
- RefSnapshot::capture_refs_with_ahead_behind builds the snapshot's ahead/behind map from the cache and only walks the graph for the branches the cache doesn't cover:
  - a few misses → left out of the snapshot map; the per-branch AheadBehindTask recomputes (and caches) them by SHA in the parallel pool. (That task already runs merge_base_by_sha for its orphan check, so the merge-base compute_ahead_behind needs is already primed — the fallback is just two cheap rev-list --count calls.)
  - everything cold (fresh repo / after wt config state clear) → one unscoped for-each-ref %(ahead-behind:BASE_SHA) refs/heads/ walk — a single shared traversal finds every merge-base — then results written to the cache.
  - many cold but not all → one for-each-ref %(ahead-behind:BASE_SHA) scoped to just the missed refnames (git for-each-ref takes a ref list, so it's still one shared traversal, only over the cold subset), results written to the cache.
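
The check → compute → write flow above can be illustrated with a minimal sketch. This is not the code in sha_cache.rs: `AheadBehindCache`, `get_or_compute`, and the in-memory `HashMap` are hypothetical stand-ins for the on-disk `ahead-behind/` kind, and the `compute` closure stands in for the git work. Only the `{base_sha}-{head_sha}` key shape is taken from the description.

```rust
use std::collections::HashMap;

/// (ahead, behind) commit counts of `head` relative to `base`.
type Counts = (usize, usize);

/// Hypothetical in-memory stand-in for the on-disk `ahead-behind/` cache kind.
#[derive(Default)]
struct AheadBehindCache {
    entries: HashMap<String, Counts>,
}

impl AheadBehindCache {
    /// Key mirrors the `{base_sha}-{head_sha}` naming from the description.
    fn key(base_sha: &str, head_sha: &str) -> String {
        format!("{base_sha}-{head_sha}")
    }

    /// check -> compute -> write. `compute` stands in for the expensive git
    /// call and is only invoked on a cache miss.
    fn get_or_compute(
        &mut self,
        base_sha: &str,
        head_sha: &str,
        compute: impl FnOnce() -> Counts,
    ) -> Counts {
        let key = Self::key(base_sha, head_sha);
        if let Some(&hit) = self.entries.get(&key) {
            return hit; // Content-addressed: a hit can never be stale.
        }
        let counts = compute();
        self.entries.insert(key, counts);
        counts
    }
}
```

Because both halves of the key are commit SHAs, there is no invalidation step in the sketch: a branch that moves simply produces a different key.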

The few-vs-many threshold (AHEAD_BEHIND_SCOPED_BATCH_MIN_MISSES) trades the per-pair path's parallelism (it runs in the pool) against the batch's lower total work (one shared base-history traversal); a batch here is serial, so it's only worth it once "many" misses make the amortization clear.
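
A rough sketch of the few-vs-many decision follows. The enum, the function name, the order of the checks, and the threshold value of 8 are all assumptions made for illustration; the description names the constant AHEAD_BEHIND_SCOPED_BATCH_MIN_MISSES but does not state its value.

```rust
/// How to fill in ahead/behind counts the cache did not cover.
#[derive(Debug, PartialEq)]
enum FillStrategy {
    /// Everything was cached: no graph walk at all.
    AllWarm,
    /// Few misses: leave them to the per-branch tasks in the parallel pool.
    PerBranch,
    /// Many (but not all) misses: one walk scoped to the missed refnames.
    ScopedBatch,
    /// Every branch missed: one unscoped walk over refs/heads/.
    UnscopedBatch,
}

/// Placeholder value; the real threshold is not given in the description.
const SCOPED_BATCH_MIN_MISSES: usize = 8;

/// Hypothetical selector: picks a strategy from the miss count alone.
fn choose_strategy(total_branches: usize, misses: usize) -> FillStrategy {
    if misses == 0 {
        FillStrategy::AllWarm
    } else if misses == total_branches {
        FillStrategy::UnscopedBatch
    } else if misses >= SCOPED_BATCH_MIN_MISSES {
        FillStrategy::ScopedBatch
    } else {
        FillStrategy::PerBranch
    }
}
```

With 20 branches, one miss maps to `PerBranch`, twelve misses to `ScopedBatch`, and twenty to `UnscopedBatch`.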

Cache-correctness details (Codex-prompted)

A few sharp edges, surfaced by /review-codex, that the seeding path now handles explicitly:

%(ahead-behind) computed against the resolved SHA, not the refname. A tag named main shadows the branch in git's ref resolution; passing base_sha to the batch makes it inert to that. The fallback path (default branch isn't a local/remote-tracking ref) still passes the refname and caches nothing.
Cache key from the batch's %(objectname), not a separately-scanned value. If a refs/heads/X moved between scan_locals and the batch (a sub-ms window, but possible), the counts are for the new SHA — and the cache now stores them under that SHA, not under the staler b.commit_sha.
Orphan branches normalized to (0, 0) before writing. Git's %(ahead-behind:BASE) for an orphan (no common ancestor) prints the two disjoint history sizes; compute_ahead_behind returns (0, 0) and signals orphan-ness separately. An orphan's behind count equals base's total commit count, so one git rev-list --count base_sha on the seeding path detects them. The cache invariant — cache hit equals what a miss would recompute — holds for orphans now.
Bulk-write avoids O(N · dir_size) on the seeding path. write_with_lru scans the cache directory on every call to enforce the LRU bound; the serial setup-scope seeding loop calls it N times. put_ahead_behind_bulk writes all N entries first and sweeps once at the end.
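
Two of the points above (keying by the SHA the batch itself reports, and normalizing orphans) can be sketched together. The sketch assumes each batch output line has the shape `refname objectname ahead behind`; `BatchRow`, `parse_row`, and `cache_entry` are hypothetical names, not the functions in ref_snapshot.rs.

```rust
/// One parsed output row of a batch walk, assuming the format
/// `%(refname) %(objectname) %(ahead-behind:BASE_SHA)`.
#[derive(Debug, PartialEq)]
struct BatchRow {
    refname: String,
    head_sha: String,
    ahead: usize,
    behind: usize,
}

/// Parses `refname sha ahead behind`; returns None on any malformed line.
fn parse_row(line: &str) -> Option<BatchRow> {
    let mut parts = line.split_whitespace();
    let row = BatchRow {
        refname: parts.next()?.to_string(),
        head_sha: parts.next()?.to_string(),
        ahead: parts.next()?.parse().ok()?,
        behind: parts.next()?.parse().ok()?,
    };
    // Reject rows with unexpected trailing fields.
    if parts.next().is_some() {
        return None;
    }
    Some(row)
}

/// Builds the cache entry for a row. The key uses the SHA reported by the
/// batch, not an earlier scan. A branch whose `behind` equals the base's
/// total commit count shares no history with it, so it is stored as (0, 0).
fn cache_entry(
    base_sha: &str,
    base_total_commits: usize,
    row: &BatchRow,
) -> (String, (usize, usize)) {
    let is_orphan = row.behind == base_total_commits;
    let counts = if is_orphan { (0, 0) } else { (row.ahead, row.behind) };
    (format!("{base_sha}-{}", row.head_sha), counts)
}
```

Here `base_total_commits` plays the role of the single `git rev-list --count base_sha` result mentioned above.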

Net effect

| | before | after |
|---|---|---|
| warm `wt list` on a large repo | ~1.5s serial `for-each-ref %(ahead-behind)` in the setup scope | N small cache reads; no graph walk; no serial blocker |
| `wt list` after committing on one branch | full ~1.5s walk again (batch op — any change re-walks all) | only that branch recomputed (two cheap `rev-list --count`, in the pool); the rest stay cached |
| many branches moved since last list | full walk | one `for-each-ref %(ahead-behind)` scoped to just the moved ones |
| cold (fresh repo / after `state clear`) | one combined walk | unchanged — one combined walk + one bulk cache seed |
| skeleton time (`WORKTRUNK_SKELETON_ONLY`) | — | unaffected (returns before the scope this touches) |

No user-facing behavior change — same numbers, same columns. The picker (wt switch) shares this path, so its preview pre-compute benefits identically.

Follow-ups left as TODOs in the code

TODO(ahead-behind-pool) (src/commands/list/collect/mod.rs): the cold-cache %(ahead-behind) walk still runs serially in the setup scope, blocking the task pool from opening. Work-item generation needs only the cheap ref scan, not the counts; the only consumer of the counts is the per-row AheadBehindTask, which has a per-SHA fallback. So the walk could become a single work item in the pool, overlapping the other ~N workers. Needs an inter-task dependency the work-item model doesn't have today.
TODO(remote-ahead-behind-batch) (src/commands/list/collect/tasks.rs): the Remote⇅ column's per-branch ahead_behind_by_sha(upstream, branch) is already cache-backed by this change, but has no cold-start batch primer. One for-each-ref --format='%(refname) %(upstream:track,nobracket)' refs/heads/ would fill the ahead-behind/ cache for it in a single walk, mirroring what %(ahead-behind:main) does for main↕.

Testing

sha_cache.rs: round-trip + cache-read tests for the new kind; test_clear_all_covers_all_kinds extended.
ref_snapshot.rs: all-cold→unscoped-batch + cold→warm second-capture reads the persistent cache; partial-warm (few misses) omits the moved branch; many-misses uses the scoped batch; orphan normalizes to (0,0) in both the snapshot and the cache file; remote-tracking base resolves; unresolvable base degrades cleanly.
Smoke-tested manually: wt list --branches populates .git/wt/cache/ahead-behind/; tampering an entry → the next run shows the tampered counts (proves the read path).
Two passes of /review-codex — second pass clean.
cargo run -- hook pre-merge --yes — full suite + lints + doc-sync (3595 tests).

Out of scope (flagged, not done here)

docs/dev/cache-staleness.md still references the in-memory RepoCache.ahead_behind field and batch_ahead_behind — both removed when RefSnapshot landed. Pre-existing staleness; rewriting that doc is a separate cleanup. (The pre-skeleton git log --no-walk commit-details batch is also content-addressed by SHA and could in principle be cache-first, but it's already O(refs) and fast — not worth a TODO.)

🤖 Generated with Claude Code

…f walk on warm runs

`wt list --branches` populated branch ahead/behind counts via one `git for-each-ref --format='%(ahead-behind:BASE)'` walk in the post-skeleton setup scope, every run. That walk is O(commits) — ~1.5s on rust-lang/rust — and it ran serially before the parallel task pool opened (the per-branch tasks consume its result), so on a large repo ~40% of `wt list`'s wall was a single-threaded git graph walk no amount of parallelism could touch.

Ahead/behind for `(base, branch)` is a pure function of the two commit SHAs — never stale — so this adds an `ahead-behind/` SHA-keyed cache kind (same machinery as `is-ancestor`, `diff-stats`, etc.):

- `Repository::ahead_behind_by_sha` is now cache-backed (check → compute → write).
- `capture_refs_with_ahead_behind` builds the snapshot's ahead/behind map from the cache and reaches for a `for-each-ref %(ahead-behind)` walk only for branches the cache doesn't cover:
  - a few misses → left out of the snapshot map; their per-branch `AheadBehindTask` recomputes (and caches) by SHA in the parallel pool (the merge-base it needs is already primed by that task's orphan check, so it's two cheap `rev-list --count` calls);
  - everything cold (fresh repo / after `wt config state clear`) → one unscoped `for-each-ref %(ahead-behind:BASE_SHA) refs/heads/` walk — a single shared traversal finds every merge-base — then the results are written to the cache;
  - many cold but not all → one `for-each-ref %(ahead-behind:BASE_SHA)` scoped to just the missed refnames (same shared-traversal win, only over the cold subset), results written to the cache.

The walk computes against `base`'s resolved SHA (not the refname, which git could resolve to a different commit if a tag shadows the branch) and keys the cache by the *object SHA the batch reports for each branch* (via `%(objectname)`) — so a ref that moved between the initial scan and the batch can't poison entries either.

Orphan branches (no common ancestor with base) are normalized to `(0, 0)` to match `compute_ahead_behind` / a cache miss; an orphan's `behind` count equals base's total commit count, detected with one extra `rev-list --count` on the seeding path. Bulk seeding goes through `put_ahead_behind_bulk` — one `sweep_lru` at the end, not one per entry — so the serial setup scope avoids the O(N · dir_size) per-write directory scan that `write_with_lru` does.

Net: warm `wt list` on a large repo no longer pays the serial ~1.5s walk — it's N small cache reads, and the only graph work left runs inside the parallel pool. A `wt list` right after committing on a branch recomputes only that one branch's counts, not the whole batch. Cold runs are unchanged (one combined walk, then cached). Skeleton-time is unaffected (`WORKTRUNK_SKELETON_ONLY` returns before the scope this touches).

Two TODOs left for follow-ups in the same area:

- `TODO(ahead-behind-pool)` (`collect/mod.rs`): move the cold-cache `%(ahead-behind)` walk off the serial setup scope and into the task pool as a work item, so it overlaps the other workers — needs an inter-task dependency the work-item model lacks today.
- `TODO(remote-ahead-behind-batch)` (`collect/list/tasks.rs`): the `Remote⇅` column's per-branch `ahead_behind_by_sha(upstream, branch)` is already cache-backed by this change, but lacks a cold-start batch primer; one `for-each-ref %(upstream:track,nobracket)` could fill the cache for it the way `%(ahead-behind:main)` does for `main↕`.

Docs: the `collect/mod.rs` cache-architecture docstring moves `AheadBehind` out of "already optimized (not a cache candidate)" into the cached-tasks table; `ref_snapshot.rs` and `sha_cache.rs` docstrings updated; the FAQ's "what files does Worktrunk create?" cache-kinds list gains ahead/behind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The doc proposed and analyzed six options for closing the ref-keyed cache staleness class; the codebase adopted Option 5 (`RefSnapshot`) in #2528 and extended it with the content-addressed `ahead-behind/` SHA cache in #2704. The structural contract now lives in `src/git/repository/ref_snapshot.rs` (snapshot semantics, lifetime, construction) and the cache-kind landscape in `src/commands/list/collect/mod.rs`'s Caching section, both of which supersede what was in this file. What remained in the doc actively misleads: its `RepoCache` inventory table cites fields that no longer exist (`commit_shas`, `tree_shas`, `effective_integration_targets`, `integration_reasons`, `ahead_behind`, `head_shas`) and references a `batch_ahead_behind` helper that's been replaced. The file was an island — no inbound links from code, tests, or other docs, and `docs/dev/` is not a Zola content directory so it was never published — so deletion is clean. Anyone curious about the original design debate can find it at the merge commit for #2528. Co-authored-by: Claude <noreply@anthropic.com>

## Summary

- Adds `picker_preview` benchmark group measuring "process spawn → all preview tasks drained" for `wt switch`'s interactive picker.
- Introduces `WORKTRUNK_PREVIEW_BENCH=1`, an early-exit gate inside `handle_picker` that runs the full prelude (collect, speculative spawn, skeleton, initial + deferred precompute, `orchestrator.wait_for_idle()`) and returns before skim launches or any JSON / stderr I/O. Shares the dry-run path; behavior with the env var unset is unchanged.
- Closes the coverage gap behind #2662 / #2683 / #2685 / #2704, which were tuned against `wt list` as a proxy because no direct picker measurement existed.

## Why this measurement

Picker submits one preview-compute task per row to the global rayon pool. The user-visible quantity to optimize is the responsiveness window between picker launch and "all previews ready" (j/k navigation hits cached content). Option 1 from the task — headless wall clock to drain — is the cleanest measurable proxy and avoids the PTY route, which hits the documented nextest/SIGTTOU pain on `shell-integration-tests`. PTY-driven first-interactive-ready can be a follow-up.

## Variants

- `picker_preview/warm/typical-8`
- `picker_preview/cold/typical-8`

Cold uses `BatchSize::PerIteration` (not `SmallInput`): `SmallInput` calls `setup` for an entire batch up front and then runs timed routines back-to-back, so only the first iter in each batch is genuinely cold — the rest hit a freshly populated `.git/wt/cache/`. `PerIteration` invalidates immediately before every measured iteration; setup is far cheaper than `wt switch`, so per-iter `Instant::now` doesn't dominate.

`sample_size(10)` + `measurement_time(35s)` per #2685's lead — slow benches don't benefit from the default 30 samples.

`cfg(unix)`-gated with a no-op `main` on Windows; the picker is Unix-only and `wt switch` (no args) hits the unavailable path before the env var is consulted.

## Sample run

```
picker_preview/warm/typical-8 time: [185.62 ms 191.72 ms 200.77 ms]
picker_preview/cold/typical-8 time: [209.34 ms 226.23 ms 239.29 ms]
```

## Test plan

- [x] `cargo bench --bench picker_preview` runs cleanly on both variants
- [x] `cargo run -- hook pre-merge --yes` — 3667 tests pass
- [x] New `test_picker_preview_bench_produces_no_output` asserts `WORKTRUNK_PREVIEW_BENCH=1` keeps stdout/stderr empty (covers the env-gated branch, locks the no-I/O contract)
- [x] Smoke test: `wt switch` with `WORKTRUNK_PREVIEW_BENCH` unset still hits the TTY error path (user-visible behavior unchanged)
- [x] Smoke test: `WORKTRUNK_PICKER_DRY_RUN=1` still emits the cache JSON dump (regression check)
- [x] `/review-codex` pass clean after iterating on three findings (packed-refs fix already on `main` via #2697 once branch was rebased; `BatchSize::PerIteration` for true per-iter invalidation; `cfg(unix)` gate for Windows)

> _This was written by Claude Code on behalf of Maximilian Roos_

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
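
The reasoning in the Variants section about `BatchSize::PerIteration` versus `SmallInput` can be illustrated with a small std-only simulation. It does not use Criterion; it only models the behavior the description attributes to each mode (all setups up front versus setup immediately before each routine). `FakeCache`, `invalidate`, and `run_once` are hypothetical stand-ins for `.git/wt/cache/`, the cache invalidator, and one `wt switch` run.

```rust
/// Stand-in for `.git/wt/cache/`: `warm` is true once a run has populated it.
struct FakeCache {
    warm: bool,
}

/// "setup": wipes the cache so the next run starts cold.
fn invalidate(cache: &mut FakeCache) {
    cache.warm = false;
}

/// "routine": reports whether this run started cold, then leaves the cache warm.
fn run_once(cache: &mut FakeCache) -> bool {
    let was_cold = !cache.warm;
    cache.warm = true;
    was_cold
}

/// Batched model: every setup runs first, then the routines run back to back.
fn cold_runs_batched(iters: usize) -> usize {
    let mut cache = FakeCache { warm: true };
    for _ in 0..iters {
        invalidate(&mut cache);
    }
    let mut cold = 0;
    for _ in 0..iters {
        if run_once(&mut cache) {
            cold += 1;
        }
    }
    cold
}

/// Per-iteration model: setup runs immediately before each routine.
fn cold_runs_per_iteration(iters: usize) -> usize {
    let mut cache = FakeCache { warm: true };
    let mut cold = 0;
    for _ in 0..iters {
        invalidate(&mut cache);
        if run_once(&mut cache) {
            cold += 1;
        }
    }
    cold
}
```

In this model a batch of ten iterations yields one cold run in the batched mode and ten in the per-iteration mode, which is the gap the cold variant is meant to close.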

…oup (#2718)

## Summary

- Mirrors the `main↕` cold-start primer from #2704 for the `Remote⇅` column. `Repository::prime_upstream_ahead_behind_cache` runs one `for-each-ref %(ahead-behind:UPSTREAM_SHA)` walk per unique upstream SHA (scoped to that upstream's branches) and bulk-writes the `ahead-behind/` cache. Per-row `UpstreamTask::compute` reads the cache via `ahead_behind_by_sha`.
- Nested inside the snapshot spawn in the post-skeleton `rayon::scope` so it reads the snapshot's already-scanned local/remote inventories — no extra `for-each-ref refs/remotes/` fork. Candidates scoped to branches that actually render an `UpstreamTask` row (worktree-attached when `!show_branches`). Primer honors `list.task-timeout-ms` via `set_command_timeout`.
- Cache-correctness defenses match #2704: `%(ahead-behind:UPSTREAM_SHA)` (not `:refname`) so git counts against the SHA the cache will be keyed by; cache key uses the `%(objectname)` git reports in the same walk; writes flow through `put_ahead_behind_bulk` (one `sweep_lru` at end); orphan branches normalize to `(0, 0)` via per-upstream memoized `rev-list --count`.

### Known limitation

When every branch tracks its own remote (distinct upstream SHA per branch), every group has `refs.len() == 1` and the primer skips all of them — those fall through to the per-row parallel path. A `%(upstream:track)` walk could batch them, but git computes that atom against the upstream's current value at walk time, which we can't pin to the SHA the cache will be keyed by; the resulting race would break the cache invariant `compute_ahead_behind` establishes on a miss. Documented in the primer's docstring.

### Before/after on a 10-branch shared-upstream fixture

Cold `wt list --branches` trace (`RUST_LOG=worktrunk=debug`):

- `for-each-ref --format=%(refname) %(objectname) %(ahead-behind:<main_sha>) refs/heads/` — snapshot's `main↕` walk (unscoped)
- `for-each-ref --format=%(refname) %(objectname) %(ahead-behind:<feat-base_sha>) refs/heads/feat-base refs/heads/work1 ... refs/heads/work10` — primer's scoped Remote⇅ walk
- 23 cache entries written (12 keyed by main, 11 keyed by feat-base)

Warm run: `grep ahead-behind:` returns 0 — both walks skipped, cache hits everywhere.

## Test plan

- [x] 9 new primer tests in `ref_snapshot.rs` mirror the `capture_ahead_behind` patterns (cold→warm tampered-cache survival, no-upstream / `[gone]` skips, orphan normalization, sub-threshold skip, two-upstream-group routing, equal-branch `(0,0)`, local-upstream `branch.X.remote = .` resolution)
- [x] `cargo run -- hook pre-merge --yes` — 3664 tests pass
- [x] `/review-codex` iterated to convergence on cache correctness, candidate scoping, redundant remote-scan, and timeout coverage

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
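
The per-upstream grouping described in the #2718 summary, including the single-branch groups that the primer skips, might look roughly like the sketch below. `primer_groups` and `min_group_size` are hypothetical; the description only says that groups with `refs.len() == 1` are skipped, so the exact threshold used by `prime_upstream_ahead_behind_cache` is an assumption here.

```rust
use std::collections::BTreeMap;

/// Groups `(branch refname, upstream SHA)` pairs by upstream SHA and keeps
/// only the groups large enough to justify one shared batch walk.
/// `min_group_size` is a hypothetical parameter, not the real threshold.
fn primer_groups<'a>(
    branches: &[(&'a str, &'a str)],
    min_group_size: usize,
) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut groups: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(refname, upstream_sha) in branches {
        groups.entry(upstream_sha).or_default().push(refname);
    }
    // Small groups fall through to the per-row parallel path instead.
    groups.retain(|_, refs| refs.len() >= min_group_size);
    groups
}
```

Each surviving group would correspond to one `for-each-ref %(ahead-behind:UPSTREAM_SHA)` call scoped to that group's refnames. When every branch has a distinct upstream SHA, the returned map is empty, which matches the known limitation above.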

Release v0.50.0. Highlights:

- Experimental Azure DevOps support (#1256, thanks @mikeyroush; fixes #1144 from @dlecan) — `wt switch pr:<N>`, `wt list --full`, and `wt config show --full` recognize Azure DevOps via the `az` CLI.
- Experimental Gitea CI-status detection (#2702) on top of the Gitea `pr:` shortcut (#1320, thanks @SjB).
- Hooks now resolve `.config/wt.toml` from the worktree they act on — the primary-worktree fallback is gone, and `post-remove` reads the removed worktree's config (snapshotted before removal). Approval prompts collect hook commands from the same worktree. Breaking change for setups that relied on the primary-worktree fallback; the changelog entry has the recovery action. (#2690, #2703, #2714, #2717, #2701, #2708, #2727, #2736, #2748)
- The `wt switch` picker's `alt-r` removal no longer runs unapproved project hooks (#2746) — the picker's removal path is now routed through `handle_remove_output` and consults the existing approval state read-only.
- `wt config alias show` with no name lists every alias's full definition (#2684, #2691); `wt --help` switches to a compact aliases pointer (#2688).
- `wt list --branches` warm-run perf: SHA-keyed cache for `main↕` and `Remote⇅` ahead/behind counts; shared push-remote URL and local-branch scan (#2704, #2718, #2673).
- Claude Code plugin ships the `wt-switch-create` skill (#2737, thanks @onetom for #2631).

See `CHANGELOG.md` for the full list (8 Improved, 5 Fixed, 5 Internal).

semver-checks reports breaking library-API changes (new enum variants without `#[non_exhaustive]`, removed `Branch::github_push_url`, new trait method on `RemoteRefProvider`), which mandates at minimum a minor bump pre-1.0.

worktrunk-bot approved these changes May 11, 2026

max-sixty merged commit 0b088b5 into main May 11, 2026
34 checks passed

max-sixty deleted the picker-thread-count-bench branch May 11, 2026 09:27

max-sixty mentioned this pull request May 11, 2026

docs: drop stale docs/dev/cache-staleness.md #2710

Merged

This was referenced May 11, 2026

perf(list): cold-start primer for the Remote⇅ column, per-upstream-group #2718

Merged

bench: measure wt switch picker preview pre-compute workload #2721

Merged

max-sixty mentioned this pull request May 13, 2026

Release v0.50.0 #2750

Merged

BrewTestBot mentioned this pull request May 13, 2026

worktrunk 0.50.0 Homebrew/homebrew-core#282437

Merged

max-sixty merged 1 commit into main from picker-thread-count-bench
