perf(list): cache ahead/behind counts SHA-keyed; skip the for-each-ref walk on warm runs #2704
Merged
`wt list --branches` populated branch ahead/behind counts via one
`git for-each-ref --format='%(ahead-behind:BASE)'` walk in the
post-skeleton setup scope, every run. That walk is O(commits) — ~1.5s on
rust-lang/rust — and it ran serially before the parallel task pool opened
(the per-branch tasks consume its result), so on a large repo ~40% of
`wt list`'s wall time was a single-threaded git graph walk no amount of
parallelism could touch.
Ahead/behind for `(base, branch)` is a pure function of the two commit
SHAs — never stale — so this adds an `ahead-behind/` SHA-keyed cache kind
(same machinery as `is-ancestor`, `diff-stats`, etc.):
- `Repository::ahead_behind_by_sha` is now cache-backed (check → compute
  → write).
- `capture_refs_with_ahead_behind` builds the snapshot's ahead/behind
  map from the cache and reaches for a `for-each-ref %(ahead-behind)`
  walk only for branches the cache doesn't cover:
  - a few misses → left out of the snapshot map; their per-branch
    `AheadBehindTask` recomputes (and caches) by SHA in the parallel
    pool (the merge-base it needs is already primed by that task's
    orphan check, so it's two cheap `rev-list --count` calls);
  - everything cold (fresh repo / after `wt config state clear`) → one
    unscoped `for-each-ref %(ahead-behind:BASE_SHA) refs/heads/` walk
    (a single shared traversal finds every merge-base), then the
    results are written to the cache;
  - many cold but not all → one `for-each-ref %(ahead-behind:BASE_SHA)`
    scoped to just the missed refnames (same shared-traversal win, only
    over the cold subset), results written to the cache.
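The three-way choice above can be sketched as a small decision function. This is an illustration only, not the code from this PR: the `Population` enum, the `plan_population` function, the order of the checks, and the threshold value `4` are all assumptions (the PR names the threshold `AHEAD_BEHIND_SCOPED_BATCH_MIN_MISSES` but does not state its value).

```rust
/// Hypothetical stand-in for `AHEAD_BEHIND_SCOPED_BATCH_MIN_MISSES`.
/// The real value is not stated in this PR; 4 is an assumption.
const SCOPED_BATCH_MIN_MISSES: usize = 4;

/// How the snapshot's ahead/behind map gets populated on this run.
#[derive(Debug, PartialEq, Eq)]
enum Population {
    /// Every branch was a cache hit: no graph walk at all.
    AllCached,
    /// A few misses: leave them out of the snapshot map and let the
    /// per-branch task recompute them in the parallel pool.
    PerBranchFallback(Vec<String>),
    /// Nothing cached: one unscoped walk over `refs/heads/`.
    UnscopedBatch,
    /// Many (but not all) misses: one walk scoped to the missed refnames.
    ScopedBatch(Vec<String>),
}

/// Pick a population strategy from the number of branches and the
/// refnames that missed the cache.
fn plan_population(total_branches: usize, misses: Vec<String>) -> Population {
    if misses.is_empty() {
        Population::AllCached
    } else if misses.len() == total_branches {
        Population::UnscopedBatch
    } else if misses.len() >= SCOPED_BATCH_MIN_MISSES {
        Population::ScopedBatch(misses)
    } else {
        Population::PerBranchFallback(misses)
    }
}
```

Under these assumptions, one miss out of ten branches falls through to the per-branch path, while five misses out of ten trigger a single scoped batch.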
The walk computes against `base`'s resolved SHA (not the refname, which
git could resolve to a different commit if a tag shadows the branch) and
keys the cache by the *object SHA the batch reports for each branch*
(via `%(objectname)`) — so a ref that moved between the initial scan
and the batch can't poison entries either. Orphan branches (no common
ancestor with base) are normalized to `(0, 0)` to match
`compute_ahead_behind` / a cache miss; an orphan's `behind` count equals
base's total commit count, detected with one extra `rev-list --count`
on the seeding path. Bulk seeding goes through `put_ahead_behind_bulk`
— one `sweep_lru` at the end, not one per entry — so the serial setup
scope avoids the O(N · dir_size) per-write directory scan that
`write_with_lru` does.
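A minimal sketch of the seeding-path rules in this paragraph, assuming each batch line has the shape `%(refname) %(objectname) %(ahead-behind:BASE_SHA)`, where the last atom prints two space-separated integers. The names `BatchEntry`, `parse_batch_line`, `normalize_orphan`, and `cache_rel_path` are hypothetical; only the `ahead-behind/{base_sha}-{head_sha}.json` layout comes from the PR description.

```rust
/// One parsed line of batch output (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct BatchEntry {
    refname: String,
    /// The SHA git reported in the same walk; this is what the cache is
    /// keyed by, not a value scanned earlier.
    head_sha: String,
    ahead: u64,
    behind: u64,
}

/// Parse "refname objectname ahead behind"; returns None on a short or
/// malformed line.
fn parse_batch_line(line: &str) -> Option<BatchEntry> {
    let mut parts = line.split_whitespace();
    let refname = parts.next()?.to_string();
    let head_sha = parts.next()?.to_string();
    let ahead = parts.next()?.parse().ok()?;
    let behind = parts.next()?.parse().ok()?;
    Some(BatchEntry { refname, head_sha, ahead, behind })
}

/// An orphan's `behind` equals base's total commit count (a branch that
/// shares any commit with base is behind by strictly less), so such
/// counts are rewritten to (0, 0) before they are cached.
fn normalize_orphan(counts: (u64, u64), base_total_commits: u64) -> (u64, u64) {
    if counts.1 == base_total_commits { (0, 0) } else { counts }
}

/// Relative cache path for a (base, head) pair.
fn cache_rel_path(base_sha: &str, head_sha: &str) -> String {
    format!("ahead-behind/{base_sha}-{head_sha}.json")
}
```

Here `base_total_commits` stands for the result of the one extra `rev-list --count` on the base SHA that the paragraph mentions.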
Net: warm `wt list` on a large repo no longer pays the serial ~1.5s walk
— it's N small cache reads, and the only graph work left runs inside the
parallel pool. A `wt list` right after committing on a branch recomputes
only that one branch's counts, not the whole batch. Cold runs are
unchanged (one combined walk, then cached). Skeleton-time is unaffected
(`WORKTRUNK_SKELETON_ONLY` returns before the scope this touches).
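Why a commit on one branch invalidates only that branch's entry follows from the key being the SHA pair itself. The sketch below uses an in-memory map as a stand-in for the on-disk cache; `AheadBehindCache`, `get_or_compute`, and the `computes` counter are hypothetical names for illustration, not the API in `sha_cache.rs`.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the `ahead-behind/` cache kind.
struct AheadBehindCache {
    entries: HashMap<(String, String), (u64, u64)>,
    /// Number of times the expensive computation actually ran.
    computes: usize,
}

impl AheadBehindCache {
    fn new() -> Self {
        Self { entries: HashMap::new(), computes: 0 }
    }

    /// check → compute → write, keyed by (base SHA, head SHA).
    fn get_or_compute<F>(&mut self, base_sha: &str, head_sha: &str, compute: F) -> (u64, u64)
    where
        F: FnOnce() -> (u64, u64),
    {
        let key = (base_sha.to_string(), head_sha.to_string());
        if let Some(&hit) = self.entries.get(&key) {
            return hit;
        }
        self.computes += 1;
        let counts = compute();
        self.entries.insert(key, counts);
        counts
    }
}
```

A branch whose head moved simply produces a new key, so it misses once and is recomputed, while every other `(base, head)` pair keeps hitting its existing entry. No invalidation step is needed because an entry for a given SHA pair can never become wrong.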
Two TODOs left for follow-ups in the same area:
- `TODO(ahead-behind-pool)` (`collect/mod.rs`): move the cold-cache
  `%(ahead-behind)` walk off the serial setup scope and into the task
  pool as a work item, so it overlaps the other workers. This needs an
  inter-task dependency the work-item model lacks today.
- `TODO(remote-ahead-behind-batch)` (`collect/tasks.rs`): the
  `Remote⇅` column's per-branch `ahead_behind_by_sha(upstream, branch)`
  is already cache-backed by this change, but lacks a cold-start batch
  primer; one `for-each-ref %(upstream:track,nobracket)` could fill the
  cache for it the way `%(ahead-behind:main)` does for `main↕`.
Docs: the `collect/mod.rs` cache-architecture docstring moves `AheadBehind`
out of "already optimized (not a cache candidate)" into the cached-tasks
table; `ref_snapshot.rs` and `sha_cache.rs` docstrings updated; the FAQ's
"what files does Worktrunk create?" cache-kinds list gains ahead/behind.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
worktrunk-bot
approved these changes
May 11, 2026
max-sixty
added a commit
that referenced
this pull request
May 11, 2026
The doc proposed and analyzed six options for closing the ref-keyed cache staleness class; the codebase adopted Option 5 (`RefSnapshot`) in #2528 and extended it with the content-addressed `ahead-behind/` SHA cache in #2704. The structural contract now lives in `src/git/repository/ref_snapshot.rs` (snapshot semantics, lifetime, construction) and the cache-kind landscape in `src/commands/list/collect/mod.rs`'s Caching section, both of which supersede what was in this file. What remained in the doc actively misleads: its `RepoCache` inventory table cites fields that no longer exist (`commit_shas`, `tree_shas`, `effective_integration_targets`, `integration_reasons`, `ahead_behind`, `head_shas`) and references a `batch_ahead_behind` helper that's been replaced. The file was an island — no inbound links from code, tests, or other docs, and `docs/dev/` is not a Zola content directory so it was never published — so deletion is clean. Anyone curious about the original design debate can find it at the merge commit for #2528. Co-authored-by: Claude <noreply@anthropic.com>
This was referenced May 11, 2026
max-sixty
added a commit
that referenced
this pull request
May 11, 2026
## Summary

- Adds `picker_preview` benchmark group measuring "process spawn → all preview tasks drained" for `wt switch`'s interactive picker.
- Introduces `WORKTRUNK_PREVIEW_BENCH=1`, an early-exit gate inside `handle_picker` that runs the full prelude (collect, speculative spawn, skeleton, initial + deferred precompute, `orchestrator.wait_for_idle()`) and returns before skim launches or any JSON / stderr I/O. Shares the dry-run path; behavior with the env var unset is unchanged.
- Closes the coverage gap behind #2662 / #2683 / #2685 / #2704, which were tuned against `wt list` as a proxy because no direct picker measurement existed.

## Why this measurement

Picker submits one preview-compute task per row to the global rayon pool. The user-visible quantity to optimize is the responsiveness window between picker launch and "all previews ready" (j/k navigation hits cached content). Option 1 from the task (headless wall clock to drain) is the cleanest measurable proxy and avoids the PTY route, which hits the documented nextest/SIGTTOU pain on `shell-integration-tests`. PTY-driven first-interactive-ready can be a follow-up.

## Variants

- `picker_preview/warm/typical-8`
- `picker_preview/cold/typical-8`

Cold uses `BatchSize::PerIteration` (not `SmallInput`): `SmallInput` calls `setup` for an entire batch up front and then runs timed routines back-to-back, so only the first iter in each batch is genuinely cold; the rest hit a freshly populated `.git/wt/cache/`. `PerIteration` invalidates immediately before every measured iteration; setup is far cheaper than `wt switch`, so per-iter `Instant::now` doesn't dominate.

`sample_size(10)` + `measurement_time(35s)` per #2685's lead: slow benches don't benefit from the default 30 samples.

`cfg(unix)`-gated with a no-op `main` on Windows; the picker is Unix-only and `wt switch` (no args) hits the unavailable path before the env var is consulted.

## Sample run

```
picker_preview/warm/typical-8  time: [185.62 ms 191.72 ms 200.77 ms]
picker_preview/cold/typical-8  time: [209.34 ms 226.23 ms 239.29 ms]
```

## Test plan

- [x] `cargo bench --bench picker_preview` runs cleanly on both variants
- [x] `cargo run -- hook pre-merge --yes`: 3667 tests pass
- [x] New `test_picker_preview_bench_produces_no_output` asserts `WORKTRUNK_PREVIEW_BENCH=1` keeps stdout/stderr empty (covers the env-gated branch, locks the no-I/O contract)
- [x] Smoke test: `wt switch` with `WORKTRUNK_PREVIEW_BENCH` unset still hits the TTY error path (user-visible behavior unchanged)
- [x] Smoke test: `WORKTRUNK_PICKER_DRY_RUN=1` still emits the cache JSON dump (regression check)
- [x] `/review-codex` pass clean after iterating on three findings (packed-refs fix already on `main` via #2697 once branch was rebased; `BatchSize::PerIteration` for true per-iter invalidation; `cfg(unix)` gate for Windows)

> _This was written by Claude Code on behalf of Maximilian Roos_

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
max-sixty
added a commit
that referenced
this pull request
May 12, 2026
…oup (#2718)

## Summary

- Mirrors the `main↕` cold-start primer from #2704 for the `Remote⇅` column. `Repository::prime_upstream_ahead_behind_cache` runs one `for-each-ref %(ahead-behind:UPSTREAM_SHA)` walk per unique upstream SHA (scoped to that upstream's branches) and bulk-writes the `ahead-behind/` cache. Per-row `UpstreamTask::compute` reads the cache via `ahead_behind_by_sha`.
- Nested inside the snapshot spawn in the post-skeleton `rayon::scope` so it reads the snapshot's already-scanned local/remote inventories (no extra `for-each-ref refs/remotes/` fork). Candidates scoped to branches that actually render an `UpstreamTask` row (worktree-attached when `!show_branches`). Primer honors `list.task-timeout-ms` via `set_command_timeout`.
- Cache-correctness defenses match #2704: `%(ahead-behind:UPSTREAM_SHA)` (not `:refname`) so git counts against the SHA the cache will be keyed by; cache key uses the `%(objectname)` git reports in the same walk; writes flow through `put_ahead_behind_bulk` (one `sweep_lru` at end); orphan branches normalize to `(0, 0)` via per-upstream memoized `rev-list --count`.

### Known limitation

When every branch tracks its own remote (distinct upstream SHA per branch), every group has `refs.len() == 1` and the primer skips all of them; those fall through to the per-row parallel path. A `%(upstream:track)` walk could batch them, but git computes that atom against the upstream's current value at walk time, which we can't pin to the SHA the cache will be keyed by; the resulting race would break the cache invariant `compute_ahead_behind` establishes on a miss. Documented in the primer's docstring.

### Before/after on a 10-branch shared-upstream fixture

Cold `wt list --branches` trace (`RUST_LOG=worktrunk=debug`):

- `for-each-ref --format=%(refname) %(objectname) %(ahead-behind:<main_sha>) refs/heads/`: snapshot's `main↕` walk (unscoped)
- `for-each-ref --format=%(refname) %(objectname) %(ahead-behind:<feat-base_sha>) refs/heads/feat-base refs/heads/work1 ... refs/heads/work10`: primer's scoped `Remote⇅` walk
- 23 cache entries written (12 keyed by main, 11 keyed by feat-base)

Warm run: `grep ahead-behind:` returns 0. Both walks are skipped, with cache hits everywhere.

## Test plan

- [x] 9 new primer tests in `ref_snapshot.rs` mirror the `capture_ahead_behind` patterns (cold→warm tampered-cache survival, no-upstream / `[gone]` skips, orphan normalization, sub-threshold skip, two-upstream-group routing, equal-branch `(0,0)`, local-upstream `branch.X.remote = .` resolution)
- [x] `cargo run -- hook pre-merge --yes`: 3664 tests pass
- [x] `/review-codex` iterated to convergence on cache correctness, candidate scoping, redundant remote-scan, and timeout coverage

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
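The grouping behind "one walk per unique upstream SHA" and the singleton skip described under the known limitation can be sketched as follows. This is an assumption-laden illustration: `split_upstream_groups` is a hypothetical name, and the cut-off of two branches per group is inferred from the `refs.len() == 1` remark rather than taken from the primer's code.

```rust
use std::collections::BTreeMap;

/// Split (branch refname, upstream SHA) pairs into groups worth one
/// scoped batch walk each, plus the leftovers that fall through to the
/// per-row parallel path.
fn split_upstream_groups(
    branches: &[(&str, &str)],
) -> (BTreeMap<String, Vec<String>>, Vec<String>) {
    // Group branch refnames under the upstream SHA they track.
    let mut by_upstream: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (refname, upstream_sha) in branches {
        by_upstream
            .entry(upstream_sha.to_string())
            .or_default()
            .push(refname.to_string());
    }

    // A group with a single branch gains nothing from a shared
    // traversal, so it is removed from the map and handled per row.
    let mut per_row = Vec::new();
    by_upstream.retain(|_, refs| {
        if refs.len() < 2 {
            per_row.append(refs);
            false
        } else {
            true
        }
    });
    (by_upstream, per_row)
}
```

With ten branches sharing one upstream, as in the fixture above, this yields a single group and therefore a single scoped walk; with a distinct upstream per branch, every branch lands in the per-row list.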
Merged
max-sixty
added a commit
that referenced
this pull request
May 13, 2026
Release v0.50.0. Highlights:

- Experimental Azure DevOps support (#1256, thanks @mikeyroush; fixes #1144 from @dlecan): `wt switch pr:<N>`, `wt list --full`, and `wt config show --full` recognize Azure DevOps via the `az` CLI.
- Experimental Gitea CI-status detection (#2702) on top of the Gitea `pr:` shortcut (#1320, thanks @SjB).
- Hooks now resolve `.config/wt.toml` from the worktree they act on; the primary-worktree fallback is gone, and `post-remove` reads the removed worktree's config (snapshotted before removal). Approval prompts collect hook commands from the same worktree. Breaking change for setups that relied on the primary-worktree fallback; the changelog entry has the recovery action. (#2690, #2703, #2714, #2717, #2701, #2708, #2727, #2736, #2748)
- The `wt switch` picker's `alt-r` removal no longer runs unapproved project hooks (#2746): the picker's removal path is now routed through `handle_remove_output` and consults the existing approval state read-only.
- `wt config alias show` with no name lists every alias's full definition (#2684, #2691); `wt --help` switches to a compact aliases pointer (#2688).
- `wt list --branches` warm-run perf: SHA-keyed cache for `main↕` and `Remote⇅` ahead/behind counts; shared push-remote URL and local-branch scan (#2704, #2718, #2673).
- Claude Code plugin ships the `wt-switch-create` skill (#2737, thanks @onetom for #2631).

See `CHANGELOG.md` for the full list (8 Improved, 5 Fixed, 5 Internal). semver-checks reports breaking library-API changes (new enum variants without `#[non_exhaustive]`, removed `Branch::github_push_url`, new trait method on `RemoteRefProvider`), which mandates at minimum a minor bump pre-1.0.
Summary
`wt list --branches` computed branch ahead/behind counts with one `git for-each-ref --format='%(ahead-behind:BASE)'` walk, on every invocation, in the post-skeleton setup scope. That walk is O(commits) (~1.5s on rust-lang/rust), and it runs serially before the parallel task pool opens, because the per-branch tasks consume its result. So on a large real repo roughly 40% of `wt list`'s wall time was a single-threaded git graph walk that thread count couldn't help with (a follow-up from the picker thread-count investigation in #2683/#2685).

Ahead/behind for `(base, branch)` is a pure function of the two commit SHAs (content-addressed, never stale), exactly the shape of the existing `.git/wt/cache/` SHA-keyed caches. This adds an `ahead-behind/` cache kind and rewires the population path:

- `ahead-behind/{base_sha}-{head_sha}.json` kind in `sha_cache.rs`, identical machinery to `is-ancestor` / `diff-stats` (LRU-bounded, cleared by `wt config state clear`, wiped by the bench cache-invalidator).
- `Repository::ahead_behind_by_sha` is now cache-backed: check → `compute_ahead_behind` → write.
- `RefSnapshot::capture_refs_with_ahead_behind` builds the snapshot's ahead/behind map from the cache and only walks the graph for the branches the cache doesn't cover:
  - A few misses → left out of the snapshot map; the per-branch `AheadBehindTask` recomputes (and caches) them by SHA in the parallel pool. (That task already runs `merge_base_by_sha` for its orphan check, so the merge-base `compute_ahead_behind` needs is already primed; the fallback is just two cheap `rev-list --count` calls.)
  - Everything cold (fresh repo / after `wt config state clear`) → one unscoped `for-each-ref %(ahead-behind:BASE_SHA) refs/heads/` walk (a single shared traversal finds every merge-base), then results written to the cache.
  - Many cold but not all → one `for-each-ref %(ahead-behind:BASE_SHA)` scoped to just the missed refnames (`git for-each-ref` takes a ref list, so it's still one shared traversal, only over the cold subset), results written to the cache.

The few-vs-many threshold (`AHEAD_BEHIND_SCOPED_BATCH_MIN_MISSES`) trades the per-pair path's parallelism (it runs in the pool) against the batch's lower total work (one shared base-history traversal); a batch here is serial, so it's only worth it once "many" misses make the amortization clear.

Cache-correctness details (Codex-prompted)

A few sharp edges, surfaced by `/review-codex`, that the seeding path now handles explicitly:

- `%(ahead-behind)` computed against the resolved SHA, not the refname. A tag named `main` shadows the branch in git's ref resolution; passing `base_sha` to the batch makes it inert to that. The fallback path (default branch isn't a local/remote-tracking ref) still passes the refname and caches nothing.
- Cache keyed by the SHA the batch reports via `%(objectname)`, not a separately-scanned value. If a `refs/heads/X` moved between `scan_locals` and the batch (a sub-ms window, but possible), the counts are for the new SHA, and the cache now stores them under that SHA, not under the stale `rb.commit_sha`.
- Orphans normalized to `(0, 0)` before writing. Git's `%(ahead-behind:BASE)` for an orphan (no common ancestor) prints the two disjoint history sizes; `compute_ahead_behind` returns `(0, 0)` and signals orphan-ness separately. An orphan's `behind` count equals base's total commit count, so one `git rev-list --count base_sha` on the seeding path detects them. The cache invariant (a cache hit equals what a miss would recompute) holds for orphans now.
- `write_with_lru` scans the cache directory on every call to enforce the LRU bound; the serial setup-scope seeding loop calls it N times. `put_ahead_behind_bulk` writes all N entries first and sweeps once at the end.

Net effect

- Warm `wt list` on a large repo: no longer pays the serial `for-each-ref %(ahead-behind)` walk in the setup scope.
- `wt list` after committing on one branch: only that branch is recomputed (`rev-list --count`, in the pool); the rest stay cached.
- Many branches moved: one `for-each-ref %(ahead-behind)` scoped to just the moved ones.
- Cold (fresh repo / after `state clear`): unchanged.
- Skeleton (`WORKTRUNK_SKELETON_ONLY`): unaffected.

No user-facing behavior change: same numbers, same columns. The picker (`wt switch`) shares this path, so its preview pre-compute benefits identically.

Follow-ups left as TODOs in the code

- `TODO(ahead-behind-pool)` (`src/commands/list/collect/mod.rs`): the cold-cache `%(ahead-behind)` walk still runs serially in the setup scope, blocking the task pool from opening. Nothing downstream of work-item generation needs the counts (only the per-row `AheadBehindTask`, which has a per-SHA fallback); only the cheap ref scan does. So the walk could become a single work item in the pool, overlapping the other ~N workers. Needs an inter-task dependency the work-item model doesn't have today.
- `TODO(remote-ahead-behind-batch)` (`src/commands/list/collect/tasks.rs`): the `Remote⇅` column's per-branch `ahead_behind_by_sha(upstream, branch)` is already cache-backed by this change, but has no cold-start batch primer. One `for-each-ref --format='%(refname) %(upstream:track,nobracket)' refs/heads/` would fill the `ahead-behind/` cache for it in a single walk, mirroring what `%(ahead-behind:main)` does for `main↕`.

Testing

- `sha_cache.rs`: round-trip + cache-read tests for the new kind; `test_clear_all_covers_all_kinds` extended.
- `ref_snapshot.rs`: all-cold→unscoped-batch + cold→warm second-capture reads the persistent cache; partial-warm (few misses) omits the moved branch; many-misses uses the scoped batch; orphan normalizes to `(0,0)` in both the snapshot and the cache file; remote-tracking base resolves; unresolvable base degrades cleanly.
- `wt list --branches` populates `.git/wt/cache/ahead-behind/`; tampering an entry → the next run shows the tampered counts (proves the read path).
- `/review-codex`: second pass clean.
- `cargo run -- hook pre-merge --yes`: full suite + lints + doc-sync (3595 tests).

Out of scope (flagged, not done here)

`docs/dev/cache-staleness.md` still references the in-memory `RepoCache.ahead_behind` field and `batch_ahead_behind`, both removed when `RefSnapshot` landed. Pre-existing staleness; rewriting that doc is a separate cleanup. (The pre-skeleton `git log --no-walk` commit-details batch is also content-addressed by SHA and could in principle be cache-first, but it's already O(refs) and fast, so it is not worth a TODO.)

🤖 Generated with Claude Code
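The difference between per-write sweeping and bulk seeding noted under the cache-correctness details can be made concrete with a toy model. Everything here is a stand-in: `CacheDir` is a hypothetical in-memory substitute for the cache directory, `put_bulk` only imitates the role of `put_ahead_behind_bulk`, and the bodies of `write_with_lru` and `sweep_lru` are simplified guesses that ignore duplicate keys.

```rust
/// In-memory stand-in for an LRU-bounded cache directory.
struct CacheDir {
    /// (key, last-used tick) pairs.
    entries: Vec<(String, u64)>,
    max_entries: usize,
    /// How many full directory scans have been performed.
    scans: usize,
    tick: u64,
}

impl CacheDir {
    fn new(max_entries: usize) -> Self {
        Self { entries: Vec::new(), max_entries, scans: 0, tick: 0 }
    }

    /// Write one entry without enforcing the bound.
    fn write_raw(&mut self, key: &str) {
        self.tick += 1;
        self.entries.push((key.to_string(), self.tick));
    }

    /// Scan everything and keep only the most recently used entries.
    fn sweep_lru(&mut self) {
        self.scans += 1;
        self.entries.sort_by_key(|(_, used)| std::cmp::Reverse(*used));
        self.entries.truncate(self.max_entries);
    }

    /// Per-entry path: one scan per write.
    fn write_with_lru(&mut self, key: &str) {
        self.write_raw(key);
        self.sweep_lru();
    }

    /// Bulk path: write everything, then scan once.
    fn put_bulk(&mut self, keys: &[&str]) {
        for key in keys {
            self.write_raw(key);
        }
        self.sweep_lru();
    }
}
```

Seeding N entries costs N scans through `write_with_lru` but a single scan through `put_bulk`, and in this model both leave the same entries behind.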