feat: provenance + staleness detection + archive flow (#7 phases 1-3) by daveangulo · Pull Request #15 · daveangulo/twining-mcp

daveangulo · 2026-05-05T16:09:17Z

Summary

PR A of two for #7 (scope cleanup / GC). Server-only — no plugin bump. Phase 4 (branch-merge sweep) lands in PR B; LLM-judged semantic review gets its own follow-up issue.

Provenance stamping on every BlackboardEngine.post() and DecisionEngine.decide(): captures { recorded_at, branch?, commit_sha? } synchronously via git rev-parse. Detached HEAD and non-git directories are tolerated. The field is optional, so the change is backwards compatible.
Staleness detection (src/engine/staleness.ts): three deterministic signals (scope_path_missing, affected_files_missing, branch_gone), max() scoring, default threshold 0.95, configurable via housekeeping.staleness_threshold in config.yml. Branch-gone auto-neutralizes when branch enumeration fails so non-git projects don't false-flag.
Review flow: twining_housekeeping gains staleness_review: true (returns candidates only). New tool twining_archive_stale({ ids, reason? }) archives caller-confirmed IDs. Decisions move to a new archived status (excluded from assemble/why/verify); blackboard entries are dismissed. A finding summarizing what was archived is posted as the audit trail.
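
To make the provenance stamping concrete, here is a minimal sketch of how the capture might look. It is not the code from this PR: the `GitRunner` type, `tryGit`, and `captureProvenance` are hypothetical names, and the git call is injected as a function so the example stays self-contained (a real implementation would presumably wrap a synchronous child-process call).

```typescript
// Shape taken from the PR summary: { recorded_at, branch?, commit_sha? }.
interface Provenance {
  recorded_at: string;
  branch?: string;
  commit_sha?: string;
}

// Hypothetical injected runner: receives the arguments after "git",
// returns stdout, and throws when git fails (e.g. not a repository).
type GitRunner = (args: string[]) => string;

// Run one git command; any failure or empty output becomes undefined.
function tryGit(run: GitRunner, args: string[]): string | undefined {
  try {
    const out = run(args).trim();
    return out.length > 0 ? out : undefined;
  } catch {
    return undefined; // non-git directory, or git unavailable
  }
}

function captureProvenance(run: GitRunner, now: Date = new Date()): Provenance {
  const provenance: Provenance = { recorded_at: now.toISOString() };

  // On a detached HEAD, `git rev-parse --abbrev-ref HEAD` prints "HEAD",
  // so that value is treated as "no branch" and the field is omitted.
  const branch = tryGit(run, ["rev-parse", "--abbrev-ref", "HEAD"]);
  if (branch !== undefined && branch !== "HEAD") {
    provenance.branch = branch;
  }

  // Separate call for the commit SHA.
  const sha = tryGit(run, ["rev-parse", "HEAD"]);
  if (sha !== undefined) {
    provenance.commit_sha = sha;
  }
  return provenance;
}
```

In a non-git directory both lookups fail, so only `recorded_at` is set, which matches the "fields omitted" behavior described above.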

What's NOT in this PR

Phase 4 — branch-merge sweep (auto-archive entries from deleted branches). PR B.
LLM-judged content staleness ("HMS Lancaster / Wave 3" case from the issue thread). Filed as separate issue — needs model-client wiring.
Per-decision sunset overrides. Deferred until the global threshold proves too aggressive in practice.

Test plan

npm test — 887 passing, 19 new (provenance unit, staleness unit, housekeeping integration)
npm run build — clean
Manual: run twining_housekeeping({ staleness_review: true }) against a repo with a deleted feature branch and confirm originating decisions are flagged
Manual: run twining_archive_stale({ ids: [...] }) and confirm decision shows status: archived and is excluded from twining_why
Manual: confirm non-git directory doesn't false-flag (branch-gone neutralization)

Refs

Partially closes #7 (Scope cleanup - low priority); this PR and PR B together close it
Recorded design decisions in .twining/decisions/01KQWE98*.json

🤖 Generated with Claude Code

…ases 1-3)

PR A of two for #7. Server-only, no plugin bump.

## Phase 1: provenance stamping

Every BlackboardEngine.post() and DecisionEngine.decide() now captures { recorded_at, branch?, commit_sha? } via git rev-parse and stores it as the optional `provenance` field on the entry/decision. Detached HEAD and non-git directories are tolerated (fields omitted).

Backwards compatible: existing entries lack the field; readers must treat it as optional.

## Phase 2: staleness detection

New `src/engine/staleness.ts` scores items on three deterministic signals:

- scope_path_missing: scope looks like a path and no longer exists on disk
- affected_files_missing: proportion of affected_files removed (0..1)
- branch_gone: provenance.branch is no longer in `git for-each-ref refs/heads`

Final score is max() across signals; threshold default 0.95, configurable via `housekeeping.staleness_threshold` in config.yml. Branch-gone is auto-neutralized when branch enumeration fails (non-git projects) so the signal never false-flags.

LLM-judged "content references concepts that don't exist anymore" (LannyRipple's HMS Lancaster / Wave 3 case) is *out of scope* here; see follow-up issue.

## Phase 3: review flow

- twining_housekeeping gains `staleness_review: true` flag: returns the candidate list with provenance + score + reasons; does NOT act.
- New tool `twining_archive_stale({ ids, reason? })`: archives the caller-confirmed list. Decisions move to `archived` status (excluded from assemble/why/verify); blackboard entries are dismissed. Posts a finding summarizing what was archived for the audit trail.
- DecisionStatus gains `archived` alongside active/provisional/superseded/overridden.

887 tests pass (19 new). Phase 4 (branch-merge sweep) and the LLM-judged semantic-review path will be follow-up PR B and a new issue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
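
The max() scoring described in this commit can be sketched as follows. This is an illustration under assumptions, not the contents of `src/engine/staleness.ts`: the `StalenessSignals` shape, the helper names, the choice to score the two boolean signals as 0 or 1, and the `>=` comparison against the threshold are all hypothetical.

```typescript
// One numeric value per signal, each in the range 0..1.
interface StalenessSignals {
  scope_path_missing: number;     // assumed 1 when the scope path is gone, else 0
  affected_files_missing: number; // proportion of affected_files removed
  branch_gone: number;            // assumed 1 when the branch is gone, 0 otherwise or when neutralized
}

interface StalenessResult {
  score: number;
  reasons: string[];
  stale: boolean;
}

// Proportion of affected files that no longer exist; 0 when none are listed.
function missingProportion(total: number, missing: number): number {
  return total > 0 ? missing / total : 0;
}

// Final score is the maximum across signals, compared to the threshold
// (0.95 is the default named in the commit message).
function scoreStaleness(
  signals: StalenessSignals,
  threshold: number = 0.95
): StalenessResult {
  const entries: Array<[string, number]> = [
    ["scope_path_missing", signals.scope_path_missing],
    ["affected_files_missing", signals.affected_files_missing],
    ["branch_gone", signals.branch_gone],
  ];
  let score = 0;
  const reasons: string[] = [];
  for (const [name, value] of entries) {
    if (value > 0) reasons.push(name); // record every contributing signal
    if (value > score) score = value;  // keep the maximum
  }
  return { score, reasons, stale: score >= threshold };
}
```

With a max() combiner and a 0.95 threshold, a decision with half of its affected files removed scores 0.5 and is not flagged, while a single binary signal such as a deleted branch is enough on its own to cross the threshold.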

Three issues from the independent review pass on PR A:

1. staleness.ts: the knownBranchesEmpty heuristic intended to neutralize branch_gone when branch enumeration fails actually fired whenever decisions[] was empty (because decisions[0]?.provenance?.branch ?? "" resolves to "", which is never in the known-branch set). Result: blackboard-only audits in healthy git repos silently dropped the branch_gone signal. Replaced with an explicit null sentinel from listLocalBranches(), which distinguishes "enumeration failed" from "legitimately empty set". Regression test added covering the blackboard-only path.

2. analytics-engine.ts + ValueStats type: decision_lifecycle had no archived bucket, so archived decisions were silently dropped from the breakdown after the new status was added. Added the bucket.

3. provenance.ts: the proposed `git rev-parse --abbrev-ref HEAD HEAD` single-call optimization was wrong: --abbrev-ref is a global flag that applies to both refs, so the call returned the branch name twice. Reverted to two calls with a comment about deferring further optimization until profiling justifies it.

888 tests pass (1 new regression test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
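
The null-sentinel fix from item 1 can be illustrated with a small sketch. The function below is hypothetical (only `listLocalBranches()` and its null-on-failure contract come from the commit message), and a plain array stands in for whatever collection type the real code uses.

```typescript
// null  = branch enumeration failed (e.g. non-git directory)
// []    = enumeration succeeded and there are legitimately no branches
type KnownBranches = string[] | null;

// Returns 1 when the originating branch is known to be gone, else 0.
function branchGoneSignal(
  branch: string | undefined,
  known: KnownBranches
): number {
  // Enumeration failed: neutralize the signal rather than guess.
  if (known === null) return 0;
  // No provenance branch recorded: nothing to compare against.
  if (branch === undefined) return 0;
  // Enumeration succeeded: the branch is gone if it is not in the list.
  return known.indexOf(branch) === -1 ? 1 : 0;
}
```

Because the decision to neutralize depends only on the sentinel, it no longer matters whether the audit contains decisions, blackboard entries, or both.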

* feat: branch-merge sweep (#7 phase 4 - PR B)

PR B for #7. Server-only, no plugin bump.

twining_housekeeping gains a `merge_sweep: true` flag that snapshots the local branch set in .twining/.last-known-branches.json on first call and diffs against it on subsequent calls. Entries / decisions whose `provenance.branch` is in the "previously known, now gone" set are returned as candidates. Pass the IDs to twining_archive_stale to act; human-in-the-loop is preserved (no auto-archival).

Why detect by snapshot diff instead of `git branch --merged main`? Because once a branch is deleted you can't reconstruct whether it was merged or force-deleted, and the action ("archive entries from it") is the same in both cases. The snapshot diff catches both.

Edge cases covered:

- Initial run: records snapshot, returns initial_record=true, no candidates.
- Non-git directory: enumerated=false, no state file written, no flags.
- Corrupted state file: treated as a fresh initial run.
- Newly-added branches: not flagged; only deletions matter.

Includes a small refactor: listLocalBranches moved from staleness.ts to src/utils/git-branches.ts so both staleness and branch-watcher share the null-sentinel-on-failure semantics established in PR A's review fixes.

897 tests pass (9 new: branch-watcher unit + merge_sweep integration).

Together with PR #15, this closes the deterministic portion of #7. The LLM-judged semantic-content review (the "Wave 3" / "HMS Lancaster" case) is tracked separately in #16.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* review: dry-run snapshot guard + dedupe + preview test (#17)

Three issues from the independent review pass on PR B:

1. detectDeletedBranches() unconditionally wrote the snapshot file, even when housekeeping was running with execute=false. A preview pass silently consumed deletions, so any branch deleted between a preview and the eventual execute call would be missed. Fixed by threading a `commit` boolean through; the housekeeping engine passes through its own `execute` flag. Replaced the previous "first sweep records baseline" integration test with a regression test that runs two consecutive previews and asserts the deletion stays visible.

2. When both staleness_review and merge_sweep run in the same call, an entry from a recently-deleted branch was flagged twice: once by the branch_gone signal in staleness_review, once by merge_sweep. Same ID, different framing. Now dedupe staleness_review candidates whose IDs are already in merge_sweep candidates (merge_sweep wins; it's the more specific signal). Tool description updated.

3. Test gap: no coverage of the preview-doesn't-advance behavior. Added the missing test, plus a unit test in branch-watcher.test.ts covering the same property at the function level.

The reviewer also flagged a potential "archived items reappear in next sweep" failure mode. Investigated: blackboardStore.dismiss() is a hard delete (not soft), and decisions are filtered by status==='active' in merge_sweep. So the failure mode doesn't apply. Did not add the proposed test because it overreached: it required setting up state that fights with twining_housekeeping(execute=true)'s default auto-archive of non-decision blackboard entries. The unit-level guarantees plus the existing twining_archive_stale test cover the property.

899 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
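
A rough sketch of the snapshot diff with the `commit` guard, plus the dedupe step, might look like the following. Only `detectDeletedBranches`, the `commit` flag, and the `enumerated` / `initial_record` result fields are named in the commit messages; the `SnapshotStore` interface, the `deleted` field, and `dedupeStalenessCandidates` are hypothetical, and persistence is abstracted away so the example stays self-contained. Skipping the baseline write on a preview-only first run is also an assumption.

```typescript
// Hypothetical persistence layer standing in for the snapshot file.
// read() returns null when no snapshot exists or it cannot be parsed.
interface SnapshotStore {
  read(): string[] | null;
  write(branches: string[]): void;
}

interface SweepResult {
  enumerated: boolean;     // false when branch enumeration failed
  initial_record: boolean; // true when there was no usable prior snapshot
  deleted: string[];       // branches previously known and now gone
}

function detectDeletedBranches(
  current: string[] | null, // null = enumeration failed
  store: SnapshotStore,
  commit: boolean           // false for preview runs
): SweepResult {
  if (current === null) {
    // Non-git directory: report nothing and leave the state untouched.
    return { enumerated: false, initial_record: false, deleted: [] };
  }
  const previous = store.read();
  const deleted =
    previous === null
      ? []
      : previous.filter((b) => current.indexOf(b) === -1);
  // Only advance the snapshot when the caller is actually executing,
  // so a preview never consumes a deletion.
  if (commit) {
    store.write(current.slice());
  }
  return { enumerated: true, initial_record: previous === null, deleted };
}

// merge_sweep wins: drop staleness_review IDs already reported by it.
function dedupeStalenessCandidates(
  stalenessIds: string[],
  mergeSweepIds: string[]
): string[] {
  return stalenessIds.filter((id) => mergeSweepIds.indexOf(id) === -1);
}
```

Running the function twice with `commit` set to false against the same store reports the same deleted branch both times, which is the property the regression test described above asserts.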

Status posts and metric entries accumulated during the staleness-detection + branch-merge-sweep work session. No new decisions (those were committed inside #15 / #17).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

daveangulo and others added 2 commits May 5, 2026 09:07

daveangulo merged commit e3cd5f3 into main May 5, 2026
5 checks passed

daveangulo deleted the feature/staleness-detection-7a branch May 5, 2026 17:14

This was referenced May 5, 2026

Semantic-content staleness review (LLM-judged) — deferred from #7 #16

Open

feat: branch-merge sweep (#7 phase 4 — PR B) #17

Merged

daveangulo mentioned this pull request May 5, 2026

Scope cleanup - low priority #7

Closed
