feat: provenance + staleness detection + archive flow (#7 phases 1-3) #15

Merged
daveangulo merged 2 commits into main from feature/staleness-detection-7a on May 5, 2026

@daveangulo
Owner

Summary

PR A of two for #7 (scope cleanup / GC). Server-only — no plugin bump. Phase 4 (branch-merge sweep) lands in PR B; LLM-judged semantic review gets its own follow-up issue.

  • Provenance stamping on every BlackboardEngine.post() and DecisionEngine.decide() — captures { recorded_at, branch?, commit_sha? } synchronously via git rev-parse. Detached HEAD and non-git directories are tolerated. Optional field; backwards compatible.
  • Staleness detection (src/engine/staleness.ts) — three deterministic signals (scope_path_missing, affected_files_missing, branch_gone), max() scoring, threshold 0.95 configurable via housekeeping.staleness_threshold in config.yml. Branch-gone auto-neutralizes when branch enumeration fails so non-git projects don't false-flag.
  • Review flow: twining_housekeeping gains staleness_review: true (returns candidates only). New tool twining_archive_stale({ ids, reason? }) archives caller-confirmed IDs. Decisions move to a new archived status (excluded from assemble/why/verify); blackboard entries are dismissed. A finding summarizing what was archived is posted as the audit trail.

What's NOT in this PR

  • Phase 4 — branch-merge sweep (auto-archive entries from deleted branches). PR B.
  • LLM-judged content staleness ("HMS Lancaster / Wave 3" case from the issue thread). Filed as separate issue — needs model-client wiring.
  • Per-decision sunset overrides. Deferred until the global threshold proves too aggressive in practice.

Test plan

  • npm test — 887 passing, 19 new (provenance unit, staleness unit, housekeeping integration)
  • npm run build — clean
  • Manual: run twining_housekeeping({ staleness_review: true }) against a repo with a deleted feature branch and confirm originating decisions are flagged
  • Manual: run twining_archive_stale({ ids: [...] }) and confirm decision shows status: archived and is excluded from twining_why
  • Manual: confirm non-git directory doesn't false-flag (branch-gone neutralization)

Refs

🤖 Generated with Claude Code

daveangulo and others added 2 commits May 5, 2026 09:07
…ases 1-3)

PR A of two for #7. Server-only — no plugin bump.

## Phase 1: provenance stamping
Every BlackboardEngine.post() and DecisionEngine.decide() now captures
{ recorded_at, branch?, commit_sha? } via git rev-parse and stores it as
the optional `provenance` field on the entry/decision. Detached HEAD and
non-git directories are tolerated (fields omitted). Backwards compatible:
existing entries lack the field; readers must treat it as optional.
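
A minimal sketch of how such a stamping step could look, assuming Node's `child_process.execFileSync`. The `captureProvenance` and `git` helper names and the `Provenance` interface are illustrative, not the code in this PR; only the field names come from the description above.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical shape; field names follow the commit message.
interface Provenance {
  recorded_at: string;
  branch?: string;
  commit_sha?: string;
}

// Run one git command synchronously. Returns undefined on any failure
// (git missing, not a repository, missing directory) or on empty output.
function git(args: string[], cwd: string): string | undefined {
  try {
    const out = execFileSync("git", args, {
      cwd,
      encoding: "utf8",
      stdio: ["ignore", "pipe", "ignore"], // discard git's stderr
    });
    return out.trim() || undefined;
  } catch {
    return undefined;
  }
}

function captureProvenance(cwd: string): Provenance {
  const provenance: Provenance = { recorded_at: new Date().toISOString() };

  // On a detached HEAD, --abbrev-ref prints the literal "HEAD": omit the branch.
  const branch = git(["rev-parse", "--abbrev-ref", "HEAD"], cwd);
  if (branch !== undefined && branch !== "HEAD") provenance.branch = branch;

  const sha = git(["rev-parse", "HEAD"], cwd);
  if (sha !== undefined) provenance.commit_sha = sha;

  return provenance;
}
```

Because every failure collapses to `undefined`, a non-git directory simply yields an object with `recorded_at` only, which matches the "fields omitted" behavior described above.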

## Phase 2: staleness detection
New `src/engine/staleness.ts` scores items on three deterministic signals:

- scope_path_missing: scope looks like a path and no longer exists on disk
- affected_files_missing: proportion of affected_files removed (0..1)
- branch_gone: provenance.branch is no longer in `git for-each-ref refs/heads`

Final score is max() across signals; threshold default 0.95, configurable
via `housekeeping.staleness_threshold` in config.yml. Branch-gone is
auto-neutralized when branch enumeration fails (non-git projects) so the
signal never false-flags.
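
The scoring rule can be sketched as a pure function. This is an illustration under stated assumptions, not the contents of `src/engine/staleness.ts`: the `StalenessInput` shape, the `pathExists` callback, the "contains a slash" path heuristic, the binary 0/1 value for the two non-proportional signals, and the `>=` comparison against the threshold are all assumed.

```typescript
type Signal = "scope_path_missing" | "affected_files_missing" | "branch_gone";

// Hypothetical input shape, for illustration only.
interface StalenessInput {
  scope?: string;
  affected_files?: string[];
  branch?: string; // would come from provenance.branch
}

interface StalenessResult {
  score: number;
  reasons: Signal[];
  stale: boolean;
}

function scoreStaleness(
  item: StalenessInput,
  pathExists: (p: string) => boolean,
  knownBranches: ReadonlySet<string> | null, // null = enumeration failed
  threshold = 0.95,
): StalenessResult {
  const signals: Record<Signal, number> = {
    scope_path_missing: 0,
    affected_files_missing: 0,
    branch_gone: 0,
  };

  // Assumed heuristic: a scope "looks like a path" if it contains a slash.
  if (item.scope && item.scope.includes("/") && !pathExists(item.scope)) {
    signals.scope_path_missing = 1;
  }

  // Proportion of affected files that no longer exist (0..1).
  const files = item.affected_files ?? [];
  if (files.length > 0) {
    const missing = files.filter((f) => !pathExists(f)).length;
    signals.affected_files_missing = missing / files.length;
  }

  // Left at 0 (neutralized) when branches could not be enumerated.
  if (knownBranches !== null && item.branch && !knownBranches.has(item.branch)) {
    signals.branch_gone = 1;
  }

  const score = Math.max(...Object.values(signals));
  const reasons = (Object.keys(signals) as Signal[]).filter((k) => signals[k] > 0);
  return { score, reasons, stale: score >= threshold };
}
```

With max() scoring, a single decisive signal such as a deleted branch is enough to cross the threshold, while a half-deleted file list (0.5) is reported as a reason without flagging the item.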

LLM-judged "content references concepts that don't exist anymore"
(LannyRipple's HMS Lancaster / Wave 3 case) is *out of scope* here — see
follow-up issue.

## Phase 3: review flow
- twining_housekeeping gains `staleness_review: true` flag — returns the
  candidate list with provenance + score + reasons; does NOT act.
- New tool `twining_archive_stale({ ids, reason? })` — archives the
  caller-confirmed list. Decisions move to `archived` status (excluded
  from assemble/why/verify); blackboard entries are dismissed. Posts a
  finding summarizing what was archived for the audit trail.
- DecisionStatus gains `archived` alongside active/provisional/superseded/overridden.
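
A sketch of the archive step over in-memory stores, to make the two outcomes concrete. Everything except the `DecisionStatus` members is assumed for illustration: the `Stores` shape, the `archiveStale` and `visibleDecisions` helpers, and the wording of the finding. `visibleDecisions` is a simplification that filters only on `archived`.

```typescript
type DecisionStatus = "active" | "provisional" | "superseded" | "overridden" | "archived";

interface Decision { id: string; status: DecisionStatus; summary: string; }
interface Entry { id: string; summary: string; }

// Hypothetical in-memory stand-ins for the decision and blackboard stores.
interface Stores {
  decisions: Map<string, Decision>;
  blackboard: Map<string, Entry>;
}

interface ArchiveResult {
  archived: string[];  // decision IDs moved to "archived"
  dismissed: string[]; // blackboard entry IDs removed
  unknown: string[];   // IDs found in neither store
  finding: string;     // audit-trail summary
}

function archiveStale(stores: Stores, ids: string[], reason = "stale"): ArchiveResult {
  const archived: string[] = [];
  const dismissed: string[] = [];
  const unknown: string[] = [];

  for (const id of ids) {
    const decision = stores.decisions.get(id);
    if (decision) {
      decision.status = "archived";
      archived.push(id);
    } else if (stores.blackboard.delete(id)) {
      dismissed.push(id);
    } else {
      unknown.push(id);
    }
  }

  const finding =
    `Archived ${archived.length} decision(s) and dismissed ` +
    `${dismissed.length} blackboard entry(ies). Reason: ${reason}`;
  return { archived, dismissed, unknown, finding };
}

// Simplified view of what assemble/why/verify would see.
function visibleDecisions(stores: Stores): Decision[] {
  return Array.from(stores.decisions.values()).filter((d) => d.status !== "archived");
}
```

Keeping detection (`staleness_review`) and action (`twining_archive_stale`) as separate calls means nothing changes state until the caller passes an explicit ID list.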

887 tests pass (19 new). Phase 4 (branch-merge sweep) follows in PR B; the
LLM-judged semantic-review path gets a new issue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three issues from the independent review pass on PR A:

1. staleness.ts — the knownBranchesEmpty heuristic intended to neutralize
   branch_gone when branch enumeration fails actually fired whenever
   decisions[] was empty (because decisions[0]?.provenance?.branch ?? ""
   resolves to "", which is never in the known-branch set). Result:
   blackboard-only audits in healthy git repos silently dropped the
   branch_gone signal. Replaced with an explicit null sentinel from
   listLocalBranches() — distinguishes "enumeration failed" from
   "legitimately empty set". Regression test added covering the
   blackboard-only path.

2. analytics-engine.ts + ValueStats type — decision_lifecycle had no
   archived bucket, so archived decisions were silently dropped from the
   breakdown after the new status was added. Added the bucket.

3. provenance.ts — the proposed `git rev-parse --abbrev-ref HEAD HEAD`
   single-call optimization was wrong: --abbrev-ref is a global flag
   that applies to both refs, so the call returned the branch name
   twice. Reverted to two calls with a comment about deferring further
   optimization until profiling justifies it.
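
The null-sentinel fix in item 1 can be illustrated as follows. This is a sketch, not the PR's code: `parseBranches` and `branchGone` are hypothetical helpers, and the body of `listLocalBranches` is assumed. The only details taken from the text above are the function name, the `git for-each-ref refs/heads` enumeration, and the null-on-failure contract.

```typescript
import { execFileSync } from "node:child_process";

// Parse `git for-each-ref` output. Empty output yields an empty set, not null.
function parseBranches(stdout: string): Set<string> {
  return new Set(
    stdout
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0),
  );
}

// null means "enumeration failed"; an empty Set means "succeeded, no branches".
function listLocalBranches(cwd: string): Set<string> | null {
  try {
    const out = execFileSync(
      "git",
      ["for-each-ref", "--format=%(refname:short)", "refs/heads"],
      { cwd, encoding: "utf8", stdio: ["ignore", "pipe", "ignore"] },
    );
    return parseBranches(out);
  } catch {
    return null;
  }
}

// The signal depends only on the sentinel and the item's own branch,
// never on whether some other collection happens to be empty.
function branchGone(branch: string | undefined, known: ReadonlySet<string> | null): boolean {
  if (known === null) return false; // enumeration failed: neutralize
  if (!branch) return false;        // no provenance: nothing to check
  return !known.has(branch);
}
```

Because `branchGone` never inspects `decisions[]`, a blackboard-only audit in a healthy repository still evaluates the signal for each entry.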

888 tests pass (1 new regression test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@daveangulo daveangulo merged commit e3cd5f3 into main May 5, 2026
5 checks passed
@daveangulo daveangulo deleted the feature/staleness-detection-7a branch May 5, 2026 17:14
daveangulo added a commit that referenced this pull request May 5, 2026
* feat: branch-merge sweep (#7 phase 4 — PR B)

PR B for #7. Server-only — no plugin bump.

twining_housekeeping gains a `merge_sweep: true` flag that snapshots the
local branch set in .twining/.last-known-branches.json on first call and
diffs against it on subsequent calls. Entries / decisions whose
`provenance.branch` is in the "previously known, now gone" set are
returned as candidates. Pass the IDs to twining_archive_stale to act —
human-in-the-loop is preserved (no auto-archival).

Why detect by snapshot diff instead of `git branch --merged main`?
Because once a branch is deleted you can't reconstruct whether it was
merged or force-deleted, and the action ("archive entries from it") is
the same in both cases. The snapshot diff catches both.

Edge cases covered:
- Initial run: records snapshot, returns initial_record=true, no candidates.
- Non-git directory: enumerated=false, no state file written, no flags.
- Corrupted state file: treated as a fresh initial run.
- Newly-added branches: not flagged; only deletions matter.
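
The snapshot diff and the edge cases above can be sketched as a pure function over the previous file contents and the current branch set. The on-disk format of `.twining/.last-known-branches.json` is not shown in this PR, so a plain JSON array of branch names is assumed here; `sweep`, `parseSnapshot`, and the `snapshot` field are hypothetical names.

```typescript
interface SweepResult {
  enumerated: boolean;
  initial_record: boolean;
  deleted: string[];       // previously known, now gone
  snapshot: string | null; // JSON to persist, or null when nothing should be written
}

// Returns null for a missing or unreadable snapshot so the caller
// treats both cases as a fresh initial run.
function parseSnapshot(raw: string | null): Set<string> | null {
  if (raw === null) return null;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (Array.isArray(parsed) && parsed.every((b) => typeof b === "string")) {
      return new Set(parsed as string[]);
    }
  } catch {
    // Corrupted JSON: fall through to null.
  }
  return null;
}

function sweep(previousRaw: string | null, current: ReadonlySet<string> | null): SweepResult {
  // Non-git directory: no state written, nothing flagged.
  if (current === null) {
    return { enumerated: false, initial_record: false, deleted: [], snapshot: null };
  }

  const snapshot = JSON.stringify(Array.from(current).sort());
  const previous = parseSnapshot(previousRaw);

  // First run, or corrupted state file: record a baseline only.
  if (previous === null) {
    return { enumerated: true, initial_record: true, deleted: [], snapshot };
  }

  // Only deletions matter; branches new in `current` are ignored.
  const deleted = Array.from(previous).filter((b) => !current.has(b)).sort();
  return { enumerated: true, initial_record: false, deleted, snapshot };
}
```

Candidates would then be the entries and decisions whose `provenance.branch` appears in `deleted`.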

Includes a small refactor: listLocalBranches moved from staleness.ts to
src/utils/git-branches.ts so both staleness and branch-watcher share
the null-sentinel-on-failure semantics established in PR A's review fixes.

897 tests pass (9 new — branch-watcher unit + merge_sweep integration).

Together with PR #15, this closes the deterministic portion of #7.
The LLM-judged semantic-content review (the "Wave 3" / "HMS Lancaster"
case) is tracked separately in #16.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* review: dry-run snapshot guard + dedupe + preview test (#17)

Three issues from the independent review pass on PR B:

1. detectDeletedBranches() unconditionally wrote the snapshot file — even
   when housekeeping was running with execute=false. A preview pass
   silently consumed deletions, so any branch deleted between a preview
   and the eventual execute call would be missed. Fixed by threading a
   `commit` boolean through; the housekeeping engine passes through its
   own `execute` flag. Replaced the previous "first sweep records baseline"
   integration test with a regression test that runs two consecutive
   previews and asserts the deletion stays visible.

2. When both staleness_review and merge_sweep run in the same call, an
   entry from a recently-deleted branch was flagged twice — once by the
   branch_gone signal in staleness_review, once by merge_sweep. Same ID,
   different framing. Now dedupe staleness_review candidates whose IDs
   are already in merge_sweep candidates (merge_sweep wins; it's the
   more specific signal). Tool description updated.

3. Test gap: no coverage of the preview-doesn't-advance behavior.
   Added the missing test, plus a unit test in branch-watcher.test.ts
   covering the same property at the function level.
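
Items 1 and 2 can be sketched together. The `SnapshotStore` interface, the array-based signature of `detectDeletedBranches`, and the `Candidate` shape are assumptions for illustration; the source only states that a `commit` boolean is threaded through and that merge_sweep wins on duplicate IDs.

```typescript
// Hypothetical persistence abstraction over the snapshot file.
interface SnapshotStore {
  read(): string[] | null; // null when no snapshot exists yet
  write(branches: string[]): void;
}

function detectDeletedBranches(
  store: SnapshotStore,
  current: string[],
  commit: boolean, // mirrors housekeeping's `execute` flag
): string[] {
  const previous = store.read();
  const deleted = previous === null ? [] : previous.filter((b) => !current.includes(b));

  // A preview (commit=false) must not advance the baseline; otherwise
  // the deletion would be invisible to the later execute call.
  if (commit) store.write(current);

  return deleted;
}

interface Candidate {
  id: string;
  source: "staleness_review" | "merge_sweep";
}

// Drop staleness_review candidates already reported by merge_sweep.
function dedupeCandidates(staleness: Candidate[], mergeSweep: Candidate[]): Candidate[] {
  const swept = new Set(mergeSweep.map((c) => c.id));
  return [...mergeSweep, ...staleness.filter((c) => !swept.has(c.id))];
}
```

Two consecutive previews against the same store report the same deletions, which is the property the regression test in item 1 asserts.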

The reviewer also flagged a potential "archived items reappear in next
sweep" failure mode. Investigated: blackboardStore.dismiss() is a hard
delete (not soft), and decisions are filtered by status==='active' in
merge_sweep. So the failure mode doesn't apply. Did not add the
proposed test because it overreached — it required setting up state
that fights with twining_housekeeping(execute=true)'s default
auto-archive of non-decision blackboard entries. The unit-level
guarantees plus the existing twining_archive_stale test cover the
property.

899 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
daveangulo added a commit that referenced this pull request May 20, 2026
Status posts and metric entries accumulated during the staleness-detection
+ branch-merge-sweep work session. No new decisions (those were committed
inside #15 / #17).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Scope cleanup - low priority
