feat: capture context-intelligence design knowledge into the mode (eval-driven) #33
Open
colombod wants to merge 4 commits into
added 4 commits
June 7, 2026 09:28
Move the 6 defensive-navigation rules verbatim from session-navigator into context/navigation-budget-discipline.md (authoritative source). session-navigator now @mentions it (loading) and re-points its three in-document references; the always-on awareness file gets a single non-loading pointer row. No rule content changed; always-on behavior untouched (lean default preserved).
Enrich tool-design with R1 (module vs CLI by consumer, pointing to Standing Rule 3), R2 (narrow-domain specialization), R3 (progressive discovery → navigation discipline), plus an event-semantics guard. Add a new context-intelligence-evaluation-methodology skill (metric design, precursor metrics, A/B + statistical-N; points to eval-design and digital-twin-universe, never restating DTU-as-default or artifact-as-success). Add a thin context-intelligence-strategy.md pointer table (non-loading references; names the event-semantics principle once). Wire the strategy file via the mode's contributes.context and the eval skill via contributes.skills. Extend the eval-design catalog with structural scenarios 8-10 and behavioral Scenario C. Always-on behavior untouched.
6a: add PRE/POST-delegation constraints to the mode's file-not-found routing row
(no preamble before delegate(); relay the facilitator's Part-A question verbatim).
6b: add a Phase-0 RE-ANCHOR rule so off-script user replies are treated as signal
fragments and the opening question is re-asked, instead of breaking role.
6c: add a 'Pipeline ownership' standing rule to the facilitator countering the
hooks-skills-visibility leak of brainstorming/using-superpowers mandates — no
/brainstorm or /systems-design punt; the pipeline is self-contained from Phase 0.
No design-philosophy change; edge-case hardening only. Always-on behavior untouched.
7a: add a seeded-path routing row to the mode — when the activation message already
contains a clear goal and domain-concepts.md is absent, delegate with
seed_statement="<verbatim user goal>" (context_depth=none).
7b: add a facilitator 'Seeded entry' variant at the top of Phase 0 — treat the seed as
the pre-answered Part A, skip the opening question, run the Part-B probe, then open
with a data-grounded candidate framed on the seed.
Additive new path (does not change the interactive path). Always-on behavior untouched.
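The seeded-path row (7a) next to the unchanged interactive path can be read as a small routing decision. The sketch below is illustrative only: `Delegation`, `route_activation`, and the `has_clear_goal` flag are hypothetical names standing in for the mode's prompt-level judgment, and only `seed_statement`, `context_depth=none`, and `domain-concepts.md` come from the commit message.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class Delegation:
    """Arguments the mode might pass to delegate() (assumed shape)."""
    seed_statement: Optional[str] = None
    # Only the seeded path's value ("none") is stated in the commit message.
    context_depth: Optional[str] = None


def route_activation(message: str, workspace: Path, has_clear_goal: bool) -> Optional[Delegation]:
    """Choose a delegation for the file-not-found routing rows.

    has_clear_goal is a hypothetical stand-in for the mode's own judgment
    about whether the activation message already states a goal.
    Returns None when domain-concepts.md exists, because other rows apply.
    """
    if (workspace / "domain-concepts.md").exists():
        return None
    if has_clear_goal:
        # Seeded path (7a): relay the user's goal verbatim as the seed.
        return Delegation(seed_statement=message, context_depth="none")
    # Interactive path (unchanged): no seed, the facilitator asks Part A.
    return Delegation()
```

Keeping the seeded branch as a separate early return mirrors the "additive new path" claim: the interactive branch is reached exactly as before when no clear goal is present.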
What this adds
Captures the institutional design knowledge for context-intelligence tooling into the mode itself, so it produces deeper, more complete designs and drives its design pipeline more robustly.
Design depth
context/navigation-budget-discipline.md) and referenced by session-navigator via @mention: one source of truth for keeping disk navigation within a context budget (no duplication).
Interactive & autonomous driving
How the evaluation framework shaped this
The work was driven and validated by an outcome-eval harness with three scenarios (a pre-seeded design run, a multi-turn simulated user, and a one-shot). The evals did more than verify the result at the end: the baselines reshaped the scope by showing that the mode's design depth was the highest-leverage gap. Every change maps to a scenario that measures it, so we built what we set out to build and can show it.
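The "every change maps to a scenario" property can be checked mechanically. The following is a minimal sketch under stated assumptions: the scenario labels are shorthand for the three scenarios named above, while `unmeasured_changes`, the change names, and the example mapping are hypothetical and do not reflect the harness's actual data.

```python
from typing import Dict, List, Set

# Shorthand labels for the three scenarios named in the description.
SCENARIOS: Set[str] = {"pre-seeded", "multi-turn", "one-shot"}


def unmeasured_changes(changes: List[str], coverage: Dict[str, Set[str]]) -> List[str]:
    """Return the changes that no known scenario measures, in input order."""
    return [c for c in changes if not (coverage.get(c, set()) & SCENARIOS)]


# Hypothetical mapping for illustration only; the PR does not publish one.
example_coverage: Dict[str, Set[str]] = {
    "change-a": {"pre-seeded"},
    "change-b": {"multi-turn", "one-shot"},
}

# "change-c" has no scenario, so it is the only one reported.
print(unmeasured_changes(["change-a", "change-b", "change-c"], example_coverage))
```

An empty result means every listed change is covered by at least one of the three scenarios.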
Measured evidence
Notes
Markdown/YAML prompt-and-config only; validation is via the eval scenarios (prompt content is validated behaviorally, not by unit tests).