Skip to content

feat: marque MVP foundation — spec, codegen, engine pipeline#3

Merged
bashandbone merged 5 commits into
mainfrom
001-marque-mvp
Apr 11, 2026
Merged

feat: marque MVP foundation — spec, codegen, engine pipeline#3
bashandbone merged 5 commits into
mainfrom
001-marque-mvp

Conversation

@bashandbone
Copy link
Copy Markdown
Collaborator

@bashandbone bashandbone commented Apr 11, 2026

Summary

Lands the foundational work for the marque MVP (branch 001-marque-mvp): the ratified feature spec, Phase 1 workspace restructure (new marque-ism leaf crate + dev scaffolding), and Phase 2 (ODNI CVE XML→Rust codegen, typed IsmAttributes, FixProposal/AppliedFix type split, FR-016 deterministic fix ordering with overlap guard, layered config with hard-fail validators, and a Clock-injected engine). Unblocks Phase 3 (US1 rules + CLI) where real spans will be wired through the scanner.

Changes

Spec & contracts (51c4aa7 — 17 expert-panel findings)

  • Audit-record reconciled with AppliedFix (input, source enum, schema_version)

  • FR-005a NDJSON atomic-write contract with error-frame recovery

  • FR-016 total-order fix application: (span.end DESC, span.start DESC, rule_id ASC, replacement ASC)

  • US2/US3 acceptance scenarios rewritten with literal-byte fixtures

  • SC-001 scoped to native; SC-001b advisory WASM latency bar

  • FR-008 severity=off invariant: unrepresentable on emitted diagnostics

  • FR-010 (committed [user] refusal) and FR-011 (schema version match) validators

Phase 1 — workspace restructure (71e1f11)

  • New marque-ism leaf crate owning Span, IsmAttributes, TokenSet, CapcoTokenSet, build.rs codegen, and ODNI schemas (resolves the circular dependency between marque-core and marque-capco)

  • marque-rules depends directly on marque-ism

  • marque-core retains thin re-exports for back-compat

  • Test corpus scaffolding (CORPUS_CONTRACT.md, CORPUS_PROVENANCE.md)

  • Criterion benchmark skeletons; scripts/check.sh; .config/nextest.toml

Phase 1 review fixes (a46086b — critical/high/medium)

  • Guard Engine::fix against placeholder Span(0,0) (output-corruption prevention)

  • Filter remaining_diagnostics to exclude already-applied fixes

  • Span::as_str returns Result instead of panicking

  • CoreError::InvalidUtf8, proper CRLF handling, full CAB scan, dedup CVE_TOKENS

  • PartialOrd/Ord on RuleId for FR-016 sort

Phase 2 — foundational crates (dacea46)

  • Real ODNI CVE XML and XSD parsing → typed Rust enums (SciControl, DissemControl, SarIdentifier, DeclassExemption, ExemptFrom) + TRIGRAPHS + sorted/deduplicated ALL_CVE_TOKENS + validators + migrations, with T010 schema-version pinning

  • marque-rules: FixProposal (pure data) vs AppliedFix (engine-only promotion via __engine_promote); FixSource enum; Severity::Off as first variant; Diagnostic carries citation

  • marque-engine: Engine with Clock injection, FixMode::{Apply, DryRun}, config-driven severity overrides, FR-016 sort + overlap guard (reverse-walk window) + HashSet<(RuleId, Span)> for O(1) remaining_diagnostics filter

  • marque-config: layered load with FR-010/FR-011 hard-fail, confidence_threshold validation, MARQUE_CONFIDENCE_THRESHOLD env binding

  • marque-capco: E001 banner abbreviation, E002 missing/misordered USA trigraph, E004 separator count (stub), E005 declass in banner

  • marque-core/parser: uses generated enum parse() for SCI/dissem/SAR/declass; Classification remains hand-written (portion + banner forms)

  • New Clock trait + SystemClock/FixedClock for deterministic audit timestamps in tests

  • Engine::fix_with_threshold plumbs --confidence CLI flag and server FixRequest.confidence_threshold through with InvalidThreshold validation

  • Server now calls marque_config::load() (runs FR-011 check instead of Config::default() fallback)

  • Silent-fallback cluster fixed: MARQUE_CONFIDENCE_THRESHOLD parse failure → ConfigError::InvalidEnvVar; rule severity string typos → ConfigError::UnknownSeverity; current_dir() failure → exit 74

  • Safety hardening: Trigraph inner field private with try_new ASCII validator (infallible as_str via from_utf8_unchecked + SAFETY comment); RuleId/FixedClock fields private with new constructors; FixProposal::new confidence check is release-mode assert! (not debug_assert!); Span::new release-mode assert + try_as_slice checked variant

  • Ergonomics/perf: AppliedFix.classifier_id and .input are Arc<str> for O(1) clone; BatchEngine.lint_many/fix_many yield Result<_, BatchError> instead of aborting on JoinError; Severity has Display/as_str (server/CLI/wasm stop using format!(\"{:?}\")); canonicalize uses binary_search over sorted ALL_CVE_TOKENS

  • Build-script hardening: resolve_idents asserts generated enum variants are non-empty and unique; XML/XSD unescape failures now panic with file context instead of silently truncating

Files Changed

820 files changed, +7612 / -853.

The bulk of the file count is the ODNI ISM v2022-DEC schema bundle (CVE XML, XSD, Schematron, RNG) relocated into crates/marque-ism/schemas/ISM-v2022-DEC/ during Phase 1. The source/test churn is concentrated in crates/marque-{ism,rules,engine,capco,config,core,server,wasm}/src/ and marque/src/main.rs.

| Area | Status |

|---|---|

| specs/001-marque-mvp/ (spec, contracts, data model, tasks) | Added/Modified |

| crates/marque-ism/ (new crate + schemas + build.rs codegen) | Added |

| crates/marque-{rules,engine,capco,config,core}/src/ | Modified |

| crates/marque-{server,wasm}/src/ | Modified |

| marque/src/main.rs (CLI) | Modified |

| crates/marque-engine/src/clock.rs | Added |

| tests/corpus/ (contract + provenance + fixture scaffolding) | Added |

| scripts/check.sh, .config/nextest.toml, benches/ | Added |

Testing

  • Unit tests: 45 new tests across marque-config (8), marque-engine (7), marque-ism (17), marque-rules (10), plus 3 existing marque-core scanner tests — zero failures

  • Engine coverage: FR-016 reverse-order application, C-1 overlap guard, FixMode::DryRun, fix_with_threshold validation (NaN, out-of-range, boundaries), FixedClock determinism

  • Config coverage: set_confidence_threshold boundaries + NaN rejection, rule severity string validation (case-sensitivity, empty, typos), exit_code() matrix

  • Type invariants: Trigraph::try_new ASCII enforcement, Span::new panic on inverted bounds, Span::try_as_slice bounds safety, FixProposal::new release-mode confidence assert, generated CVE token sorted-and-unique invariant

  • Static checks: cargo clippy --workspace --all-targets -- -D warnings clean; cargo fmt --check clean

  • Build: cargo check --workspace and cargo build --workspace both succeed

Test plan

  • CI runs cargo check --workspace

  • CI runs cargo clippy --workspace --all-targets -- -D warnings

  • CI runs cargo test --workspace (or cargo nextest run --workspace)

  • CI runs cargo fmt --check

  • Manual: marque check <file> on a fixture with a banner marking exits 0 when clean

  • Manual: marque fix --dry-run <file> reports would-apply count without mutating source

  • Manual: MARQUE_CONFIDENCE_THRESHOLD=bananas marque check x exits 65 with clear error

  • Manual: .marque.toml with banner-abbreviation = "err" (typo) exits 65 with clear error

Related Issues

None linked.

Notes for Reviewers

  • PR size: This is a large PR because it bundles the ratified spec, Phase 1 workspace restructure, and Phase 2 foundational crates. The schema file count inflates the stat; the code review surface is the crates/marque-*/src/ and marque/src/main.rs diffs, plus build.rs.

  • Phase 2 review artifact: The full local review report (4 CRITICAL + 10 HIGH + 10 MEDIUM + 9 LOW) is at .claude/PRPs/reviews/phase-2-local-review.md in the working tree (not committed). All 14 prioritized findings are resolved in the Phase 2 commit — see the commit body for the full mapping.

  • Known Phase 3 TODOs left in code: RuleContext zone/position are hardcoded to Body with a TODO(phase-3) marker; rule fixes emit Span::new(0, 0) placeholders that are filtered by the engine's is_empty() guard until Phase 3 wires real per-token spans from the scanner. The C-1 overlap guard landed now specifically to protect Phase 3's real spans.

🤖 Generated with Claude Code

…──────────────────────�[0m

     �[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m   1�[0m �[38;5;238m│�[0m �[38;5;231mfeat: implement Phase 2 — CVE codegen, engine pipeline, validators�[0m
�[38;5;238m   2�[0m �[38;5;238m│�[0m
�[38;5;238m   3�[0m �[38;5;238m│�[0m �[38;5;231mPhase 2 (Foundational) lands the real ODNI CVE XML → Rust codegen,�[0m
�[38;5;238m   4�[0m �[38;5;238m│�[0m �[38;5;231mthe FixProposal/AppliedFix type split, FR-016 deterministic fix�[0m
�[38;5;238m   5�[0m �[38;5;238m│�[0m �[38;5;231mordering with overlap guard, layered config with hard-fail�[0m
�[38;5;238m   6�[0m �[38;5;238m│�[0m �[38;5;231mvalidators, and a Clock-injected engine for deterministic audit�[0m
�[38;5;238m   7�[0m �[38;5;238m│�[0m �[38;5;231mtests. Also addresses 14 review findings from the Phase 2�[0m
�[38;5;238m   8�[0m �[38;5;238m│�[0m �[38;5;231mlocal review (4 CRITICAL + 10 HIGH + 10 MEDIUM + 9 LOW).�[0m
�[38;5;238m   9�[0m �[38;5;238m│�[0m
�[38;5;238m  10�[0m �[38;5;238m│�[0m �[38;5;231mPhase 2 core work:�[0m
�[38;5;238m  11�[0m �[38;5;238m│�[0m �[38;5;231m- marque-ism/build.rs: real CVE XML and XSD parsing producing�[0m
�[38;5;238m  12�[0m �[38;5;238m│�[0m �[38;5;231m  typed enums (SciControl, DissemControl, SarIdentifier,�[0m
�[38;5;238m  13�[0m �[38;5;238m│�[0m �[38;5;231m  DeclassExemption, ExemptFrom) + TRIGRAPHS + ALL_CVE_TOKENS +�[0m
�[38;5;238m  14�[0m �[38;5;238m│�[0m �[38;5;231m  validators + migrations, with T010 schema-version pinning�[0m
�[38;5;238m  15�[0m �[38;5;238m│�[0m �[38;5;231m- marque-rules: FixProposal (pure, timestamp-free) vs AppliedFix�[0m
�[38;5;238m  16�[0m �[38;5;238m│�[0m �[38;5;231m  (engine-only promotion via __engine_promote), FixSource enum,�[0m
�[38;5;238m  17�[0m �[38;5;238m│�[0m �[38;5;231m  Severity::Off as first variant, Diagnostic carries citation�[0m
�[38;5;238m  18�[0m �[38;5;238m│�[0m �[38;5;231m- marque-engine: Engine with Clock injection, FixMode::{Apply,�[0m
�[38;5;238m  19�[0m �[38;5;238m│�[0m �[38;5;231m  DryRun}, Config-driven severity overrides, FR-016 sort�[0m
�[38;5;238m  20�[0m �[38;5;238m│�[0m �[38;5;231m- marque-config: layered load with FR-010 (committed [user]�[0m
�[38;5;238m  21�[0m �[38;5;238m│�[0m �[38;5;231m  refusal), FR-011 (schema version match), confidence_threshold�[0m
�[38;5;238m  22�[0m �[38;5;238m│�[0m �[38;5;231m  validation, MARQUE_CONFIDENCE_THRESHOLD env binding�[0m
�[38;5;238m  23�[0m �[38;5;238m│�[0m �[38;5;231m- marque-capco: E001 banner abbreviation, E002 missing/misordered�[0m
�[38;5;238m  24�[0m �[38;5;238m│�[0m �[38;5;231m  USA trigraph, E004 separator count (stub), E005 declass in banner�[0m
�[38;5;238m  25�[0m �[38;5;238m│�[0m �[38;5;231m- marque-core/parser: uses generated enum parse() methods for�[0m
�[38;5;238m  26�[0m �[38;5;238m│�[0m �[38;5;231m  SCI/dissem/SAR/declass; Classification remains hand-written�[0m
�[38;5;238m  27�[0m �[38;5;238m│�[0m �[38;5;231m- Clock trait + SystemClock/FixedClock for deterministic audit�[0m
�[38;5;238m  28�[0m �[38;5;238m│�[0m �[38;5;231m  timestamps in tests�[0m
�[38;5;238m  29�[0m �[38;5;238m│�[0m
�[38;5;238m  30�[0m �[38;5;238m│�[0m �[38;5;231mCritical review findings:�[0m
�[38;5;238m  31�[0m �[38;5;238m│�[0m �[38;5;231m- C-1 Overlap guard in Engine::fix_inner drops overlapping fixes�[0m
�[38;5;238m  32�[0m �[38;5;238m│�[0m �[38;5;231m  after FR-016 sort; dropped fixes surface in remaining_diagnostics�[0m
�[38;5;238m  33�[0m �[38;5;238m│�[0m �[38;5;231m  so they cannot be silently lost�[0m
�[38;5;238m  34�[0m �[38;5;238m│�[0m �[38;5;231m- C-2 MARQUE_CONFIDENCE_THRESHOLD parse failure returns�[0m
�[38;5;238m  35�[0m �[38;5;238m│�[0m �[38;5;231m  ConfigError::InvalidEnvVar instead of silently using default�[0m
�[38;5;238m  36�[0m �[38;5;238m│�[0m �[38;5;231m- C-3 Dropped latent dual-Classification ambiguity: build.rs no�[0m
�[38;5;238m  37�[0m �[38;5;238m│�[0m �[38;5;231m  longer emits a generated Classification enum (canonical lives�[0m
�[38;5;238m  38�[0m �[38;5;238m│�[0m �[38;5;231m  in attrs.rs); validator uses literal matches!�[0m
�[38;5;238m  39�[0m �[38;5;238m│�[0m �[38;5;231m- C-4 to_rust_ident resolves idents through resolve_idents which�[0m
�[38;5;238m  40�[0m �[38;5;238m│�[0m �[38;5;231m  asserts non-empty and unique variants at build time�[0m
�[38;5;238m  41�[0m �[38;5;238m│�[0m
�[38;5;238m  42�[0m �[38;5;238m│�[0m �[38;5;231mHigh review findings:�[0m
�[38;5;238m  43�[0m �[38;5;238m│�[0m �[38;5;231m- H-1 Server calls marque_config::load() and runs FR-011 check�[0m
�[38;5;238m  44�[0m �[38;5;238m│�[0m �[38;5;231m  instead of Config::default() fallback�[0m
�[38;5;238m  45�[0m �[38;5;238m│�[0m �[38;5;231m- H-2/H-3 Engine::fix_with_threshold plumbs --confidence CLI flag�[0m
�[38;5;238m  46�[0m �[38;5;238m│�[0m �[38;5;231m  and FixRequest.confidence_threshold; InvalidThreshold error type�[0m
�[38;5;238m  47�[0m �[38;5;238m│�[0m �[38;5;231m- H-4 CLI current_dir() failure → exit 74 (EX_IOERR); load�[0m
�[38;5;238m  48�[0m �[38;5;238m│�[0m �[38;5;231m  failures exit with ConfigError::exit_code instead of silent�[0m
�[38;5;238m  49�[0m �[38;5;238m│�[0m �[38;5;231m  Config::default fallback�[0m
�[38;5;238m  50�[0m �[38;5;238m│�[0m �[38;5;231m- H-5 Trigraph inner field private; try_new enforces ASCII�[0m
�[38;5;238m  51�[0m �[38;5;238m│�[0m �[38;5;231m  uppercase; as_str is infallible via from_utf8_unchecked with�[0m
�[38;5;238m  52�[0m �[38;5;238m│�[0m �[38;5;231m  a SAFETY comment documenting the invariant�[0m
�[38;5;238m  53�[0m �[38;5;238m│�[0m �[38;5;231m- H-6 marque-config validates rule severity strings at load time�[0m
�[38;5;238m  54�[0m �[38;5;238m│�[0m �[38;5;231m  via Severity::parse_config; ConfigError::UnknownSeverity on typo�[0m
�[38;5;238m  55�[0m �[38;5;238m│�[0m �[38;5;231m- H-7 applied_fixes is HashSet<(RuleId, Span)> for O(1) filter�[0m
�[38;5;238m  56�[0m �[38;5;238m│�[0m �[38;5;231m- H-8 ALL_CVE_TOKENS collected via BTreeSet (sorted, deduplicated)�[0m
�[38;5;238m  57�[0m �[38;5;238m│�[0m �[38;5;231m  so canonicalize uses binary_search; NOFORN/ORCON/PROPIN/IMCON�[0m
�[38;5;238m  58�[0m �[38;5;238m│�[0m �[38;5;231m  duplicates eliminated�[0m
�[38;5;238m  59�[0m �[38;5;238m│�[0m �[38;5;231m- H-9 parse_classification doc explains it is the single�[0m
�[38;5;238m  60�[0m �[38;5;238m│�[0m �[38;5;231m  deliberately hand-coded path (both portion and banner forms)�[0m
�[38;5;238m  61�[0m �[38;5;238m│�[0m �[38;5;231m- H-10 45 unit tests backfill core-crate coverage�[0m
�[38;5;238m  62�[0m �[38;5;238m│�[0m
�[38;5;238m  63�[0m �[38;5;238m│�[0m �[38;5;231mMedium/Low:�[0m
�[38;5;238m  64�[0m �[38;5;238m│�[0m �[38;5;231m- FixProposal::new confidence is assert! (release-mode), not�[0m
�[38;5;238m  65�[0m �[38;5;238m│�[0m �[38;5;231m  debug_assert — NaN and INFINITY are correctness bugs in release�[0m
�[38;5;238m  66�[0m �[38;5;238m│�[0m �[38;5;231m- Span::new upgraded to assert!; Span::try_as_slice checked variant�[0m
�[38;5;238m  67�[0m �[38;5;238m│�[0m �[38;5;231m- RuleContext zone/position carry TODO(phase-3) marker�[0m
�[38;5;238m  68�[0m �[38;5;238m│�[0m �[38;5;231m- AppliedFix classifier_id and input are Arc<str> for O(1) clone�[0m
�[38;5;238m  69�[0m �[38;5;238m│�[0m �[38;5;231m- BatchEngine lint_many/fix_many yield Result<_, BatchError>�[0m
�[38;5;238m  70�[0m �[38;5;238m│�[0m �[38;5;231m  instead of aborting the process on JoinError�[0m
�[38;5;238m  71�[0m �[38;5;238m│�[0m �[38;5;231m- Severity has as_str + Display; CLI/server/wasm use it instead�[0m
�[38;5;238m  72�[0m �[38;5;238m│�[0m �[38;5;231m  of format!("{:?}")�[0m
�[38;5;238m  73�[0m �[38;5;238m│�[0m �[38;5;231m- RuleId inner field private with new/as_str accessors�[0m
�[38;5;238m  74�[0m �[38;5;238m│�[0m �[38;5;231m- FixedClock tuple field private with new constructor�[0m
�[38;5;238m  75�[0m �[38;5;238m│�[0m �[38;5;231m- LintResult::fix_count requires d.fix.is_some() AND Severity::Fix�[0m
�[38;5;238m  76�[0m �[38;5;238m│�[0m �[38;5;231m- Eliminated double source.to_vec() allocation in FixMode::DryRun�[0m
�[38;5;238m  77�[0m �[38;5;238m│�[0m �[38;5;231m- make_fix_diagnostic takes FixDiagnosticParams struct�[0m
�[38;5;238m  78�[0m �[38;5;238m│�[0m �[38;5;231m- Removed unused thiserror/phf/anyhow runtime deps from marque-ism�[0m
�[38;5;238m  79�[0m �[38;5;238m│�[0m �[38;5;231m- marque_ism::SCHEMA_VERSION re-exported at crate root�[0m
�[38;5;238m  80�[0m �[38;5;238m│�[0m �[38;5;231m- run_metadata exits 69 (EX_UNAVAILABLE) instead of 0�[0m
�[38;5;238m  81�[0m �[38;5;238m│�[0m
�[38;5;238m  82�[0m �[38;5;238m│�[0m �[38;5;231mTests: 45 unit + 1 doctest, zero failures. Clippy clean across�[0m
�[38;5;238m  83�[0m �[38;5;238m│�[0m �[38;5;231m--all-targets. rustfmt clean.�[0m
�[38;5;238m  84�[0m �[38;5;238m│�[0m
�[38;5;238m  85�[0m �[38;5;238m│�[0m �[38;5;231mCo-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
Copilot AI review requested due to automatic review settings April 11, 2026 02:26
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Foundational MVP work for “marque” across spec/contracts, schema/codegen, core parsing types, config loading/validation, and engine fix application (ordering/overlap guards, clock injection), plus CLI/server/WASM plumbing updates.

Changes:

  • Introduces marque-ism with generated ODNI CVE-derived enums/tables and token utilities; updates core scanner/parser to use typed attributes and renamed candidate types.
  • Refactors rules/engine contracts (FixProposal vs AppliedFix, deterministic fix ordering, overlap guard, injected clock) and propagates API changes through CLI/server/WASM.
  • Implements layered config loading with hard-fail validators (schema pinning, severity/threshold validation) and improved error handling/exit codes.

Reviewed changes

Copilot reviewed 24 out of 25 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
marque/src/main.rs CLI config loading hard-fail + fix_with_threshold wiring + JSON severity output updates
crates/marque-wasm/src/lib.rs WASM wrapper updates for new engine fix API + stable severity/message serialization
crates/marque-server/src/main.rs Server now loads layered config + request-level confidence override support
crates/marque-rules/src/lib.rs New contracts: RuleId constructor, Severity::Off, FixProposal/AppliedFix, Diagnostic::new
crates/marque-ism/src/token_set.rs Switches token data to build.rs-generated ALL_CVE_TOKENS/TRIGRAPHS + binary_search canonicalize
crates/marque-ism/src/span.rs Stronger span invariants + try_as_slice + candidate rename to MarkingCandidate
crates/marque-ism/src/lib.rs Re-exports generated CVE enums + schema version constant + updated span exports
crates/marque-ism/src/attrs.rs Typed CVE-derived attribute fields + hardened Trigraph construction/infallible as_str
crates/marque-ism/Cargo.toml Dependency updates for codegen/build hardening (toml, memchr in build)
crates/marque-ism/build.rs Real ODNI CVE/XSD parsing → generated Rust enums/tables + schema pinning assertion
crates/marque-extract/src/lib.rs Re-export ordering cleanup
crates/marque-engine/src/pipeline.rs Updates pipeline candidate type to MarkingCandidate
crates/marque-engine/src/output.rs Fix result now returns AppliedFix; fix_count requires fix.is_some()
crates/marque-engine/src/lib.rs Exposes clock + FixMode/InvalidThreshold + adds default_ruleset() helper
crates/marque-engine/src/engine.rs Adds FixMode, injected clock, fix ordering/overlap guard, threshold override API
crates/marque-engine/src/clock.rs New Clock abstraction with SystemClock and FixedClock
crates/marque-engine/src/batch.rs Batch APIs now return per-document Result via BatchError, and use new fix API
crates/marque-core/src/span.rs Re-export updates for renamed candidate type
crates/marque-core/src/scanner.rs Scanner now returns Vec<MarkingCandidate>
crates/marque-core/src/parser.rs Parser now emits typed CVE enums + updated CAB parsing + trigraph validation via try_new
crates/marque-core/src/attrs.rs Updates re-exports to match new marque-ism::attrs surface
crates/marque-config/src/lib.rs Layered config loader with FR-010/FR-011 hard-fails + severity/env/threshold validation
crates/marque-config/Cargo.toml Adds deps on marque-ism and marque-rules for schema/severity validation
crates/marque-capco/src/rules.rs Updates rules to new FixProposal/Diagnostic::new APIs + rule id changes
Cargo.lock Locks dependency graph changes from added/removed deps

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread marque/src/main.rs
Comment on lines 217 to 221
"diagnostics": result.diagnostics.iter().map(|d| serde_json::json!({
"rule": d.rule.to_string(),
"severity": format!("{:?}", d.severity),
"severity": d.severity.to_string(),
"message": d.message,
"start": d.span.start,
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In print_json_diagnostics, d.message is a Box<str> (and is behind &Diagnostic from .iter()), so "message": d.message attempts to move out of a shared reference and will not compile. Use a borrowed form like d.message.as_ref() (or d.message.to_string()), and make the same adjustment anywhere else this file formats/prints diag.message or fix.replacement from a borrowed diagnostic/fix.

Copilot uses AI. Check for mistakes.
let mut applied_rules: Vec<RuleId> = Vec::new();
let mut fixes: Vec<_> = lint
.diagnostics
.iter()
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Engine::fix_inner currently auto-applies any FixProposal whose confidence meets the threshold, regardless of the diagnostic's (possibly overridden) Severity. This means a rule configured as warn/error can still have its fix applied during fix, which contradicts the Severity::Fix contract and also disagrees with LintResult::fix_count() (which counts only Severity::Fix + fix.is_some()). Filter fixes by d.severity == Severity::Fix (after overrides) before collecting proposals.

Suggested change
.iter()
.iter()
.filter(|d| d.severity == Severity::Fix)

Copilot uses AI. Check for mistakes.
Comment thread marque/src/main.rs
Comment on lines +154 to +160
let result = match engine.fix_with_threshold(&source, mode, confidence) {
Ok(r) => r,
Err(e) => {
eprintln!("error: {e}");
return 65; // EX_DATAERR per contracts/cli.md
}
};
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CLI currently doesn't populate AppliedFix.input for audit records, even though AppliedFix has an input field intended to be set at the boundary (e.g., file path or "-" for stdin). After obtaining the FixResult, set result.applied[*].input to path.display().to_string() (and consider also wiring stdin as "-") so downstream audit NDJSON can attribute fixes to a specific input.

Suggested change
let result = match engine.fix_with_threshold(&source, mode, confidence) {
Ok(r) => r,
Err(e) => {
eprintln!("error: {e}");
return 65; // EX_DATAERR per contracts/cli.md
}
};
let mut result = match engine.fix_with_threshold(&source, mode, confidence) {
Ok(r) => r,
Err(e) => {
eprintln!("error: {e}");
return 65; // EX_DATAERR per contracts/cli.md
}
};
let input = path.display().to_string();
for applied_fix in &mut result.applied {
applied_fix.input = input.clone();
}

Copilot uses AI. Check for mistakes.
Comment on lines 226 to 230
if ctx.marking_type != MarkingType::Banner {
return vec![];
}
if attrs.declassify_on.is_none() {
return vec![];
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeclassifyInBannerRule checks attrs.declassify_on for banner candidates, but the parser only populates declassify_on/declass_exemption when parsing CAB lines (see marque-core/src/parser.rs, parse_cab). As a result, this rule will never fire for actual banner markings. Either parse declass tokens into IsmAttributes during banner/portion parsing, or change the rule to look at the field(s) that are actually populated for banner candidates.

Copilot uses AI. Check for mistakes.
Comment on lines 190 to 194
/// Load and merge configuration from standard locations.
///
/// Search order (first found wins for each layer):
/// 1. `.marque.toml` in current directory or any parent
/// 1. `.marque.toml` in current directory (committed, project policy)
/// 2. `.marque.local.toml` alongside the project config (user-specific, gitignored)
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marque_config::load is documented (and the spec contract in specs/001-marque-mvp/contracts/cli.md) as discovering .marque.toml by walking upward from the current working directory, but this implementation only checks project_root/.marque.toml and never searches parent directories. Either implement the upward walk (stopping at git root/filesystem root per contract) or update the contract/docs so CLI/server behavior is consistent.

Copilot uses AI. Check for mistakes.
Comment on lines 263 to +267
fn merge_user_into(config: &mut Config, file: ConfigFile) {
config.user.classifier_id = file.user.classifier_id.or(config.user.classifier_id.take());
config.user.classification_authority =
file.user.classification_authority.or(config.user.classification_authority.take());
config.user.default_reason = file.user.default_reason.or(config.user.default_reason.take());
config.user.derived_from_default =
file.user.derived_from_default.or(config.user.derived_from_default.take());
if let Some(user) = file.user {
if user.classifier_id.is_some() {
config.user.classifier_id = user.classifier_id;
}
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .marque.local.toml layer is parsed into ConfigFile, but merge_user_into only applies the [user] section and ignores rules, corrections, capco.version, and confidence_threshold from the local file. This conflicts with the stated precedence chain (.marque.local.toml overriding .marque.toml). Either merge the non-user fields for the local layer as well, or reject them with a clear error to avoid silent no-ops.

Copilot uses AI. Check for mistakes.
// We emit the enum anyway for type-system completeness.
let sar_entries = parse_cve_xml(&cve_dir.join("CVEnumISMSAR.xml"));
if sar_entries.is_empty() {
// Emit a minimal enum with a placeholder variant so it's usable
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment says emit_empty_enum produces a "placeholder variant", but emit_empty_enum actually emits an empty enum (pub enum SarIdentifier {}), which is uninhabited and has no variants. Either update the comment to reflect the uninhabited type, or emit a real placeholder/unknown variant if you need the enum to be constructible in downstream code.

Suggested change
// Emit a minimal enum with a placeholder variant so it's usable
// Emit an empty (uninhabited) enum for type-system completeness.

Copilot uses AI. Check for mistakes.
Comment thread marque/src/main.rs
Comment on lines +82 to +86
// FR-011 (schema version mismatch), and threshold/severity validation.
// These are intentional safety gates — do not silently fall back to
// `Config::default()` on error.
let config = match marque_config::load(&cwd) {
Ok(c) => c,
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marque_config::load is called with cwd here, so config resolution is limited to the current directory. The CLI contract (specs/001-marque-mvp/contracts/cli.md "Configuration discovery") describes walking upward to find .marque.toml and then looking for .marque.local.toml alongside it. Consider moving that discovery logic into marque_config (or doing it here) so the CLI matches the contract.

Copilot uses AI. Check for mistakes.
@bashandbone
Copy link
Copy Markdown
Collaborator Author

@copilot apply changes based on the comments in this thread

Let me give some guidance on how this should work functionally. The CAB declassify on date is essentially the furthest date in the future from the observed dates (if any), or typically, 25 years from the date of the material, whichever is longest (some things have shorter default declassifications, like fleeting/tactical information -- typically 10 years -- and others, like HUMINT can default to 50). If there are banner markings with declassification information, then we need a way of identifying that max for a given page or document.

Similarly, a banner marking can be thought of as a composite of the portion markings on a given page, or where 'page' doesn't apply, the materials. The banner is the most restrictive sum of its portion marking parts. So architecturally, it's important that we maintain some understanding of at least the maximums observed for a given 'page'

An example to illustrate --

(S//NF) An example paragraph.

(DEU C//REL TO USA, DEU, NATO) Another example paragraph.

The banner here would be SECRET//FGI DEU//NOFORN

If the architecture does not currently allow this kind of resolution, then now is a good time to identify it and brainstorm solutions.

bashandbone

This comment was marked as outdated.

Copy link
Copy Markdown
Collaborator Author

@bashandbone bashandbone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second-pass review — would approve (self-PR, GH blocks explicit approval)

Previous comment on this PR was corrupted by an ANSI-injecting cat alias; this is the intended clean version.

Validation: clippy clean, 45 unit + 1 doctest pass, fmt clean, mergeable. No blocking issues.

Full report: .claude/PRPs/reviews/pr-3-review.md (working tree, not committed).

Focus-area results (all clean)

  • Overlap guard correctness — walked adjacent, contained, identical, and overlapping span cases under the FR-016 reverse-end sort. fix.span.end <= previous.span.start handles all four correctly.
  • Trigraph::as_str from_utf8_unchecked soundness — no construction path outside try_new (ASCII validator) or the USA const.
  • Severity::Off gate — short-circuits before any diagnostic is produced.
  • Send/Sync — all bounds correctly declared; BatchEngine Arc<Engine> is sound.
  • -0.0 / subnormal handling in fix_with_threshold — correct.

Findings: 0 CRITICAL · 3 HIGH · 5 MEDIUM · 4 LOW

HIGH (not blocking, worth a follow-up)

  • H-1 BatchError does not distinguish panic from cancellationBatchError::TaskFailed(JoinError) wraps JoinError whole, but JoinError::is_panic() vs is_cancelled() matter for CI pipelines that need to alert on bugs vs quietly retry on cancellation. Recommend forwarding is_panic()/is_cancelled() on BatchError. (batch.rs:48–77)
  • H-2 AppliedFix::__engine_promote is pub #[doc(hidden)] — the doc explicitly admits "enforced by convention and code review, not by the type system." Any downstream crate can construct an AppliedFix bypassing the engine's confidence gate, FR-016 sort, and C-1 overlap guard, injecting arbitrary audit records. For a security tool whose audit log is the compliance output, this is a meaningful integrity gap. Full fix requires crate-boundary restructuring; recommend documenting as a known risk in CLAUDE.md under architectural invariants. (rules/lib.rs:246–261)
  • H-3 fix_with_threshold has no test for INFINITY / NEG_INFINITY rejection — the validation correctly rejects both, but the test suite only covers NaN, -0.1, 1.1. Two-line test addition. (engine.rs:146–149)

MEDIUM

  • M-1 resolve_idents has no non-empty assertion on required CVE files — an empty CVEnumISMDissem.xml would silently produce zero dissem diagnostics. (build.rs:192–209)
  • M-2 Duplicate-ident collision panic names only one of the two colliding values. (build.rs:204)
  • M-3 TRIGRAPHS slice is unsorted so is_trigraph still uses linear scan; ALL_CVE_TOKENS went through a BTreeSet for binary_search. Symmetrical cleanup. (build.rs + token_set.rs:44)
  • M-4 Threshold filter path has no test at the boundary (0.94 excluded, 0.95 included).
  • M-5 Zero-length-span filter path (which masks Phase 2 placeholder spans) has no dedicated test.

LOW

  • L-1 BatchOptions.max_concurrent_docs drives both ConcurrencyController and buffer_unordered; doc comment only mentions one.
  • L-2 merge_user_into does not treat empty-string classifier_id as absent.
  • L-3 Severity doc describes a future .max() merge design that diverges from the current last-write-wins merge_project_into.
  • L-4 Closely related to M-4: all test proposals use 1.0 confidence, so the Config-supplied threshold path is never exercised.

Recommendation

Merge as-is. Follow-up items for the Phase 3 opening commit or a small cleanup pass:

  • H-1 (add is_panic()/is_cancelled() forwarding on BatchError)
  • H-3 (two-line test for INFINITY rejection)
  • M-1, M-4, M-5 (defensive test coverage)

H-2 is a CLAUDE.md documentation update.

…-level aggregation

- Fix parse_marking_string to detect DeclassExemption codes (25X1, 50X1-HUM...)
  and declassification dates (YYYYMMDD, YYYY) in banner/portion blocks so
  DeclassifyInBannerRule (E005) can actually fire for real banner markings.
  Add 9 targeted parser unit tests.

- Add PageContext type (marque-ism) for page-level portion accumulation:
  expected_classification (max), expected_sci/sar/dissem_controls (union),
  expected_rel_to (intersection with NOFORN superseding), expected_declassify_on
  (max date), expected_declass_exemption (last-seen placeholder for Phase 3).
  6 unit tests.

- Update RuleContext (marque-rules) to carry Option<Arc<PageContext>> so
  banner/CAB rules can compare against the expected composite in Phase 3.
  Removes Copy/PartialEq/Eq derives (non-breaking: rules receive &RuleContext).

- Update Engine::lint to build PageContext in document order and pass it
  through RuleContext for banner/CAB candidates. Arc is built lazily and
  cached between consecutive non-portion candidates (invalidated on each
  new portion) to avoid O(candidates * context_size) cloning.

Agent-Logs-Url: https://github.com/marquetools/marque/sessions/c74ae0cc-c005-46b0-8501-646eeac340b6

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 11, 2026

@copilot apply changes based on the comments in this thread

Let me give some guidance on how this should work functional...

Addressed in b8789c8. Here's what was changed and how the architecture now supports the functional requirements you described:

E005 fixparse_marking_string in marque-core/src/parser.rs now recognizes declass exemption codes (25X1, 50X1-HUM, etc.) and date strings (YYYYMMDD / YYYY) in banner and portion blocks, populating attrs.declass_exemption / attrs.declassify_on so DeclassifyInBannerRule (E005) actually fires for real banner markings. 9 unit tests added.

PageContext architecture — Added marque_ism::PageContext (crates/marque-ism/src/page_context.rs) for page-level accumulation with the following aggregation rules matching CAPCO requirements:

  • expected_classification()max() (most restrictive wins)
  • expected_sci/sar/dissem_controls() → union (any control on any portion applies to the page)
  • expected_rel_to() → intersection — with NOFORN superseding REL TO entirely (your example: (S//NF) + (DEU C//REL TO USA, DEU, NATO) → banner REL TO is empty, NOFORN applies)
  • expected_declassify_on() → max date (furthest out, YYYYMMDD lexicographic ordering)
  • expected_declass_exemption() → last-seen placeholder; duration-aware ordering is a Phase 3 TODO

RuleContext now carries Option<Arc<PageContext>> (populated for banner/CAB candidates). The engine builds the context in document order, with lazy Arc caching invalidated on each new portion, so the snapshot a banner rule sees reflects all preceding portions. Phase 3 banner-vs-portions validation rules (e.g., detecting that the observed banner is less restrictive than the expected composite) can now consume this directly from ctx.page_context.

Copy link
Copy Markdown
Collaborator Author

@bashandbone bashandbone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review on b8789c8 — E005 parser fix + PageContext architecture

Pulled, built, tested locally against the new commit.

Validation: clippy clean across --all-targets, 60 unit tests + 1 doctest, zero failures. (Growth: +15 tests — marque-core 3→12 for declass parsing, marque-ism 17→23 for PageContext.) Trivial rustfmt drift on two files not blocking — fixing in a follow-up style: commit.

What landed, against the stated requirements

E005 real-rule path: parse_marking_string in marque-core/src/parser.rs now falls through to DeclassExemption::parse(block) and the new is_declass_date(block) predicate, so declass exemption codes (25X1, 50X1-HUM, …) and YYYYMMDD / YYYY free-text dates appearing inside a banner/portion marking populate attrs.declass_exemption / attrs.declassify_on. DeclassifyInBannerRule now fires on real markings instead of being a dead path. ✓

PageContext aggregation semantics (crates/marque-ism/src/page_context.rs):

Rule Requirement Implementation Verdict
expected_classification most restrictive wins .max() over Option<Classification>
expected_sci_controls union BTreeSet collect → Vec
expected_sar_identifiers union BTreeSet collect → Vec
expected_dissem_controls union BTreeSet collect → Vec
expected_rel_to intersection, NOFORN supersedes any_noforn → empty, else intersect
expected_declassify_on furthest-out date lexicographic max_by on &str ✓ (with caveat — see below)
expected_declass_exemption most specific last-seen placeholder, Phase 3 TODO ✓ (acknowledged TODO)

The (S//NF) + (DEU C//REL TO USA, DEU, NATO) example is covered by the noforn_supersedes_rel_to test: any NOFORN portion zeroes out the expected REL TO, forcing the banner to carry NOFORN instead. Matches the spec exactly.

Engine integration (crates/marque-engine/src/engine.rs): the lint loop builds PageContext in document order, accumulates every MarkingType::Portion parse, and hands banner/CAB candidates an Arc<PageContext> via RuleContext.page_context. The Arc is cached per page-snapshot and invalidated on every new portion — so successive banners in the same page share one allocation, and a banner after new portions gets a fresh snapshot. Clean.

Correctness observations

O-1 — Lexicographic date comparison is subtly correct for this encoding, worth a comment
page_context.rs:186.max_by(|a, b| a.cmp(b)) on declass date strings. This works correctly because:

  • YYYYMMDD vs YYYYMMDD → pure lex = chronological ordering
  • YYYY vs YYYYMMDD with shared prefix → shorter (YYYY) < longer (YYYYMMDD), which matches the semantic "YYYY = Jan 1 of YYYY, which is ≤ any date within that year"
  • Different YYYYs → lex order = chronological

So the ordering happens to be correct under the "YYYY means Jan 1" convention documented in is_declass_date. Worth a two-line inline comment on the max_by pinning the invariant — if someone changes the encoding to "YYYY = end of year" in the future, the lex compare silently goes wrong in the mixed case.

O-2 — PageContext never resets between pages
engine.rs:82 has the correct TODO(phase-3) noting that page-break detection is a scanner-level concern. Today the context accumulates across the entire document, so a banner on "page 2" sees portion data from "page 1". For Phase 2 single-unit markings this is correct; for paginated documents it over-restricts. Phase 3 scanner work will add page-break candidates that reset the accumulator. No action needed now.

O-3 — page_context.add_portion(parsed.attrs.clone()) clones IsmAttributes per portion
engine.rs:97IsmAttributes has Box<[T]> fields whose clone is O(n) in slice contents. For documents with hundreds of portions this adds up. A Phase 3 optimization could store Arc<IsmAttributes> in PageContext::portions so add_portion is O(1). Not a blocker — portion counts per document are typically small.

O-4 — PageContext accessor methods allocate a BTreeSet + Vec per call
expected_sci_controls() etc. recompute the union from scratch on each invocation. A Phase 3 banner-vs-portion validation rule that calls multiple expected_* helpers would re-walk the portions list each time. Memoization (e.g., compute all aggregates once after the last portion is added, stash in OnceCell) is a Phase 3 optimization, not needed today.

O-5 — expected_declass_exemption Phase 3 TODO on duration-aware ordering — already called out in the doc comment. EO 13526 §3.3(b) duration ranking (50X1-HUM > 50X1 > 25X1 etc.) needs a hand-coded comparator because the CVE-defined Ord on DeclassExemption is alphabetical on the CVE token, not semantically ordered by retention length. Correctly deferred.

Rustfmt drift

Two files have trivial single-line reformatting that cargo fmt --check catches:

  • crates/marque-core/src/parser.rs (13 insertions / some reflow)
  • crates/marque-ism/src/page_context.rs (one .into_boxed_slice() line wrap in a test fixture)

Landing in a follow-up style: commit.

Decision

APPROVE. Copilot's architecture answer is tight and matches the stated functional requirements exactly. The four observations above are Phase 3 polish / documentation, not blockers. The only code change needed on top of b8789c8 is the rustfmt drift — trivial.

Suggested small improvements (can land with the Phase 3 opener or as follow-ups):

  • O-1: two-line comment pinning the lex-compare invariant on expected_declassify_on
  • O-3: Arc<IsmAttributes> internally in PageContext::portions once portion counts grow
  • O-4: memoize expected_* aggregates in OnceCell after the last add_portion — needs a "sealed" state on PageContext
  • Phase 3 duration_aware_max for DeclassExemption (already noted in code)

Ready to merge.

Resolves 13 findings from the second-pass local review of PR #3
(3 HIGH, 5 MEDIUM, 4 LOW, 1 Observation). Zero correctness bugs
in the previous state — these are API hardening, test coverage,
and documentation fixes that were noted as "not blocking" but
worth landing before Phase 3.

HIGH
- H-1 BatchError::is_panic() / is_cancelled() forwarding from
  JoinError, plus variant-aware Display. CI callers can now
  distinguish application panics from cancellation without
  downcasting through source().
- H-2 CLAUDE.md Architectural Invariants section pins the
  contracts enforced only by convention: __engine_promote
  engine-only, FixProposal purity, hardcoded Zone/position,
  PageContext page-reset, Severity::Off non-firing.
- H-3 fix_with_threshold_rejects_infinity test pins INFINITY
  and NEG_INFINITY rejection against future refactors of the
  range+is_nan check.

MEDIUM
- M-1 build.rs assert_required helper panics at build time if
  any non-SAR CVE file produces zero entries. An empty
  CVEnumISMDissem.xml from a bad schema copy would otherwise
  silently make all dissem rules fire zero diagnostics.
- M-2 resolve_idents uses HashMap<String, String> so duplicate-
  ident collisions name both offending CVE values in the panic.
- M-3 TRIGRAPHS now emitted via BTreeSet (sorted, deduped);
  is_trigraph uses binary_search instead of O(n) linear scan.
  New trigraphs_are_sorted_and_unique test pins the invariant.
- M-4 confidence_below_default_threshold_is_excluded and
  confidence_at_default_threshold_is_included tests pin the
  >= semantics at the 0.94 / 0.95 boundary.
- M-5 zero_length_span_fix_is_filtered_before_sort test pins
  the !span.is_empty() filter that masks Phase 2 placeholder
  spans from the C-1 overlap guard.

LOW
- L-1 BatchOptions.max_concurrent_docs doc explains both roles
  it plays (ConcurrencyController semaphore AND buffer_unordered).
- L-2 merge_user_into treats empty / whitespace-only user-field
  strings from .marque.local.toml as absent, so they can't
  silently overwrite populated values from another layer.
- L-3 Severity doc rewritten to describe actual last-write-wins
  merge semantics instead of an aspirational .max() design
  that the loader never implemented.
- L-4 config_supplied_threshold_filters_proposals test uses
  Config::set_confidence_threshold(0.5) to exercise the Config
  path (all other threshold tests use the fix_with_threshold
  override).

Observation
- O-1 expected_declassify_on carries an inline comment pinning
  why lex-compare is semantically correct under the documented
  "YYYY = Jan 1" encoding, with a warning that changing the
  convention to "YYYY = Dec 31" would silently break it.

Tests: 66 unit + 1 doctest, zero failures (up from 60:
+5 engine, +1 marque-ism). Clippy clean, rustfmt clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bashandbone bashandbone merged commit 44451ab into main Apr 11, 2026
1 check passed
@bashandbone bashandbone deleted the 001-marque-mvp branch April 11, 2026 14:33
bashandbone added a commit that referenced this pull request Apr 12, 2026
Phase 3 (User Story 1, P1, MVP) lands the lint side of marque
end-to-end. Nine CAPCO rules (E001-E008, W001) emit byte-precise
diagnostics threaded through a new IsmAttributes::token_spans
sidecar. The marque check CLI is rewritten to match contracts/cli.md
verbatim, and 24 invalid + 20 valid corpus fixtures plus a 1010-line
prose precision corpus pin every rule against golden .expected.json.

Test count rises from 77 to 116 (+39), zero failures.

Architectural changes:
- Span and structure plumbing:
  marque-ism gets TokenKind/TokenSpan { kind, span, text } and
  IsmAttributes.token_spans: Box<[TokenSpan]>. The parser walks
  separator positions via match_indices and records absolute byte
  offsets for every recognized token, including TokenKind::Unknown
  for blocks the parser cannot classify. Banner-form full-word
  dissem controls (NOFORN, ORCON, IMCON) now parse via a new
  parse_dissem_full_form fallback so the CVE-only abbreviations
  (NF, OC, IMC) do not surface as Unknown.
- Scanner page breaks:
  marque-ism gets MarkingType::PageBreak. The scanner emits a
  zero-length candidate at every form-feed (\f) and at the third
  consecutive newline of a \n\n\n+ run. \n\n (normal paragraph
  break) does NOT trip the reset. The engine resets PageContext
  BEFORE attempting to parse a PageBreak candidate so a corrupt
  candidate cannot block the reset. Closes the engine.rs:80
  TODO(phase-3) marker from the PR #3 review.
- RuleContext zone/position:
  Both fields are now Option<Zone> / Option<DocumentPosition> and
  the engine passes None until a structural scanner pass can prove
  a value. The Phase-2 hardcoded Body was a silent lie to any
  future rule that read the field; Phase 3 makes the uncertainty
  type-system-enforced. Closes the engine.rs:103 TODO(phase-3)
  marker from the PR #3 review.

Phase 3 rules (marque-capco/src/rules.rs, full rewrite):
- E001 banner-abbreviation: fires on abbreviated dissem in banners.
  Span points at the literal abbreviation bytes via the parallel-
  indexed DissemControl token spans.
- E002 missing-usa-trigraph: fires when REL TO is non-empty and
  USA is missing or not first. Span points at the first trigraph.
- E003 misordered-blocks: walks token_spans, computes the CAPCO
  ordinal per block kind (Class < SCI < SAR < Dissem/RelTo), fires
  if any descending step appears. Span covers the whole marking.
  Confidence 0.6 — kept as suggestion under default 0.95 threshold.
- E004 separator-count: detects back-to-back // separator runs by
  walking adjacent Separator token spans whose end == next start.
  Replaces the Phase-2 vec![] stub. Confidence 0.99.
- E005 declassify-in-banner: fires when a banner has a declass
  exemption or date inline. Span points at the declass token.
- E006 deprecated-dissem: walks both DissemControl tokens (FOUO,
  in modern CVE) and Unknown tokens (LIMDIS, removed from CVE).
  Looks each up in find_migration; hits whose replacement is a
  known dissem control fire E006. Skips abbreviation expansions
  (NF to NOFORN) — those are owned by E001.
- E007 x-shorthand-date: walks Unknown tokens (the deprecated
  dashed forms 25X1- / 50X1- land here because the CVE has only
  the canonical 25X1 / 50X1-HUM). Migration table lookup via
  TokenSpan.text avoids the false-positive of firing on the
  canonical form.
- E008 unrecognized-token: walks Unknown tokens and skips ones
  E007 will pick up (find_migration().is_some()). FR-012: no fix
  offered.
- W001 deprecated-marking-warning: scaffold present, returns
  vec![] in Phase 3 because the seed MIGRATIONS table has no
  W001-flagged entries. Adding a flag to MigrationEntry in a
  future build.rs change starts firing diagnostics with no other
  code edits.

CLI rewrite (marque/src/main.rs + new marque/src/render.rs):
- Stdin sentinel: a path of `-` reads from stdin. No path given
  also reads from stdin. Mixed paths and `-` allowed.
- --config <PATH>: short-circuits the upward walk and uses the
  specified path's parent as the load root.
- --confidence-threshold <FLOAT>: per-call override for the fix
  confidence gate.
- --format human|json: defaults to human for TTY stdout, json
  otherwise. JSON is NDJSON conforming to contracts/diagnostic.json
  with derive(Serialize) structs (no extra fields).
- --no-color: honors NO_COLOR env var (any non-empty value) and
  TERM=dumb. Suppresses ANSI in human format.
- -q / --quiet: suppresses non-diagnostic stderr narration.
- --explain-config: dumps the merged Configuration as pretty JSON,
  exits 0. Mutually exclusive with input paths and with the `fix`
  subcommand. classifier_id_present is a boolean — the value
  itself is NEVER emitted, even when MARQUE_CLASSIFIER_ID is set.
- Exit codes: 0 clean, 1 errors (Severity::Error or Severity::Fix),
  2 warnings only, 64 EX_USAGE, 65 EX_DATAERR, 74 EX_IOERR.
  Severity::Fix maps to exit 1 because a Fix-severity diagnostic
  is still a violation that should block CI.

Corpus (24 invalid + 20 valid + 1 prose):
- 3 fixtures per rule for E001-E008 with hand-computed byte spans
  validated by running each fixture through the integration test.
- W001 fixtures deferred per the plan (the rule cannot fire on
  real corpus until a W001-flagged migration entry lands).
- 20 known-good fixtures spanning portion, banner, and CAB types.
- tests/corpus/prose/article.txt: 1010 lines of public-domain
  Federalist No. 10 prose with 22 incidental (S) / (a) / (i)
  mid-sentence parens. The lint is byte-clean against this corpus
  — the SC-003a precision gate is satisfied.

Tests added (39 new):
- crates/marque-core/src/scanner.rs::tests — 4 page-break tests
- crates/marque-core/src/parser.rs::tests — 5 token-span tests
- crates/marque-engine/src/engine.rs::tests — multi-page document
- crates/marque-capco/src/rules.rs::tests — 13 per-rule tests
  with byte-span assertions
- crates/marque-capco/tests/rules_us1.rs — corpus harness driving
  every fixture through the parser+rules with .expected.json
  match (drives parser/scanner directly to avoid a circular
  marque-engine dev-dep)
- crates/marque-engine/tests/lint_pipeline.rs — 13 happy-path /
  edge-case tests + 2 insta JSON snapshots pinning the
  contracts/diagnostic.json shape against drift
- marque/tests/cli_smoke.rs — 9 CLI smoke tests via assert_cmd
  covering stdin sentinel, --explain-config no-leak, NO_COLOR,
  exit codes, and mutual exclusion

PR #3 review TODO(phase-3) markers closed:
- engine.rs:80 — PageContext page-break reset (Tasks 1, 2)
- engine.rs:103 — RuleContext.zone/position no longer hardcoded
- marque-capco/src/rules.rs Span::new(0, 0) placeholders — gone
- E004 stub vec![] — replaced with real implementation
- page_context.rs:50 Phase-3 wiring TODO — closed

CLAUDE.md "Architectural Invariants" section updated to reflect
both Phase-3 closures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bashandbone added a commit that referenced this pull request Apr 15, 2026
…──────────────────────�[0m

     �[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m   1�[0m �[38;5;238m│�[0m �[38;5;231mfix(sar): address remaining PR review comments�[0m
�[38;5;238m   2�[0m �[38;5;238m│�[0m
�[38;5;238m   3�[0m �[38;5;238m│�[0m �[38;5;231mThree follow-ups to copilot's partial fix in cfcce23:�[0m
�[38;5;238m   4�[0m �[38;5;238m│�[0m
�[38;5;238m   5�[0m �[38;5;238m│�[0m �[38;5;231m1. Full-form SAR compartments (Copilot review #1, partially unaddressed):�[0m
�[38;5;238m   6�[0m �[38;5;238m│�[0m �[38;5;231m   Unifies abbrev and full-form program parsing via a single�[0m
�[38;5;238m   7�[0m �[38;5;238m│�[0m �[38;5;231m   `parse_sar_program(chunk, base, indicator, spans)`. Full-form�[0m
�[38;5;238m   8�[0m �[38;5;238m│�[0m �[38;5;231m   program identifier now allows uppercase ASCII letters and spaces�[0m
�[38;5;238m   9�[0m �[38;5;238m│�[0m �[38;5;231m   only (no digits, no hyphens — the first `-` is still the�[0m
�[38;5;238m  10�[0m �[38;5;238m│�[0m �[38;5;231m   program/compartment boundary). Grammar per §H.5 p100. Adds tests�[0m
�[38;5;238m  11�[0m �[38;5;238m│�[0m �[38;5;231m   for `SPECIAL ACCESS REQUIRED-BUTTER POPCORN-J12 J54` and rejects�[0m
�[38;5;238m  12�[0m �[38;5;238m│�[0m �[38;5;231m   digit-shaped nicknames like `SPECIAL ACCESS REQUIRED-123`. E030�[0m
�[38;5;238m  13�[0m �[38;5;238m│�[0m �[38;5;231m   coalescing now preserves compartments when collapsing repeated�[0m
�[38;5;238m  14�[0m �[38;5;238m│�[0m �[38;5;231m   full-form blocks.�[0m
�[38;5;238m  15�[0m �[38;5;238m│�[0m
�[38;5;238m  16�[0m �[38;5;238m│�[0m �[38;5;231m2. E030 fix span (Copilot review #2):�[0m
�[38;5;238m  17�[0m �[38;5;238m│�[0m �[38;5;231m   Replace `tok.span.start.saturating_sub(2)` with a backward walk�[0m
�[38;5;238m  18�[0m �[38;5;238m│�[0m �[38;5;231m   over `attrs.token_spans` to the nearest preceding Separator�[0m
�[38;5;238m  19�[0m �[38;5;238m│�[0m �[38;5;231m   token. The parser trims per-block whitespace, so the Unknown�[0m
�[38;5;238m  20�[0m �[38;5;238m│�[0m �[38;5;231m   token's span is not flush against the `//`; anchoring on the�[0m
�[38;5;238m  21�[0m �[38;5;238m│�[0m �[38;5;231m   actual Separator span keeps the fix correct when source has�[0m
�[38;5;238m  22�[0m �[38;5;238m│�[0m �[38;5;231m   whitespace after the separator (`SECRET// SAR-CD`). `original`�[0m
�[38;5;238m  23�[0m �[38;5;238m│�[0m �[38;5;231m   is reconstructed from the separator text plus the gap so splicing�[0m
�[38;5;238m  24�[0m �[38;5;238m│�[0m �[38;5;231m   remains byte-accurate.�[0m
�[38;5;238m  25�[0m �[38;5;238m│�[0m
�[38;5;238m  26�[0m �[38;5;238m│�[0m �[38;5;231m3. Dead code in parser (Copilot review #3):�[0m
�[38;5;238m  27�[0m �[38;5;238m│�[0m �[38;5;231m   Remove `let _ = sar_captured;` — the variable is already read in�[0m
�[38;5;238m  28�[0m �[38;5;238m│�[0m �[38;5;231m   the capture-gating branch; the silencer was stale.�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
bashandbone added a commit that referenced this pull request Apr 15, 2026
…──────────────────────�[0m

     �[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m   1�[0m �[38;5;238m│�[0m �[38;5;231mfeat(ism): add SCI structural data model (P1 of #3)�[0m
�[38;5;238m   2�[0m �[38;5;238m│�[0m
�[38;5;238m   3�[0m �[38;5;238m│�[0m �[38;5;231mIntroduce SciMarking / SciControlSystem / SciCompartment and generate�[0m
�[38;5;238m   4�[0m �[38;5;238m│�[0m �[38;5;231mSciControlBare from the live CVE at build time. Adds sci_markings�[0m
�[38;5;238m   5�[0m �[38;5;238m│�[0m �[38;5;231malongside the existing sci_controls projection for back-compat.�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
bashandbone added a commit that referenced this pull request Apr 15, 2026
…──────────────────────�[0m

     �[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m   1�[0m �[38;5;238m│�[0m �[38;5;231mfeat(core): SCI structural subparser (P2 of #3)�[0m
�[38;5;238m   2�[0m �[38;5;238m│�[0m
�[38;5;238m   3�[0m �[38;5;238m│�[0m �[38;5;231mAdds parse_sci_block dispatched before the existing CVE exact-match�[0m
�[38;5;238m   4�[0m �[38;5;238m│�[0m �[38;5;231mpath. Produces SciMarking entries for compartments and sub-compartments�[0m
�[38;5;238m   5�[0m �[38;5;238m│�[0m �[38;5;231mper CAPCO-2016 §A.6, while preserving the sci_controls enum projection�[0m
�[38;5;238m   6�[0m �[38;5;238m│�[0m �[38;5;231mfor back-compat with E010/E011 and existing token_spans consumers.�[0m
�[38;5;238m   7�[0m �[38;5;238m│�[0m
�[38;5;238m   8�[0m �[38;5;238m│�[0m �[38;5;231mThe dispatch is gated on `contains('-') || contains('/') ||�[0m
�[38;5;238m   9�[0m �[38;5;238m│�[0m �[38;5;231mis_bare_cve_value` so plain two-letter tokens (NF, RD, etc.) continue�[0m
�[38;5;238m  10�[0m �[38;5;238m│�[0m �[38;5;231mto fall through to the dissem/non-IC/SAR/AEA chain. Custom controls�[0m
�[38;5;238m  11�[0m �[38;5;238m│�[0m �[38;5;231madditionally reject any text that parses as another known category so�[0m
�[38;5;238m  12�[0m �[38;5;238m│�[0m �[38;5;231mmixed-category slash blocks like `SI/NF` still surface as Unknown for�[0m
�[38;5;238m  13�[0m �[38;5;238m│�[0m �[38;5;231mE004.�[0m
�[38;5;238m  14�[0m �[38;5;238m│�[0m
�[38;5;238m  15�[0m �[38;5;238m│�[0m �[38;5;231mcanonical_enum is populated (a) when the bare control itself matches a�[0m
�[38;5;238m  16�[0m �[38;5;238m│�[0m �[38;5;231mCVE value — preserves existing behaviour for `SI`, `TK`, `HCS` — and�[0m
�[38;5;238m  17�[0m �[38;5;238m│�[0m �[38;5;231m(b) when `{ctrl}-{first_comp}` matches a CVE value AND the first�[0m
�[38;5;238m  18�[0m �[38;5;238m│�[0m �[38;5;231mcompartment has no sub-compartments, per data-model §canonical_enum.�[0m
�[38;5;238m  19�[0m �[38;5;238m│�[0m
�[38;5;238m  20�[0m �[38;5;238m│�[0m �[38;5;231mAlso re-exports `marque_ism::is_bare_cve_value` from the generated CVE�[0m
�[38;5;238m  21�[0m �[38;5;238m│�[0m �[38;5;231mmodule so the subparser can gate on the live vocabulary rather than a�[0m
�[38;5;238m  22�[0m �[38;5;238m│�[0m �[38;5;231mhardcoded list.�[0m
�[38;5;238m  23�[0m �[38;5;238m│�[0m
�[38;5;238m  24�[0m �[38;5;238m│�[0m �[38;5;231mTests: 12 new parser tests covering bare, compound, multi-system,�[0m
�[38;5;238m  25�[0m �[38;5;238m│�[0m �[38;5;231msub-compartments, the §A.6 p16 canonical example, custom controls,�[0m
�[38;5;238m  26�[0m �[38;5;238m│�[0m �[38;5;231mrejection cases, and the SI/NF fall-through invariant. All 75 marque-�[0m
�[38;5;238m  27�[0m �[38;5;238m│�[0m �[38;5;231mcore tests + 94 marque-capco tests pass. Workspace test summary:�[0m
�[38;5;238m  28�[0m �[38;5;238m│�[0m �[38;5;231m411 passed, 1 failed (pre-existing SC-006 README issue in�[0m
�[38;5;238m  29�[0m �[38;5;238m│�[0m �[38;5;231mmarque-config, unrelated).�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
bashandbone added a commit that referenced this pull request Apr 15, 2026
…──────────────────────�[0m

     �[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m   1�[0m �[38;5;238m│�[0m �[38;5;231mfeat(ism): SCI banner roll-up via PageContext (P4 of #3)�[0m
�[38;5;238m   2�[0m �[38;5;238m│�[0m
�[38;5;238m   3�[0m �[38;5;238m│�[0m �[38;5;231mPageContext::expected_sci_markings() unions compartments and�[0m
�[38;5;238m   4�[0m �[38;5;238m│�[0m �[38;5;231msub-compartments across portions, sorted per CAPCO-2016 §A.6 p15�[0m
�[38;5;238m   5�[0m �[38;5;238m│�[0m �[38;5;231m(numeric first, alpha after). render_expected_banner() uses the�[0m
�[38;5;238m   6�[0m �[38;5;238m│�[0m �[38;5;231mstructural form for SCI when markings are present.�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
bashandbone added a commit that referenced this pull request Apr 15, 2026
…──────────────────────�[0m

     �[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m   1�[0m �[38;5;238m│�[0m �[38;5;231mfeat(capco): SCI compartment-aware rules (P3 of #3)�[0m
�[38;5;238m   2�[0m �[38;5;238m│�[0m
�[38;5;238m   3�[0m �[38;5;238m│�[0m �[38;5;231mAdds E032 sci-system-order, E033 sci-compartment-order,�[0m
�[38;5;238m   4�[0m �[38;5;238m│�[0m �[38;5;231mE034 sci-custom-control-info, E035 sci-banner-rollup.�[0m
�[38;5;238m   5�[0m �[38;5;238m│�[0m �[38;5;231mAudits E010 bare-hcs to detect structural HCS-with-no-compartments.�[0m
�[38;5;238m   6�[0m �[38;5;238m│�[0m �[38;5;231mExtends E008 skip filter so structurally-formed SCI blocks are�[0m
�[38;5;238m   7�[0m �[38;5;238m│�[0m �[38;5;231mnot flagged as Unknown.�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
bashandbone added a commit that referenced this pull request Apr 15, 2026
CLAUDE.md SCI subsection + rule count bump (29 -> 33).
README.md feature mention. marque-ism migration note.
bashandbone added a commit that referenced this pull request Apr 15, 2026
…──────────────────────�[0m

     �[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m   1�[0m �[38;5;238m│�[0m �[38;5;231mtest(corpus): SCI compartment fixtures (P5 of #3)�[0m
�[38;5;238m   2�[0m �[38;5;238m│�[0m
�[38;5;238m   3�[0m �[38;5;238m│�[0m �[38;5;231mCanonical §A.6 p16 example, bare single-sub, multi-sub,�[0m
�[38;5;238m   4�[0m �[38;5;238m│�[0m �[38;5;231mcustom-control, system-order violation, compartment-order�[0m
�[38;5;238m   5�[0m �[38;5;238m│�[0m �[38;5;231mviolation, and sub-compartment-order violation. Verifies�[0m
�[38;5;238m   6�[0m �[38;5;238m│�[0m �[38;5;231mstructural parser + E032/E033.�[0m
�[38;5;238m   7�[0m �[38;5;238m│�[0m
�[38;5;238m   8�[0m �[38;5;238m│�[0m �[38;5;231mValid fixtures (valid/):�[0m
�[38;5;238m   9�[0m �[38;5;238m│�[0m �[38;5;231m  sci_bare_single_sub        SECRET//SI-G ABCD//NOFORN�[0m
�[38;5;238m  10�[0m �[38;5;238m│�[0m �[38;5;231m  sci_multi_subs             SECRET//HCS-P INTEL OPS//NOFORN�[0m
�[38;5;238m  11�[0m �[38;5;238m│�[0m
�[38;5;238m  12�[0m �[38;5;238m│�[0m �[38;5;231mInvalid fixtures (invalid/):�[0m
�[38;5;238m  13�[0m �[38;5;238m│�[0m �[38;5;231m  sci_canonical_subcompartments  §A.6 p16 canonical (E034)�[0m
�[38;5;238m  14�[0m �[38;5;238m│�[0m �[38;5;231m  sci_custom_control_info        99/SI custom control   (E034)�[0m
�[38;5;238m  15�[0m �[38;5;238m│�[0m �[38;5;231m  sci_system_order               SI/123                 (E032,E034)�[0m
�[38;5;238m  16�[0m �[38;5;238m│�[0m �[38;5;231m  sci_compartment_order          SI-NK-G                (E033)�[0m
�[38;5;238m  17�[0m �[38;5;238m│�[0m �[38;5;231m  sci_subcompartment_order       SI-G DEFG ABCD         (E033)�[0m
�[38;5;238m  18�[0m �[38;5;238m│�[0m
�[38;5;238m  19�[0m �[38;5;238m│�[0m �[38;5;231mHarness wiring:�[0m
�[38;5;238m  20�[0m �[38;5;238m│�[0m �[38;5;231m- marque-engine corpus_accuracy skips E034-expected diagnostics:�[0m
�[38;5;238m  21�[0m �[38;5;238m│�[0m �[38;5;231m  E034 ships Severity::Off so the engine loop correctly does not�[0m
�[38;5;238m  22�[0m �[38;5;238m│�[0m �[38;5;231m  fire it; rules_us1 exercises E034 by bypassing severity gating.�[0m
�[38;5;238m  23�[0m �[38;5;238m│�[0m �[38;5;231m- marque corpus_provenance treats structural SCI tokens�[0m
�[38;5;238m  24�[0m �[38;5;238m│�[0m �[38;5;231m  (`SI-G ABCD`, `HCS-P INTEL OPS`) as in-vocabulary: they're�[0m
�[38;5;238m  25�[0m �[38;5;238m│�[0m �[38;5;231m  parsed structurally rather than CVE-matched.�[0m
�[38;5;238m  26�[0m �[38;5;238m│�[0m
�[38;5;238m  27�[0m �[38;5;238m│�[0m �[38;5;231mDeviations / follow-up:�[0m
�[38;5;238m  28�[0m �[38;5;238m│�[0m �[38;5;231m- E035 banner-rollup fixture intentionally omitted. P4 delivered�[0m
�[38;5;238m  29�[0m �[38;5;238m│�[0m �[38;5;231m  `PageContext::expected_sci_markings() -> Box<[SciMarking]>` but�[0m
�[38;5;238m  30�[0m �[38;5;238m│�[0m �[38;5;231m  the rule's defensive fallback trait in marque-capco still�[0m
�[38;5;238m  31�[0m �[38;5;238m│�[0m �[38;5;231m  declares `fn expected_sci_markings(&self) -> Vec<SciMarking>`,�[0m
�[38;5;238m  32�[0m �[38;5;238m│�[0m �[38;5;231m  so the trait default (empty Vec) wins via method resolution.�[0m
�[38;5;238m  33�[0m �[38;5;238m│�[0m �[38;5;231m  Result: E035 is inert in production. The author intended the�[0m
�[38;5;238m  34�[0m �[38;5;238m│�[0m �[38;5;231m  method-resolution probe pattern to pick up the inherent method�[0m
�[38;5;238m  35�[0m �[38;5;238m│�[0m �[38;5;231m  when P4 landed, but the return-type mismatch blocks that.�[0m
�[38;5;238m  36�[0m �[38;5;238m│�[0m �[38;5;231m  Fix belongs in a separate commit (align the fallback signature�[0m
�[38;5;238m  37�[0m �[38;5;238m│�[0m �[38;5;231m  to `Box<[SciMarking]>`); a corresponding invalid fixture should�[0m
�[38;5;238m  38�[0m �[38;5;238m│�[0m �[38;5;231m  be added alongside that fix.�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
bashandbone added a commit that referenced this pull request Apr 25, 2026
Six comments addressed:

1. `resolve_request_deadline` doc comment said it returned
   `Ok(Some)/Ok(None)`, but the function returns
   `Result<Duration, StatusCode>`. Rewritten to match the actual
   API contract and document each branch.

2. Empty header value treated as parse error (400). Spec/contract
   says empty should fall back to default. Now `s.trim().is_empty()`
   short-circuits to the per-endpoint default — same convention as
   absent. Catches the "template substitution emitted nothing"
   case where a caller's tooling produces `X-Marque-Deadline: `
   without intending to set a budget.

3+4. Both `lint_handler` and `fix_handler` validated the deadline
   header AFTER `serde_json::from_slice`, so a request with a
   malformed body AND a bad header would surface as 422 instead
   of the intended 400. Reordered: corpus-override checks (header
   + query) → deadline header validation → body deserialization
   → body-field corpus-override check → engine call. The
   `Instant::now()` stamp stays just before the engine call so
   JSON parse time isn't billed against the caller's budget.

5. `deadline_header_above_cap_returns_400` claimed to cover the
   default cap (60 000 ms) but built its header value from
   `MAX_DEADLINE_CAP_MS` (600 000 ms — the hard ceiling, a
   different invariant). Switched to `DEFAULT_DEADLINE_CAP_MS + 1`
   so the boundary is pinned correctly. Comment updated to call
   out the cap vs. ceiling distinction.

6. Added three tests: `deadline_header_empty_uses_endpoint_default`,
   `deadline_header_whitespace_uses_endpoint_default`, and
   `deadline_header_validated_before_body_deserialization` (the
   ordering-invariant pin for #3/#4 — bad-header + bad-body must
   produce 400, not 422).

Tests: 55/55 server pass (was 52, +3 new). Workspace 1242/1242
nextest pass; cargo fmt/clippy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bashandbone added a commit that referenced this pull request May 15, 2026
Three follow-up Copilot findings on the previous citation fixup:

1. `crates/capco/src/scheme.rs:4202` — the `FDR_DOMINATORS` header
   comment still called §B.3 Table 2 the authoritative enumeration
   AFTER my new maintenance note correctly cited §B.3.a p19.
   Updated the header to cite §B.3.a p19 (canonical) + §B.3 Table 2
   pp 21-22 (derivative summary table) so the file no longer
   contradicts itself.

2. `crates/scheme/src/vocabulary.rs:343` — trait-level doc bullet
   #3 described `FDR_DOMINATORS` as the static slice
   `is_fdr_dominator` uses. After the maintenance contract
   reword, this is misleading. Rewrote bullet #3 to make the
   distinction explicit: `is_fdr_dissem` answers FD&R-set
   membership (admits RELIDO); `is_fdr_dominator` answers
   "dominator over RELIDO" for the conflict-family role
   (excludes RELIDO). Added the same "do not delegate" guidance
   the in-tree maintenance contract carries.

3. `crates/capco/src/vocabulary.rs:1024` — overstated
   `AnyInCategory` admission semantics. The override receives a
   single `TokenId` and routes through `capco_token_category` to
   hit category arms; adding a new `AnyInCategory(CAT_NEW)` to
   `FDR_DOMINATORS` only admits tokens that
   `capco_token_category` routes to `CAT_NEW`. Reworded with
   the explicit two-path dual-update rule (Token entries auto;
   AnyInCategory entries require a routing arm).

All workspace tests pass; clippy clean. No behavioral changes.
bashandbone added a commit that referenced this pull request May 18, 2026
…site (Copilot R2 suppressed #2 + inverse case)

PR 4b-D.2 commit 13. Items 1, 2, 3 from Copilot R2. The D22 fix
landed in PR 4b-D.2 commit 3 routed NOFORN FactAdd through
DissemSet's token-axis supersession overlay. Copilot R2 surfaced
two completion gaps the original commit missed:

(a) Country-list axes stayed populated. `apply_fact_add` evicted
    `Rel` / `Relido` / `Displayonly` / `Eyes` TOKENS from
    `dissem_us` but did NOT clear the parallel
    `attrs.rel_to: Box<[CountryCode]>` /
    `attrs.display_only_to: Box<[CountryCode]>` country-list
    fields. The closure / PageRewrite paths cleared them via the
    `capco/noforn-clears-rel-to` / `capco/noforn-clears-display-only-to`
    rewrites; but direct `apply_intent` callers (E021 AEA → NOFORN,
    E038 NODIS/EXDIS → NOFORN) bypass `scheme.project`'s
    PageRewrite loop and got a marking with NOFORN + populated
    country lists — invalid per §H.8 p145.

(b) Inverse case: FactAdd of `RELIDO` / `DISPLAY ONLY` / `EYES`
    onto a marking with `dissem_us = [Nf]` was APPENDING the
    dominated token. Result: `dissem_us = [Nf, Relido]` — same
    §H.8 p145 violation as (a), opposite direction. The existing
    double-NOFORN-insertion guard didn't fire because target ≠ Nf;
    the supersession check needed an explicit inverse-case branch.

Fix in `crates/capco/src/scheme/actions/intent.rs::apply_fact_add`
CAT_DISSEM branch:

1. NEW: after `with_noforn_injected` rebuilds `dissem_us`,
   unconditionally clear `attrs.rel_to` and `attrs.display_only_to`
   if they're non-empty. Brings the full §H.8 p145 supersession
   to the injection site, not just the token-axis half.

2. NEW: at the start of the dissem branch, if `target` is one of
   `Rel` / `Relido` / `Displayonly` / `Eyes` AND `attrs.dissem_us`
   already contains `Nf`, return `IntentInapplicable` — the
   caller's FactAdd is dominated by existing state. Matches the
   existing idempotency guard for double-NOFORN insertion (no
   mutation, audit log doesn't see an applied no-op).

The PageRewrite layer (`capco/noforn-clears-*`) remains as
defense-in-depth. With Item 1 in place, the PageRewrites are
defensive-redundant on the production path; both layers converge
to the same correct §H.8 p145 output.

Authority: §H.8 p145 ("NOFORN ... Cannot be used with REL TO,
RELIDO, EYES ONLY, or DISPLAY ONLY") + §D.2 Table 3 rows 1-2
+ §H.8 p157 (EYES ONLY: NSA-only, retains `DissemControl::Eyes`
through lint). Re-verified 2026-05-18 against
`crates/capco/docs/CAPCO-2016.md`.

## Item 2 — fix the broken display_only_to integration test

Copilot R2 #2 was right: the previous
`noforn_clears_display_only_to_via_cross_portion_join` test didn't
exercise the new `capco/noforn-clears-display-only-to` rewrite.
`PageContext::expected_display_only` short-circuits to empty
whenever ANY portion has NOFORN (page_context.rs:881-896); with
portion 1 NOFORN-bearing, `out.display_only_to` was empty at join
time BEFORE the rewrite ran. The test passed regardless of whether
the rewrite existed.

Renamed to `noforn_clears_display_only_to_via_ucni_promote` and
pivoted the fixture to:
- Portion 1: classified DOD UCNI + DISPLAY ONLY USA, GBR (the
  UCNI carries the §H.6 p116 strip-and-promote trigger; the
  DISPLAY ONLY ensures portion 1 passes the row-19 all-or-nothing
  gate in `expected_display_only`).
- Portion 2: classified DISPLAY ONLY USA (also contributes
  display-permission so the gate doesn't short-circuit).

Verification by toggle:
- WITH `apply_fact_add` clearing AND the PageRewrite: test passes.
- WITHOUT `apply_fact_add` clearing (Item 1 disabled), WITH the
  PageRewrite: test still passes — PageRewrite catches the
  violation.
- WITHOUT the PageRewrite (Item 1 enabled, rewrite deleted): test
  passes — `apply_fact_add` catches it.
- Hypothetical: WITHOUT BOTH: test fails. Both layers are
  load-bearing for their respective injection paths.

The integration test now asserts the §H.8 p145 banner invariant
holds through the full `scheme.project` pipeline regardless of
which layer cleared. Defense-in-depth coverage documented inline.

## Item 3 — replace misleading example in noforn_clears.rs

Copilot R2 #3 was right: `CLOSURE_NOFORN_SAR` cannot be the
load-bearing trigger for the `noforn-clears-display-only-to`
rewrite — DISPLAY ONLY is in `FDR_DOMINATORS` (closure.rs:95-109)
which SUPPRESSES all 7 NOFORN-implies closure rules on any portion
that carries DISPLAY ONLY. The realistic trigger is a Pattern-A/C
PageRewrite (UCNI promote, NODIS-implies-NF) injecting NOFORN
post-join.

Rewrote both occurrences (line 65 + line 191 in noforn_clears.rs)
to cite the Pattern-C UCNI scenario and explain why closure rules
can't be the load-bearing path on this axis. Also documented the
post-R2-#1 defense-in-depth framing: `apply_fact_add` is now
self-sufficient; the rewrite covers future-refactor cases that
bypass it.

## D22 amendment

`decisions.md` D22 entry expanded to record the post-R2 completed
scope: (a) token-axis eviction (PR 4b-D.2 commit 3); (b)
country-axis clearing (Copilot R2 #1 commit 13); (c) inverse-case
rejection (Copilot R2 #1 commit 13). All three axes now maintained
at the apply_fact_add injection site, with the PageRewrite layer
as defense-in-depth.

## Adjacency-walk findings

Searched `FactAdd.*TOK_NOFORN|FactRef::Cve(TOK_NOFORN` across
crates/ — production callers:
- `crates/capco/src/rules_declarative.rs:1096` (E021 AEA add-NF)
- `crates/capco/src/rules_declarative.rs:1504` (E038-style add-NF)
- `crates/capco/src/scheme/rewrites/pattern_c.rs:219,269` (UCNI promotes)
- `crates/capco/src/scheme/rewrites/pattern_a.rs:146,210,289,369`
  (nodis/exdis/sbu-nf/les-nf-implies-noforn)

All 8 production NOFORN FactAdd sites now benefit from
self-sufficient apply_fact_add clearing. The two E021/E038-style
rules in rules_declarative.rs are the direct-caller paths Item 1's
country-axis clearing specifically targets; the 6 Pattern-A/C
rewrite paths go through `scheme.project`'s PageRewrite loop
already, but Item 1 makes their apply_fact_add invocation
self-sufficient too (defense-in-depth at the lower layer).

Searched `FactAdd.*TOK_RELIDO|FactAdd.*TOK_DISPLAY_ONLY|FactAdd.*TOK_EYES`
— ZERO production sites currently FactAdd dominated tokens
directly. The inverse-case rejection is correctness-by-construction
for future rule emissions.

## Verification gates

  cargo +stable fmt --all --check       : clean
  cargo +stable build --workspace       : clean
  cargo +stable clippy --workspace ...  : clean
  cargo +stable test --workspace        : 0 failures

Specific test runs:
- `category_action_intent::apply_fact_add_*` (6 new tests): all pass
- `closure_hotpath::noforn_clears_display_only_to_*` (2 tests,
  rewrote one + kept one): all pass

Tracking issue #528 (per-axis lattice Eq audit) remains the
follow-up for the systematic `DissemSet`/`JointSet`/`SupersessionSet`
review per PR #456's observational-state framing.
bashandbone added a commit that referenced this pull request May 18, 2026
…dMarking) (#535)

fix #1 — the doc-comment introduced in 2aad434 made an
fix #4 — the test comment at projected::tests::
fixes #2, #3, and #6 — three independent prose-only updates
Fix #2 — S007 top-level doc-block (`BareNatoRequiresRelToRule`),
Fix #3 — `CapcoRuleSet::new()` registration comment. The
Fix #6 — S005 retention annotation. The retention note on
fix #5 — the migration trip-wire comments in
bashandbone added a commit that referenced this pull request May 18, 2026
…, #524 Phase 3)

Addresses Copilot's HIGH finding on PR #544: the non-FD&R suppressors
in `RELIDO_US_CLASS_SUPPRESSORS` (CAT_DISSEM, CAT_NON_IC_DISSEM,
CAT_AEA, CAT_SCI, CAT_SAR, CAT_NON_US_CLASSIFICATION, plus
TOK_US_UNCLASSIFIED) violated the `MarkingScheme::closure`
monotonicity contract (`crates/scheme/src/scheme.rs:476-483`).
Concrete violation:

  - `x = (S)`: rule fires → `Cl(x).dissem_us = {RELIDO}`
  - `y = (S, FOUO)`: CAT_DISSEM matches → rule suppressed →
    `Cl(y).dissem_us = {FOUO}` (FOUO ⊄ supersession-related to RELIDO)
  - `{RELIDO} ⊄ {FOUO}` ⇒ `Cl(x) ⊑ Cl(y)` violated.

## Redesign

- **New sentinel `TOK_US_COLLATERAL_CLASSIFIED` (TokenId 157,
  replacing the dropped `TOK_US_UNCLASSIFIED`)**: fires on US
  classification at or above Restricted level
  (`attrs.us_classification().is_some_and(|l| l != Unclassified)`).
  Upward-closed in the lattice order ⇒ adding facts to a
  collateral-classified marking can't make this predicate stop
  firing. Replaces the `TOK_US_CLASSIFIED` trigger; gates the §H.8
  p154 Unclassified carve-out at the trigger level instead of via
  a suppressor.

- **`RELIDO_US_CLASS_SUPPRESSORS` reduced to monotone-only entries**:
  FD&R dominators (NOFORN / RELIDO / DISPLAY_ONLY / CAT_REL_TO /
  EYES — §D.2 Table 3 pp.28-30 + §H.8 p145) plus the 6
  per-compartment SCI sentinels (TOK_SI_G / TOK_HCS_O /
  TOK_HCS_P_SUB / TOK_TK_BLFH / TOK_TK_IDIT / TOK_TK_KAND). Each
  entry satisfies the §4.7.3 case-2 monotonicity property either
  directly (FD&R dominators supersede RELIDO via the precedence
  overlay) or by closure composition (SCI compartments add NOFORN
  via per-marking row → CAVEATED → §H.8 p145 supersession, so the
  composite Cl(y) contains NOFORN ⊐ RELIDO).

- **"No other dissem" via composition**, not anti-monotone
  suppressors: any caveat marker triggers `CLOSURE_NOFORN_CAVEATED`
  (Trio 1) which injects NOFORN; NOFORN then supersedes the RELIDO
  cone via `with_noforn_injected`. FOUO (which is NOT in CAVEATED's
  trigger list) is handled by the §H.8 p134 `classification-evicts-fouo`
  PageRewrite — FOUO is stripped in classified context, so the
  closure's bare-RELIDO output is the correct final state per CAPCO
  after rewrites compose.

## Citation corrections (Copilot #2, #3)

- `wasm/tests/native_parity.rs::compute_banner_single_secret_portion`
  comment: added §B.3 Table 2 p21 alongside §H.8 p154 (Copilot #2).
- `lattice_vs_scheme_parity.rs` hoisted-rationale source 2: now
  cites §B.3 Table 2 p21 as primary authority with §H.8 p154 as
  grammar reference (Copilot #3).

## Suppressed comment (Copilot #4)

The stale `CLOSURE_RELIDO_FOUO` reference in
`us_class_suppression_fixture`'s CAT_DISSEM arm — flagged as a
low-confidence suppressed Copilot comment — was removed alongside
the CAT_DISSEM suppressor itself in this redesign.

## New monotonicity sanity pins

Two new tests in `phase3_closure_pin` cover the specific scenarios
Copilot flagged:

- `us_class_monotone_under_added_fouo` — `x = (S) ⊑ y = (S, FOUO)`;
  asserts Cl(x) and Cl(y) both contain RELIDO (subset monotonicity).
- `us_class_monotone_under_added_si_g` — `x = (S) ⊑ y = (S, SI-G)`;
  asserts Cl(y) contains NOFORN (via composition) and NOT RELIDO
  (NOFORN supersedes RELIDO via §H.8 p145; lattice-order monotonicity).

The marque-scheme `proptest_closure` harness covers the general
synthetic-scheme monotonicity property (passes for this redesign).

## Test fixture updates

- `us_class_suppression_fixture` reduced to the 11-entry suppressor
  matrix (5 FD&R dominators + 6 SCI per-compartment). All non-FD&R
  category arms (CAT_DISSEM, CAT_NON_IC_DISSEM, CAT_AEA, CAT_SCI,
  CAT_SAR, CAT_NON_US_CLASSIFICATION) + TOK_US_UNCLASSIFIED arm
  removed.
- Source-of-truth pin's documented carve-out list reduced to just
  TOK_RELIDO (the cone-itself case); CAT_NON_US_CLASSIFICATION
  carve-out gone (no longer in slice).
- Conflict-variant pin doc-comment updated to reference
  TOK_US_COLLATERAL_CLASSIFIED, not TOK_US_CLASSIFIED.
- Module-level docstring updated to remove TOK_US_UNCLASSIFIED
  references and add §D.2 Table 3 pp.28-30 citation.
bashandbone added a commit that referenced this pull request May 19, 2026
…s + baseline correction)

Copilot R1 ran on commit 32c33bb (the original R1 fix-ups) and
surfaced 5 substantive findings, plus on verification a 6th issue
emerged — the original R1 commit's "pre-4b baseline = 15"
correction was itself wrong by walked-adjacency-omission.

All 6 issues resolved. Ground truth confirmed by walking
`git show 5fd5a33:crates/capco/src/scheme/rewrites/*.rs`
(the first commit that split rewrites out of monolithic scheme.rs)
+ counting PageRewrite-row literals in each file at HEAD.

1. **Pin test (Constraint::Custom dedup masking, Copilot R1 #1)** —
   The test at `post_pr_4b_declares_exact_39_custom_constraints`
   collected into a BTreeSet before counting; a duplicate
   `Constraint::Custom("capco/foo", ...)` row would dedupe in the
   set and the size-only assertion would still pass. Fix: collect
   into a Vec first, then compare `raw_count == set.len()` —
   triple-pin (raw count + set size + raw_count_equals_set_size).
   The equality assertion is the load-bearing dedup check.

2. **CI hollow-coverage (Copilot R1 #2)** — `corpus_parity.rs` is
   currently `#![cfg(any())]` (disabled by PR 3c.B Commit 10
   pending FixProposal-shape rewrite). The `pr-4b-corpus-regression`
   job mirroring PR 3b's T029 inherited the same dormancy hole.
   Fix: add three explicit `cargo test --test` invocations for
   `post_4b_lattice_inventory_pin`, `lattice_static_assertions`,
   and `post_3b_registration_pin`. With these four active pins
   plus the still-ACTIVE `corpus_accuracy` and `corpus_provenance`
   suites, the job has real 4b-specific coverage even while
   corpus_parity stays dormant. Inline comments clarify which
   suites are ACTIVE vs dormant so a future reader who sees the
   `corpus_parity` invocation passing doesn't conclude it gives
   real coverage.

3. **Attestation `DeclassExemptionAccumulator` citation
   (Copilot R1 #3)** — Walked-adjacency miss from the R1 fix-up
   on `DeclassifyOnLattice`. Same root cause: §H.6 grounds the
   AEA *exception*, not the general declassification-exemption
   hierarchy rule. Fix: §H.6 → §E.3 pp32-33 (Multiple Sources)
   + §E.1 p31 (exemption catalog 25X#/50X#/75X#), matching the
   doc-comment §-authority block at the type definition.

4. **Attestation `Pre-4b baseline` cell (Copilot R1 #4)** —
   The table at "Per-axis net-delta math" had "~14 (pre-Pattern-B/C)"
   with the tilde flagging an unverified approximate. Fix:
   exact baseline = 14 (4 pattern_a + 2 noforn_clears + 8
   transmutation_stubs). The 3rd noforn_clears row
   (`noforn-clears-display-only-to`) was added by PR 4b-D.2
   (#527), not pre-4b.

5. **Pin doc-comment derivation chain (Copilot R1 #5)** —
   The header doc-comment said "Pattern C 8 rows" in the 4b-C
   delta, but 4b-C landed only 7 Pattern-C rows; the 8th
   (`sbu-nf-evicted-by-classified`) was added later by #541 in
   the 4b-F window. Fix: change "Pattern C 8 rows" → "Pattern C
   7 rows" in the 4b-C section; add the #541 +1 entry separately
   in the 4b-F section; document the 4b-D.2 +1 PageRewrite
   contribution that the original doc-comment also omitted.

6. **R1 baseline correction (walked-adjacency from #4)** —
   The original R1 fix-up commit (32c33bb) changed
   `CLAUDE.md:283` from "14 → 27" to "15 → 27" based on the
   architect plan's "15 = 4 pattern_a + 3 noforn_clears + 8
   transmutation_stubs" claim. The architect's count of
   3 noforn_clears rows was off-by-one: pre-4b had only 2
   (`noforn-clears-rel-to` + `noforn-clears-fdr-family`); the
   3rd row (`noforn-clears-display-only-to`) was added by
   4b-D.2 (#527). Walked the git history at `5fd5a339` (Stage
   2 PR A sub-split of `scheme/rewrites/`) and counted the
   literal rows: pattern_a 4 / noforn_clears 2 /
   transmutation_stubs 8 = 14 pre-4b. The original 4b-C
   landing CLAUDE.md entry's "catalog row count 14 → 23" is
   ground truth; my R1 fix-up over-corrected to 15 → 27 and
   needs the inverse correction back to 14 → 27 with the
   full per-sub-PR breakdown:

       Pre-4b:    14  (4 pattern_a + 2 noforn_clears + 8 transmutation_stubs)
       4b-C +9:   23  (Pattern-B 2 + Pattern-C 7)
       4b-D.2 +1: 24  (noforn-clears-display-only-to per §H.8 p145)
       #541 +1:   25  (sbu-nf-evicted-by-classified Pattern-C row 2b per §H.9 p178)
       #552 +1:   26  (sbu-nf-supersedes-sbu per §H.9 p178)
       #555 +1:   27  (les-nf-supersedes-les per §H.9 p185)

   13 new rows over the 4b umbrella + post-4b-F window.
   14 + 13 = 27 ✓ (matches the test's hardcoded expected count).

Verification gate (all GREEN):
- `cargo +stable test -p marque-capco --test post_4b_lattice_inventory_pin` — 3/3 PASS (including new raw_count_equals_set_size assertion)
- `cargo +stable test -p marque-capco --test lattice_static_assertions` — 0/0 (compile-time only, file builds clean)
- `cargo +stable test -p marque-capco --test post_3b_registration_pin` — 1/1 PASS (38 unchanged)
- `cargo +stable clippy --workspace --all-targets -- -D warnings` — 0 warnings
- `cargo +stable fmt --check` — clean
- `cargo run --manifest-path tools/citation-lint/Cargo.toml --release -- .` — 0 defects

Per project memory `feedback_audit_predicates_against_source`
(Constitution VIII propagation discipline) and
`feedback_double_check_the_plan` (when the architect plan and
ground truth conflict, ground truth wins; the architect plan was
a preflight draft, the merged code is authoritative). Per project
memory `feedback_walked_adjacencies` (every fix walks the related
code paths) — the DeclassExemptionAccumulator fix #3 is a textbook
walked-adjacency from the R1 DeclassifyOnLattice fix that I missed
the first time; the baseline correction #6 is a walked-adjacency
from the architect plan's count that I trusted instead of verifying.
bashandbone added a commit that referenced this pull request May 19, 2026
Copilot R2 ran on commit a8a7065 (the R2 fix-ups rebased on
staging-merge) and surfaced 4 valid findings. All addressed:

1. **lattice_static_assertions.rs — bounded-shape negative locks
   missing.** The doc-comment header said only ClassificationLattice
   and NatoClassLattice are bounded, but the test file only asserted
   the POSITIVE bounded impls. The other 10 lattice types had no
   explicit lock against accidental `BoundedJoinSemilattice` /
   `BoundedMeetSemilattice` addition. Asymmetric with the Join-only
   negative locks below (DissemSet / JointSet / DisplayOnlyBlock).
   Fix: add 10 × `assert_not_impl_any!(<T>: BoundedJoinSemilattice,
   BoundedMeetSemilattice)` blocks closing the symmetry. Now an
   accidental `top()` impl on SciSet / SarSet / FgiSet / AeaSet /
   etc. is a build error.

2. **CLAUDE.md:283 — absolute-vs-delta confusion in per-axis
   net-delta math.** The line read `9 → 12 Join impls / 6 → 9 Meet
   impls` as if those were start → end counts. But the attestation
   table establishes pre-4b Join baseline = 3 (Sci/Sar/Fgi only) and
   terminal = 12, so the start → end is `3 → 12` Join / `3 → 9`
   Meet. The `9 → 12 / 6 → 9` form was conflating the +9 / +6 deltas
   with absolute counts. Fix: rewrite as `3 → 12 Join impls (pre-4b:
   Sci/Sar/Fgi; +AeaSet 4b-A; +7 in 4b-B; +DisplayOnlyBlock 4b-E) /
   3 → 9 Meet impls (pre-4b: Sci/Sar/Fgi; +AeaSet 4b-A; +5 in 4b-B
   with Joint/Dissem Join-only per PR #456)`. The +9 and +6 deltas
   are still derivable from the inline per-sub-PR contributions but
   no longer presented as start values.

3. **pm-decisions.md:101 — OQ-RUST-4 still says T134, not T146.**
   Walked-adjacency miss within the same file: the ERRATA block at
   the top of pm-decisions correctly notes T130-T134 are taken by
   PR-8/9 and the closeout range is T142-T146 with T146 reserved
   for the SupersessionSet deferral, but the OQ-RUST-4 resolution
   §1 still referenced T134. Fix: T134 → T146 with a back-pointer
   to the ERRATA so a future reader who lands at OQ-RUST-4 first
   sees both the corrected ID and the rationale.

4. **PR description body — stale "14 → 15" claim.** The R1 fix-up
   commit message and the PR description's "R1 fix-ups landed in
   32c33bb" section claimed I'd corrected the pre-4b PageRewrite
   baseline 14 → 15. The R2 fix-up commit (b6dce2fe, now
   a8a7065 post-rebase) reverted that to 14 with documented
   ground-truth verification from `5fd5a339`. The PR description
   body is GitHub-side, not in-tree, so it didn't pick up the R2
   correction. Fix: update the PR body via `gh pr edit` (separate
   from this in-tree commit) to remove the stale claim.

Verification gate (all GREEN):
- `cargo +stable build --tests -p marque-capco` — 10 new
  `assert_not_impl_any!` blocks compile clean (any future
  accidental `BoundedJoin*` impl on the locked types becomes a
  build error)
- `cargo +stable test -p marque-capco --test
  lattice_static_assertions` — 0/0 (compile-time only)
- `cargo +stable test -p marque-capco --test
  post_4b_lattice_inventory_pin` — 3/3 PASS
- `cargo +stable test -p marque-capco --test
  post_3b_registration_pin` — 1/1 PASS
- `cargo +stable clippy --workspace --all-targets -- -D warnings` — 0 warnings
- `cargo +stable fmt --check` — clean
- `cargo run --manifest-path tools/citation-lint/Cargo.toml --release -- .` — 0 defects

Per project memory `feedback_walked_adjacencies`: the T134→T146 fix
(#3) is the same walked-adjacency-within-a-file pattern that bit me
on the DeclassExemptionAccumulator citation in R1; same root cause
(I updated some references to T142-T146 but missed OQ-RUST-4). Per
memory `feedback_audit_predicates_against_source` (Constitution VIII
propagation): the CLAUDE.md absolute-vs-delta fix (#2) is the kind
of summary-stats drift that accumulates across closeout iterations
when the document is treated as a chronicle instead of a current
state snapshot.
bashandbone added a commit that referenced this pull request May 19, 2026
…talog drift pins + CI job (#557)

fix #3 is a textbook
fix (#2) is the kind
bashandbone added a commit that referenced this pull request May 20, 2026
…ings)

Addresses CI Format+Lint failure plus 5 Copilot inline review findings
on PR #627. Trade-offs:

* CI Format+Lint failure (stable rustfmt) — `cargo fmt --all` applied;
  affects `crates/scheme/src/render_context.rs` test module's const
  declarations (stable rustfmt expanded multi-line; local nightly
  rustfmt rendered single-line). CI gate now green.

* Copilot finding #1 (type-safety hole) — `SectionRef::with_sub_subsection`
  allowed constructing `subsection: None + sub_subsection: Some(N)`
  which Display would emit as misleading `§H.N pNN` form. Architect F-5
  had already flagged `sub_subsection` as dead capability (CAPCO-2016
  has zero `§X.Y.Z`-style citations in the manual; the original
  scaffolding was speculative grow-path under YAGNI). Code-reviewer F-3
  flagged the synthetic test name implying a real citation.
  Resolution: retired `sub_subsection` entirely — field + builder +
  Display branch + `_SAR_SUB_SUB` const + `display_subsection_plus_
  sub_subsection_h5_4_p99` test + scanner regex sub_subsection branch +
  proptest generator branch. Closes Copilot #1, architect F-5,
  code-reviewer F-3 by elimination. If a future authoritative source
  introduces 3-level subsections, the field re-extends additively via
  `#[non_exhaustive]`.

* Copilot finding #2 (Display vs citation-lint regex divergence) —
  module-level doc-comment in `citation.rs` claimed Display emits the
  exact `tools/citation-lint` parser shape, but the lint's parser at
  `tools/citation-lint/src/citation.rs::Citation` carries only
  `{ section, subsection, pages, raw }` — no structural Table field.
  Softened the doc comment to acknowledge the `Table <N>` structural-
  parsing extension lands alongside 3c.2.C's `&'static str → Citation`
  migration; the lint today tolerates `Table` as an occurrence anchor
  without parsing it into its own field. C-FOLLOWUP-1 in the deferred-
  findings doc already tracks this work.

* Copilot finding #3 (test scanner doc divergence) — same gap as #2,
  in `citation_display_roundtrip.rs` module doc; updated.

* Copilot finding #4 (PM doc commit count) — PM doc §2 said "Five
  logical commits" listing A1–A5, but PR has A6 (reviewer-pass
  cleanup) and now A7 (this commit). Updated to "Seven commits land
  in PR 3c.2.A; A1–A5 are the logical scaffolding commits, A6 is the
  reviewer-pass cleanup, A7 is the Copilot-review cleanup." with full
  A6/A7 rows in the commit table.

* Copilot finding #5 (PM doc 26 → 23 incomplete erratum, also the
  Copilot-suppressed low-confidence comment) — PM-9 erratum at A6
  landed but missed R-A3 (§3 risk register), the A4 commit table row
  (§2), and the reviewer attestation checklist (§4). Replaced all
  remaining "26" references with "23" using replace_all.

Deferred-findings doc updated: GH-FOLLOWUP-2 and GH-FOLLOWUP-3 both
marked CLOSED IN A7 (resolved by removal). Addressed-4, Addressed-5,
Addressed-6 sections added for the A7-resolved items.

Gates:
- `cargo fmt --all -- --check` clean (the CI gate that failed)
- `cargo check --workspace` clean
- `cargo nextest run --workspace` — 2635 passed, 11 skipped (net -1
  test from the retired `display_subsection_plus_sub_subsection_h5_4_p99`
  fixture; expected and intentional)
- `cargo clippy --workspace --tests` — clean for touched crates

Ref: PR #627 Copilot review findings (5 inline + 1 suppressed low-confidence)
bashandbone added a commit that referenced this pull request May 22, 2026
…an guidance (PR #681)

Three Copilot round-2 comments addressed, all on doc-comment / inline-
comment accuracy (no logic changes):

1. **Field doc — confidence-gate semantics** (`crates/rules/src/lib.rs:526`).
   Doc-comment claimed `Some(span)` once a Banner candidate "has been
   processed on the page". Accurate post-correction: the engine only
   records `page_banner_span` AFTER the decoder-confidence gate (per
   issue #471 — sub-threshold decoder banner recognitions `continue`
   before the capture site). Rewording: "once a Banner candidate has
   cleared the engine's decoder confidence gate
   (`prov.recognition_score() >= self.config.confidence_threshold()`)
   and been processed; a sub-threshold decoder banner recognition does
   NOT populate this field — the same discipline that gates
   downstream rule dispatch and `PageContext` accumulation per
   issue #471."

   Validates the capture-site discipline rust-reviewer Q4 verified
   ("the banner-span capture sits AFTER the decoder confidence gate;
   `page_portions.push` applies the same discipline — consistent").

2. **Sub-span discipline — achievable guidance**
   (`crates/rules/src/lib.rs:554`). Doc said rules can "parse the
   source bytes themselves" to target sub-blocks, but `Rule::check(&
   self, &CanonicalAttrs, &RuleContext)` does NOT receive the source
   byte buffer — consumers literally cannot read `&[u8]` slices from a
   `Span` they're handed. Rewording: rules emit a `FixIntent` keyed
   off the FULL banner span and let
   `MarkingScheme::apply_intent` + `MarkingScheme::render_canonical`
   re-render the whole banner from the page-level lattice projection
   (`ctx.page_marking`). Explicit note added: "[`Rule::check`] does
   NOT receive the source byte buffer — the rule cannot read `&[u8]`
   slices itself from the span."

3. **Test outer comment — newline-terminator boundary**
   (`crates/engine/src/engine.rs:6341`). The outer comment above the
   `let src: &[u8] = ...` line still said "ends at the trailing
   newline" — accurate AS POSITIONAL ANCHOR but misleading on the
   `Span` semantics, since the assertion two blocks down asserts
   `end = 37` (the `\n`'s INDEX, NOT one-past it). Rewording:
   "ends at the newline-terminator boundary (excluding the `\n`
   itself — the scanner's banner span ends one-past the last
   printable byte, not one-past the terminator)."

   The DIAGRAMMED inner comment (the source-layout diagram inside
   the assertion block) was already accurate post-Copilot-round-1
   fix; this is the OUTER comment that pre-dated the diagram and was
   never updated to match. Walk-adjacent gap closed.

## Test plan

- [x] All 5 `page_banner_span_*` tests pass
- [x] `cargo clippy --workspace --all-targets -- -D warnings` clean
- [x] `cargo fmt --check` clean
- [x] All 3 Copilot round-2 inline comments addressed

No code changes — all three were doc-comment / inline-comment
accuracy fixes. The capture-site discipline (#1), sub-span emission
path (#2), and span-endpoint semantics (#3) all match the actual
engine + rule API today; the docs now match too.

Refs: PR #681 Copilot round 2 comments.
bashandbone added a commit that referenced this pull request May 22, 2026
…an guidance (PR #681)

Three Copilot round-2 comments addressed, all on doc-comment / inline-
comment accuracy (no logic changes):

1. **Field doc — confidence-gate semantics** (`crates/rules/src/lib.rs:526`).
   Doc-comment claimed `Some(span)` once a Banner candidate "has been
   processed on the page". Accurate post-correction: the engine only
   records `page_banner_span` AFTER the decoder-confidence gate (per
   issue #471 — sub-threshold decoder banner recognitions `continue`
   before the capture site). Rewording: "once a Banner candidate has
   cleared the engine's decoder confidence gate
   (`prov.recognition_score() >= self.config.confidence_threshold()`)
   and been processed; a sub-threshold decoder banner recognition does
   NOT populate this field — the same discipline that gates
   downstream rule dispatch and `PageContext` accumulation per
   issue #471."

   Validates the capture-site discipline rust-reviewer Q4 verified
   ("the banner-span capture sits AFTER the decoder confidence gate;
   `page_portions.push` applies the same discipline — consistent").

2. **Sub-span discipline — achievable guidance**
   (`crates/rules/src/lib.rs:554`). Doc said rules can "parse the
   source bytes themselves" to target sub-blocks, but `Rule::check(&
   self, &CanonicalAttrs, &RuleContext)` does NOT receive the source
   byte buffer — consumers literally cannot read `&[u8]` slices from a
   `Span` they're handed. Rewording: rules emit a `FixIntent` keyed
   off the FULL banner span and let
   `MarkingScheme::apply_intent` + `MarkingScheme::render_canonical`
   re-render the whole banner from the page-level lattice projection
   (`ctx.page_marking`). Explicit note added: "[`Rule::check`] does
   NOT receive the source byte buffer — the rule cannot read `&[u8]`
   slices itself from the span."

3. **Test outer comment — newline-terminator boundary**
   (`crates/engine/src/engine.rs:6341`). The outer comment above the
   `let src: &[u8] = ...` line still said "ends at the trailing
   newline" — accurate AS POSITIONAL ANCHOR but misleading on the
   `Span` semantics, since the assertion two blocks down asserts
   `end = 37` (the `\n`'s INDEX, NOT one-past it). Rewording:
   "ends at the newline-terminator boundary (excluding the `\n`
   itself — the scanner's banner span ends one-past the last
   printable byte, not one-past the terminator)."

   The DIAGRAMMED inner comment (the source-layout diagram inside
   the assertion block) was already accurate post-Copilot-round-1
   fix; this is the OUTER comment that pre-dated the diagram and was
   never updated to match. Walk-adjacent gap closed.

## Test plan

- [x] All 5 `page_banner_span_*` tests pass
- [x] `cargo clippy --workspace --all-targets -- -D warnings` clean
- [x] `cargo fmt --check` clean
- [x] All 3 Copilot round-2 inline comments addressed

No code changes — all three were doc-comment / inline-comment
accuracy fixes. The capture-site discipline (#1), sub-span emission
path (#2), and span-endpoint semantics (#3) all match the actual
engine + rule API today; the docs now match too.

Refs: PR #681 Copilot round 2 comments.
bashandbone added a commit that referenced this pull request May 22, 2026
…an guidance (PR #681)

Three Copilot round-2 comments addressed, all on doc-comment / inline-
comment accuracy (no logic changes):

1. **Field doc — confidence-gate semantics** (`crates/rules/src/lib.rs:526`).
   Doc-comment claimed `Some(span)` once a Banner candidate "has been
   processed on the page". Accurate post-correction: the engine only
   records `page_banner_span` AFTER the decoder-confidence gate (per
   issue #471 — sub-threshold decoder banner recognitions `continue`
   before the capture site). Rewording: "once a Banner candidate has
   cleared the engine's decoder confidence gate
   (`prov.recognition_score() >= self.config.confidence_threshold()`)
   and been processed; a sub-threshold decoder banner recognition does
   NOT populate this field — the same discipline that gates
   downstream rule dispatch and `PageContext` accumulation per
   issue #471."

   Validates the capture-site discipline rust-reviewer Q4 verified
   ("the banner-span capture sits AFTER the decoder confidence gate;
   `page_portions.push` applies the same discipline — consistent").

2. **Sub-span discipline — achievable guidance**
   (`crates/rules/src/lib.rs:554`). Doc said rules can "parse the
   source bytes themselves" to target sub-blocks, but `Rule::check(&
   self, &CanonicalAttrs, &RuleContext)` does NOT receive the source
   byte buffer — consumers literally cannot read `&[u8]` slices from a
   `Span` they're handed. Rewording: rules emit a `FixIntent` keyed
   off the FULL banner span and let
   `MarkingScheme::apply_intent` + `MarkingScheme::render_canonical`
   re-render the whole banner from the page-level lattice projection
   (`ctx.page_marking`). Explicit note added: "[`Rule::check`] does
   NOT receive the source byte buffer — the rule cannot read `&[u8]`
   slices itself from the span."

3. **Test outer comment — newline-terminator boundary**
   (`crates/engine/src/engine.rs:6341`). The outer comment above the
   `let src: &[u8] = ...` line still said "ends at the trailing
   newline" — accurate AS POSITIONAL ANCHOR but misleading on the
   `Span` semantics, since the assertion two blocks down asserts
   `end = 37` (the `\n`'s INDEX, NOT one-past it). Rewording:
   "ends at the newline-terminator boundary (excluding the `\n`
   itself — the scanner's banner span ends one-past the last
   printable byte, not one-past the terminator)."

   The DIAGRAMMED inner comment (the source-layout diagram inside
   the assertion block) was already accurate post-Copilot-round-1
   fix; this is the OUTER comment that pre-dated the diagram and was
   never updated to match. Walk-adjacent gap closed.

## Test plan

- [x] All 5 `page_banner_span_*` tests pass
- [x] `cargo clippy --workspace --all-targets -- -D warnings` clean
- [x] `cargo fmt --check` clean
- [x] All 3 Copilot round-2 inline comments addressed

No code changes — all three were doc-comment / inline-comment
accuracy fixes. The capture-site discipline (#1), sub-span emission
path (#2), and span-endpoint semantics (#3) all match the actual
engine + rule API today; the docs now match too.

Refs: PR #681 Copilot round 2 comments.
bashandbone added a commit that referenced this pull request May 22, 2026
…an guidance (PR #681)

Three Copilot round-2 comments addressed, all on doc-comment / inline-
comment accuracy (no logic changes):

1. **Field doc — confidence-gate semantics** (`crates/rules/src/lib.rs:526`).
   Doc-comment claimed `Some(span)` once a Banner candidate "has been
   processed on the page". Accurate post-correction: the engine only
   records `page_banner_span` AFTER the decoder-confidence gate (per
   issue #471 — sub-threshold decoder banner recognitions `continue`
   before the capture site). Rewording: "once a Banner candidate has
   cleared the engine's decoder confidence gate
   (`prov.recognition_score() >= self.config.confidence_threshold()`)
   and been processed; a sub-threshold decoder banner recognition does
   NOT populate this field — the same discipline that gates
   downstream rule dispatch and `PageContext` accumulation per
   issue #471."

   Validates the capture-site discipline rust-reviewer Q4 verified
   ("the banner-span capture sits AFTER the decoder confidence gate;
   `page_portions.push` applies the same discipline — consistent").

2. **Sub-span discipline — achievable guidance**
   (`crates/rules/src/lib.rs:554`). Doc said rules can "parse the
   source bytes themselves" to target sub-blocks, but `Rule::check(&
   self, &CanonicalAttrs, &RuleContext)` does NOT receive the source
   byte buffer — consumers literally cannot read `&[u8]` slices from a
   `Span` they're handed. Rewording: rules emit a `FixIntent` keyed
   off the FULL banner span and let
   `MarkingScheme::apply_intent` + `MarkingScheme::render_canonical`
   re-render the whole banner from the page-level lattice projection
   (`ctx.page_marking`). Explicit note added: "[`Rule::check`] does
   NOT receive the source byte buffer — the rule cannot read `&[u8]`
   slices itself from the span."

3. **Test outer comment — newline-terminator boundary**
   (`crates/engine/src/engine.rs:6341`). The outer comment above the
   `let src: &[u8] = ...` line still said "ends at the trailing
   newline" — accurate AS POSITIONAL ANCHOR but misleading on the
   `Span` semantics, since the assertion two blocks down asserts
   `end = 37` (the `\n`'s INDEX, NOT one-past it). Rewording:
   "ends at the newline-terminator boundary (excluding the `\n`
   itself — the scanner's banner span ends one-past the last
   printable byte, not one-past the terminator)."

   The DIAGRAMMED inner comment (the source-layout diagram inside
   the assertion block) was already accurate post-Copilot-round-1
   fix; this is the OUTER comment that pre-dated the diagram and was
   never updated to match. Walk-adjacent gap closed.

## Test plan

- [x] All 5 `page_banner_span_*` tests pass
- [x] `cargo clippy --workspace --all-targets -- -D warnings` clean
- [x] `cargo fmt --check` clean
- [x] All 3 Copilot round-2 inline comments addressed

No code changes — all three were doc-comment / inline-comment
accuracy fixes. The capture-site discipline (#1), sub-span emission
path (#2), and span-endpoint semantics (#3) all match the actual
engine + rule API today; the docs now match too.

Refs: PR #681 Copilot round 2 comments.
bashandbone added a commit that referenced this pull request May 22, 2026
…icate-ID form

Per T044 OD-8.A (PM decision 2026-05-22): the engine constraint-bridge
becomes a no-op pass-through — the catalog row's `name` IS the
canonical predicate ID, with no string translation. This commit
performs the row-name rename in the catalogs and the supporting
dispatch + emit-helper updates inside `marque-capco`.

## Class-floor catalog (27 rows; legacy-rule-id-map §3)

`crates/capco/src/scheme/class_floor.rs` + the
`crates/capco/src/scheme/constraints/class_floor_catalog.rs` mirror
both rename from the slash-form `class-floor/<marking>` /
`E058/<purpose>` to the canonical
`banner.<axis>.<floor|ceiling>-<marking>` form. Five discriminators
appear in the new catalog:

- `banner.classification.floor-*` (12 rows; SCI / SAR / NATO-SAP +
  passthrough rows).
- `banner.aea.floor-*` (7 rows; RD / FRD / TFNI / ATOMAL / CNWDI /
  RD-SG / FRD-SG).
- `banner.aea.ceiling-*` (2 rows; DOD-UCNI / DOE-UCNI).
- `banner.dissem.floor-*` (4 rows; RSEN / IMCON / ORCON / EYES-ONLY).
- `banner.classification.floor-passthrough-*` (4 rows; BUR / HCS-X /
  KLM / MVL).

`is_class_floor_catalog_name` (in `predicates/class_floor.rs`) and
the bridge gate in `adapter.rs::message_by_name` both move from
prefix-check (`starts_with("class-floor/") || starts_with("E058/")`)
to substring-check (`contains(".floor-") || contains(".ceiling-")`).
`floor` and `ceiling` are reserved by convention for the class-floor
catalog; no other rule predicate uses either substring.

## SCI per-system catalog (5 rows; legacy-rule-id-map §4)

`crates/capco/src/scheme/sci_per_system.rs` + the mirror at
`crates/capco/src/scheme/constraints/sci_per_system_catalog.rs`
both rename from `sci-per-system/<purpose>` to the canonical
`marking.sci.<predicate>` form. The 5 rows cover:

- `marking.sci.hcs-o-companions` (§H.4 p64)
- `marking.sci.hcs-p-noforn-required` (§H.4 p66)
- `marking.sci.hcs-p-sub-companions` (§H.4 p68)
- `marking.sci.si-g-companions` (§H.4 p80)
- `marking.sci.tk-compartment-noforn-required` (§H.4 p87 + p91 + p95)

`is_sci_per_system_catalog_name` keeps a `starts_with("marking.sci.")`
prefix-check (the `marking.sci.` prefix is uniquely scoped to the
per-system catalog; `portion.sci.*` is reserved for standalone SCI
rules like E061/E062/E063/W034/E065).

## `RULE_E059` constant deletion (legacy-rule-id-map §6 + plan §2.6)

`pub(crate) const RULE_E059: RuleId = RuleId::new("E059")` in
`sci_per_system.rs` deleted. The walker-shared constant existed
because every per-system emit body fabricated the same E059 rule ID;
post-T044 each row carries its own predicate ID via `row.name`. The
9 `RULE_E059` reference sites in `scheme/actions/companions.rs`
(rows #1, #3, #4 inside `emit_hcs_o_companions` /
`emit_hcs_p_sub_companions` / `emit_si_g_companions`, plus the
`emit_companion_required` dispatcher used by rows #2 and #5) all
migrated to `marque_rules::RuleId::new("capco", row.name)` —
constructed once per emit body at the entry and passed through to
`emit_companion_insert` / `make_fix_diagnostic` as the `rule`
argument.

`pub(crate) use` re-export of `RULE_E059` from `scheme/mod.rs`
removed accordingly.

Citations re-verified against `crates/capco/docs/CAPCO-2016.md` per
Constitution VIII; no `Citation` struct field touched.

`cargo check -p marque-capco` clean after this commit.

Refs: docs/refactor-006/legacy-rule-id-map.md §3 + §4,
docs/refactor-006/2026-05-22-T044-rule-id-tuple-plan.md §2.3 + §2.6,
docs/refactor-006/2026-05-22-T044-pm-decisions.md OD-8.A.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bashandbone added a commit that referenced this pull request May 22, 2026
 empirically rejected)

Copilot's PR #697 review surfaced three inline comments:

**#1 (lib.rs:842) — accepted**: doc-comment claimed R2 "tightens the
runtime-config surface relative to the derive". Copilot correctly
pointed out that the pre-R2 `#[derive(Deserialize)]` on
`Option<String>` / `Option<f32>` / `Option<HashMap<String,String>>` /
`Option<f64>` was ALREADY type-strict and ALREADY had a closed
accepted-field set. There is no concrete pre-R2 case that was
accepted and is now rejected. The synthesis brief AND the rust-
specialist review both misclaimed "tightening"; Copilot caught it.
Reworded the `WasmConfig` doc-comment to state the post-R2 surface
matches the pre-R2 Derive exactly — same four field names, same
`Option<T>` shape, same loud-failure-on-mismatch, same silent-ignore.
The R2 change is structural (binary-size cut + improved error
messages with field names), NOT a tightening.

**#3 (lib.rs:1104) — accepted**: f32 format-parse roundtrip comment
claimed it "runs only on engine-cache-miss" but `build_cache_key`
runs on every `parse_wasm_config` call (which is every
lint_native / fix_native call), regardless of whether the resulting
cache key hits or misses the engine cache. Corrected the comment.

**#2 (lib.rs:1124) — empirically REJECTED**: Copilot suggested
replacing the `serde_json::Map`-with-cloned-keys path with a
borrow-preserving `Vec<(&str,&str)>` sort + direct JSON string
construction, to eliminate per-call `corrections` key/value clones.
The suggestion is theoretically correct on runtime: O(n) String
clones become O(n) borrow-preserving sort comparisons.

But empirically — measured locally with
`wasm-pack build --target web --profile release-web` (CI-gate
profile) — the borrow-preserving rewrite REGRESSED WASM bundle
size from 1,364,312 B (original R2) to 1,370,786 B (with Copilot's
suggested rewrite). That's a +6,474 B regression — would erase 83%
of R2's −7,750 B binary win.

The mechanism is the same LTO+ICF lesson captured in
PR #696 (R1's `sorted_entries` revert at
`crates/capco/src/lattice.rs:249-278`): the `serde_json::Map` /
`Value::Object` / `to_string::<Value>` path REUSES serde_json
machinery already monomorphized for the Diagnostic JSON output side
of `marque-wasm`. The direct-String-construction path forces fresh
monomorphizations of `serde_json::to_string::<str>` plus a longer
push-based code path. Theoretical clone-count "wins" don't beat
empirical binary-size measurements.

The runtime cost Copilot flagged is real but small: typical
corrections maps in marque usage are 5-20 entries; per-call cost
is 10-40 short String allocations, in the noise relative to the
~400 µs lint hot path. The binary savings (6.5 KB) dwarf the
runtime cost. Trade favors keeping the original implementation.

Documented the trade in the `build_cache_key` doc-comment with a
cross-reference to PR #696's empirical-revert precedent — future
maintainers see WHY the seemingly-suboptimal clones are preserved.

Also brought along: structural Some-and-non-empty proof via
`.filter()` (rust-specialist nit from review fix-up b974550)
stays in place; that one didn't depend on the body shape Copilot
suggested.

Empirical measurements (wasm-pack --profile release-web, local):
- staging baseline: 1,372,062 B
- original R2 (b974550, serde_json::Map): 1,364,312 B (−7,750)
- Copilot #2 fix (direct String): 1,370,786 B (−1,276) ← rejected
- this commit (revert + doc fixes): 1,364,386 B (−7,676)

cargo +stable clippy --workspace --all-targets --all-features -- -D warnings: green
cargo test -p marque-wasm --lib r2_: 19/19 green (byte-identity preserved)
bashandbone added a commit that referenced this pull request May 22, 2026
…ref)

Three small fixes from the 3-reviewer panel on the lattice split:

1. crates/capco/src/lattice/sar.rs:87,97 — replaced two
   `[`super::helpers::HierarchicalTreeSet::join_with`]` /
   `[::meet_with]` intra-doc links with prose. `HierarchicalTreeSet`
   is `pub(super)` and therefore not a valid public doc-link target;
   the bracketed form fired `rustdoc::broken_intra_doc_links`. Prose
   form: "Delegates to the internal `HierarchicalTreeSet::join_with`
   method in `super::helpers`."

2. crates/capco/src/lattice/sci.rs:181,199 — same fix at the two
   analogous SciSet sites.

3. crates/capco/src/lattice/nato_dissem.rs:35 — replaced the stale
   cross-file reference "See DissemSet doc above for rationale."
   (which was coherent in the monolith where DissemSet was defined
   ~300 lines above NatoDissemSet) with an explicit intra-doc link
   to `[`super::dissem::DissemSet`]` that resolves correctly post-
   split.

Reviewers + verdicts:
- Rust-specialist:    APPROVE-WITH-FIXES (#1, #2 above; one MEDIUM
  `test_support` cfg-gate finding was incorrect — gate is present at
  `mod.rs:57` ahead of the `pub(crate) mod test_support;` line)
- General reviewer:   APPROVE-WITH-FIXES (#3 above; plus a non-
  blocking follow-up note about stale `lattice.rs:NNNN` line-number
  anchors in 6 integration-test prose comments — out of scope; will
  file as separate tracking issue)
- Lattice-consultant: APPROVE (no fixes required; all algebraic-law
  invariants, citations, and 82-test parity verified intact)

Verification post-fix:
- `cargo +stable check --workspace --all-targets`              PASS
- `cargo +stable clippy --workspace --all-targets -- -D warnings` PASS
- `cargo +stable fmt --all -- --check`                         PASS
- `cargo test --workspace`                                     PASS
  (198 test binaries, zero failures)
- `cargo +stable doc -p marque-capco --no-deps`: 54 warnings
  (down from 59 pre-fix — the 4 new HierarchicalTreeSet intra-doc
  warnings are gone; all 54 remaining are pre-existing, none in
  `crates/capco/src/lattice/`)
bashandbone added a commit that referenced this pull request May 22, 2026
…ref)

Three small fixes from the 3-reviewer panel on the lattice split:

1. crates/capco/src/lattice/sar.rs:87,97 — replaced two
   `[`super::helpers::HierarchicalTreeSet::join_with`]` /
   `[::meet_with]` intra-doc links with prose. `HierarchicalTreeSet`
   is `pub(super)` and therefore not a valid public doc-link target;
   the bracketed form fired `rustdoc::broken_intra_doc_links`. Prose
   form: "Delegates to the internal `HierarchicalTreeSet::join_with`
   method in `super::helpers`."

2. crates/capco/src/lattice/sci.rs:181,199 — same fix at the two
   analogous SciSet sites.

3. crates/capco/src/lattice/nato_dissem.rs:35 — replaced the stale
   cross-file reference "See DissemSet doc above for rationale."
   (which was coherent in the monolith where DissemSet was defined
   ~300 lines above NatoDissemSet) with an explicit intra-doc link
   to `[`super::dissem::DissemSet`]` that resolves correctly post-
   split.

Reviewers + verdicts:
- Rust-specialist:    APPROVE-WITH-FIXES (#1, #2 above; one MEDIUM
  `test_support` cfg-gate finding was incorrect — gate is present at
  `mod.rs:57` ahead of the `pub(crate) mod test_support;` line)
- General reviewer:   APPROVE-WITH-FIXES (#3 above; plus a non-
  blocking follow-up note about stale `lattice.rs:NNNN` line-number
  anchors in 6 integration-test prose comments — out of scope; will
  file as separate tracking issue)
- Lattice-consultant: APPROVE (no fixes required; all algebraic-law
  invariants, citations, and 82-test parity verified intact)

Verification post-fix:
- `cargo +stable check --workspace --all-targets`              PASS
- `cargo +stable clippy --workspace --all-targets -- -D warnings` PASS
- `cargo +stable fmt --all -- --check`                         PASS
- `cargo test --workspace`                                     PASS
  (198 test binaries, zero failures)
- `cargo +stable doc -p marque-capco --no-deps`: 54 warnings
  (down from 59 pre-fix — the 4 new HierarchicalTreeSet intra-doc
  warnings are gone; all 54 remaining are pre-existing, none in
  `crates/capco/src/lattice/`)
bashandbone added a commit that referenced this pull request May 22, 2026
…odule (≤800 each) (#703)

FIXES (#1, #2 above; one MEDIUM
FIXES (#3 above; plus a non-
bashandbone added a commit that referenced this pull request May 23, 2026
…n + correct overclaiming header

Copilot R2 #3: two issues in decoder_split_byte_identity.rs:

(a) The "Byte-identity regression suite" header overclaimed. The test
compares semantic Parsed<CapcoMarking> outcomes, not raw bytes — the
engine doesn't surface canonicalized bytes at the recognizer boundary.
Reworded to "Recognizer-outcome regression suite" with explicit
"compares parsed shapes, not raw bytes" wording.

(b) decoder_recovers_stray_char_slash_recovery used an "either branch
is acceptable" match arm with `let _ = candidates;`, accepting any
outcome including zero candidates. Probed actual behavior for
(S/X/NF): the collapse-stray pass produces Unambiguous(Secret) cleanly.
Replaced the defensive match with a strong assertion matching observed
behavior: Parsed::Unambiguous + Classification::Secret + dissem_us_count == 1
(NOFORN preserved).

18/18 byte-identity tests pass with the strengthened assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bashandbone added a commit that referenced this pull request May 23, 2026
…above the unconditional Unambiguous assertion

Copilot R4 #3: my R2 commit (ce2c15f) strengthened the
decoder_recovers_stray_char_slash_recovery match into an
unconditional `let Parsed::Unambiguous(m) = parsed else { panic!() }`
but left the original "even if the gate doesn't reach Unambiguous,
a non-empty Ambiguous result is acceptable here" preamble in place.
The comment now contradicts the assertion.

Replaced the preamble with a single-paragraph explanation matching
the actual behavior: the collapse-stray pass strips the X and the
scorer reaches Unambiguous(Secret + NOFORN) cleanly; runner-up
margin is wide enough that the null gate does not intervene.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bashandbone added a commit that referenced this pull request May 24, 2026
…odule (≤800 each) (#703)

FIXES (#1, #2 above; one MEDIUM
FIXES (#3 above; plus a non-
bashandbone added a commit that referenced this pull request May 24, 2026
…odule (≤800 each) (#703)

FIXES (#1, #2 above; one MEDIUM
FIXES (#3 above; plus a non-
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants