feat: marque MVP foundation — spec, codegen, engine pipeline#3
Conversation
…──────────────────────�[0m
�[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m 1�[0m �[38;5;238m│�[0m �[38;5;231mfeat: implement Phase 2 — CVE codegen, engine pipeline, validators�[0m
�[38;5;238m 2�[0m �[38;5;238m│�[0m
�[38;5;238m 3�[0m �[38;5;238m│�[0m �[38;5;231mPhase 2 (Foundational) lands the real ODNI CVE XML → Rust codegen,�[0m
�[38;5;238m 4�[0m �[38;5;238m│�[0m �[38;5;231mthe FixProposal/AppliedFix type split, FR-016 deterministic fix�[0m
�[38;5;238m 5�[0m �[38;5;238m│�[0m �[38;5;231mordering with overlap guard, layered config with hard-fail�[0m
�[38;5;238m 6�[0m �[38;5;238m│�[0m �[38;5;231mvalidators, and a Clock-injected engine for deterministic audit�[0m
�[38;5;238m 7�[0m �[38;5;238m│�[0m �[38;5;231mtests. Also addresses 14 review findings from the Phase 2�[0m
�[38;5;238m 8�[0m �[38;5;238m│�[0m �[38;5;231mlocal review (4 CRITICAL + 10 HIGH + 10 MEDIUM + 9 LOW).�[0m
�[38;5;238m 9�[0m �[38;5;238m│�[0m
�[38;5;238m 10�[0m �[38;5;238m│�[0m �[38;5;231mPhase 2 core work:�[0m
�[38;5;238m 11�[0m �[38;5;238m│�[0m �[38;5;231m- marque-ism/build.rs: real CVE XML and XSD parsing producing�[0m
�[38;5;238m 12�[0m �[38;5;238m│�[0m �[38;5;231m typed enums (SciControl, DissemControl, SarIdentifier,�[0m
�[38;5;238m 13�[0m �[38;5;238m│�[0m �[38;5;231m DeclassExemption, ExemptFrom) + TRIGRAPHS + ALL_CVE_TOKENS +�[0m
�[38;5;238m 14�[0m �[38;5;238m│�[0m �[38;5;231m validators + migrations, with T010 schema-version pinning�[0m
�[38;5;238m 15�[0m �[38;5;238m│�[0m �[38;5;231m- marque-rules: FixProposal (pure, timestamp-free) vs AppliedFix�[0m
�[38;5;238m 16�[0m �[38;5;238m│�[0m �[38;5;231m (engine-only promotion via __engine_promote), FixSource enum,�[0m
�[38;5;238m 17�[0m �[38;5;238m│�[0m �[38;5;231m Severity::Off as first variant, Diagnostic carries citation�[0m
�[38;5;238m 18�[0m �[38;5;238m│�[0m �[38;5;231m- marque-engine: Engine with Clock injection, FixMode::{Apply,�[0m
�[38;5;238m 19�[0m �[38;5;238m│�[0m �[38;5;231m DryRun}, Config-driven severity overrides, FR-016 sort�[0m
�[38;5;238m 20�[0m �[38;5;238m│�[0m �[38;5;231m- marque-config: layered load with FR-010 (committed [user]�[0m
�[38;5;238m 21�[0m �[38;5;238m│�[0m �[38;5;231m refusal), FR-011 (schema version match), confidence_threshold�[0m
�[38;5;238m 22�[0m �[38;5;238m│�[0m �[38;5;231m validation, MARQUE_CONFIDENCE_THRESHOLD env binding�[0m
�[38;5;238m 23�[0m �[38;5;238m│�[0m �[38;5;231m- marque-capco: E001 banner abbreviation, E002 missing/misordered�[0m
�[38;5;238m 24�[0m �[38;5;238m│�[0m �[38;5;231m USA trigraph, E004 separator count (stub), E005 declass in banner�[0m
�[38;5;238m 25�[0m �[38;5;238m│�[0m �[38;5;231m- marque-core/parser: uses generated enum parse() methods for�[0m
�[38;5;238m 26�[0m �[38;5;238m│�[0m �[38;5;231m SCI/dissem/SAR/declass; Classification remains hand-written�[0m
�[38;5;238m 27�[0m �[38;5;238m│�[0m �[38;5;231m- Clock trait + SystemClock/FixedClock for deterministic audit�[0m
�[38;5;238m 28�[0m �[38;5;238m│�[0m �[38;5;231m timestamps in tests�[0m
�[38;5;238m 29�[0m �[38;5;238m│�[0m
�[38;5;238m 30�[0m �[38;5;238m│�[0m �[38;5;231mCritical review findings:�[0m
�[38;5;238m 31�[0m �[38;5;238m│�[0m �[38;5;231m- C-1 Overlap guard in Engine::fix_inner drops overlapping fixes�[0m
�[38;5;238m 32�[0m �[38;5;238m│�[0m �[38;5;231m after FR-016 sort; dropped fixes surface in remaining_diagnostics�[0m
�[38;5;238m 33�[0m �[38;5;238m│�[0m �[38;5;231m so they cannot be silently lost�[0m
�[38;5;238m 34�[0m �[38;5;238m│�[0m �[38;5;231m- C-2 MARQUE_CONFIDENCE_THRESHOLD parse failure returns�[0m
�[38;5;238m 35�[0m �[38;5;238m│�[0m �[38;5;231m ConfigError::InvalidEnvVar instead of silently using default�[0m
�[38;5;238m 36�[0m �[38;5;238m│�[0m �[38;5;231m- C-3 Dropped latent dual-Classification ambiguity: build.rs no�[0m
�[38;5;238m 37�[0m �[38;5;238m│�[0m �[38;5;231m longer emits a generated Classification enum (canonical lives�[0m
�[38;5;238m 38�[0m �[38;5;238m│�[0m �[38;5;231m in attrs.rs); validator uses literal matches!�[0m
�[38;5;238m 39�[0m �[38;5;238m│�[0m �[38;5;231m- C-4 to_rust_ident resolves idents through resolve_idents which�[0m
�[38;5;238m 40�[0m �[38;5;238m│�[0m �[38;5;231m asserts non-empty and unique variants at build time�[0m
�[38;5;238m 41�[0m �[38;5;238m│�[0m
�[38;5;238m 42�[0m �[38;5;238m│�[0m �[38;5;231mHigh review findings:�[0m
�[38;5;238m 43�[0m �[38;5;238m│�[0m �[38;5;231m- H-1 Server calls marque_config::load() and runs FR-011 check�[0m
�[38;5;238m 44�[0m �[38;5;238m│�[0m �[38;5;231m instead of Config::default() fallback�[0m
�[38;5;238m 45�[0m �[38;5;238m│�[0m �[38;5;231m- H-2/H-3 Engine::fix_with_threshold plumbs --confidence CLI flag�[0m
�[38;5;238m 46�[0m �[38;5;238m│�[0m �[38;5;231m and FixRequest.confidence_threshold; InvalidThreshold error type�[0m
�[38;5;238m 47�[0m �[38;5;238m│�[0m �[38;5;231m- H-4 CLI current_dir() failure → exit 74 (EX_IOERR); load�[0m
�[38;5;238m 48�[0m �[38;5;238m│�[0m �[38;5;231m failures exit with ConfigError::exit_code instead of silent�[0m
�[38;5;238m 49�[0m �[38;5;238m│�[0m �[38;5;231m Config::default fallback�[0m
�[38;5;238m 50�[0m �[38;5;238m│�[0m �[38;5;231m- H-5 Trigraph inner field private; try_new enforces ASCII�[0m
�[38;5;238m 51�[0m �[38;5;238m│�[0m �[38;5;231m uppercase; as_str is infallible via from_utf8_unchecked with�[0m
�[38;5;238m 52�[0m �[38;5;238m│�[0m �[38;5;231m a SAFETY comment documenting the invariant�[0m
�[38;5;238m 53�[0m �[38;5;238m│�[0m �[38;5;231m- H-6 marque-config validates rule severity strings at load time�[0m
�[38;5;238m 54�[0m �[38;5;238m│�[0m �[38;5;231m via Severity::parse_config; ConfigError::UnknownSeverity on typo�[0m
�[38;5;238m 55�[0m �[38;5;238m│�[0m �[38;5;231m- H-7 applied_fixes is HashSet<(RuleId, Span)> for O(1) filter�[0m
�[38;5;238m 56�[0m �[38;5;238m│�[0m �[38;5;231m- H-8 ALL_CVE_TOKENS collected via BTreeSet (sorted, deduplicated)�[0m
�[38;5;238m 57�[0m �[38;5;238m│�[0m �[38;5;231m so canonicalize uses binary_search; NOFORN/ORCON/PROPIN/IMCON�[0m
�[38;5;238m 58�[0m �[38;5;238m│�[0m �[38;5;231m duplicates eliminated�[0m
�[38;5;238m 59�[0m �[38;5;238m│�[0m �[38;5;231m- H-9 parse_classification doc explains it is the single�[0m
�[38;5;238m 60�[0m �[38;5;238m│�[0m �[38;5;231m deliberately hand-coded path (both portion and banner forms)�[0m
�[38;5;238m 61�[0m �[38;5;238m│�[0m �[38;5;231m- H-10 45 unit tests backfill core-crate coverage�[0m
�[38;5;238m 62�[0m �[38;5;238m│�[0m
�[38;5;238m 63�[0m �[38;5;238m│�[0m �[38;5;231mMedium/Low:�[0m
�[38;5;238m 64�[0m �[38;5;238m│�[0m �[38;5;231m- FixProposal::new confidence is assert! (release-mode), not�[0m
�[38;5;238m 65�[0m �[38;5;238m│�[0m �[38;5;231m debug_assert — NaN and INFINITY are correctness bugs in release�[0m
�[38;5;238m 66�[0m �[38;5;238m│�[0m �[38;5;231m- Span::new upgraded to assert!; Span::try_as_slice checked variant�[0m
�[38;5;238m 67�[0m �[38;5;238m│�[0m �[38;5;231m- RuleContext zone/position carry TODO(phase-3) marker�[0m
�[38;5;238m 68�[0m �[38;5;238m│�[0m �[38;5;231m- AppliedFix classifier_id and input are Arc<str> for O(1) clone�[0m
�[38;5;238m 69�[0m �[38;5;238m│�[0m �[38;5;231m- BatchEngine lint_many/fix_many yield Result<_, BatchError>�[0m
�[38;5;238m 70�[0m �[38;5;238m│�[0m �[38;5;231m instead of aborting the process on JoinError�[0m
�[38;5;238m 71�[0m �[38;5;238m│�[0m �[38;5;231m- Severity has as_str + Display; CLI/server/wasm use it instead�[0m
�[38;5;238m 72�[0m �[38;5;238m│�[0m �[38;5;231m of format!("{:?}")�[0m
�[38;5;238m 73�[0m �[38;5;238m│�[0m �[38;5;231m- RuleId inner field private with new/as_str accessors�[0m
�[38;5;238m 74�[0m �[38;5;238m│�[0m �[38;5;231m- FixedClock tuple field private with new constructor�[0m
�[38;5;238m 75�[0m �[38;5;238m│�[0m �[38;5;231m- LintResult::fix_count requires d.fix.is_some() AND Severity::Fix�[0m
�[38;5;238m 76�[0m �[38;5;238m│�[0m �[38;5;231m- Eliminated double source.to_vec() allocation in FixMode::DryRun�[0m
�[38;5;238m 77�[0m �[38;5;238m│�[0m �[38;5;231m- make_fix_diagnostic takes FixDiagnosticParams struct�[0m
�[38;5;238m 78�[0m �[38;5;238m│�[0m �[38;5;231m- Removed unused thiserror/phf/anyhow runtime deps from marque-ism�[0m
�[38;5;238m 79�[0m �[38;5;238m│�[0m �[38;5;231m- marque_ism::SCHEMA_VERSION re-exported at crate root�[0m
�[38;5;238m 80�[0m �[38;5;238m│�[0m �[38;5;231m- run_metadata exits 69 (EX_UNAVAILABLE) instead of 0�[0m
�[38;5;238m 81�[0m �[38;5;238m│�[0m
�[38;5;238m 82�[0m �[38;5;238m│�[0m �[38;5;231mTests: 45 unit + 1 doctest, zero failures. Clippy clean across�[0m
�[38;5;238m 83�[0m �[38;5;238m│�[0m �[38;5;231m--all-targets. rustfmt clean.�[0m
�[38;5;238m 84�[0m �[38;5;238m│�[0m
�[38;5;238m 85�[0m �[38;5;238m│�[0m �[38;5;231mCo-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
There was a problem hiding this comment.
Pull request overview
Foundational MVP work for “marque” across spec/contracts, schema/codegen, core parsing types, config loading/validation, and engine fix application (ordering/overlap guards, clock injection), plus CLI/server/WASM plumbing updates.
Changes:
- Introduces
marque-ismwith generated ODNI CVE-derived enums/tables and token utilities; updates core scanner/parser to use typed attributes and renamed candidate types. - Refactors rules/engine contracts (
FixProposalvsAppliedFix, deterministic fix ordering, overlap guard, injected clock) and propagates API changes through CLI/server/WASM. - Implements layered config loading with hard-fail validators (schema pinning, severity/threshold validation) and improved error handling/exit codes.
Reviewed changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| marque/src/main.rs | CLI config loading hard-fail + fix_with_threshold wiring + JSON severity output updates |
| crates/marque-wasm/src/lib.rs | WASM wrapper updates for new engine fix API + stable severity/message serialization |
| crates/marque-server/src/main.rs | Server now loads layered config + request-level confidence override support |
| crates/marque-rules/src/lib.rs | New contracts: RuleId constructor, Severity::Off, FixProposal/AppliedFix, Diagnostic::new |
| crates/marque-ism/src/token_set.rs | Switches token data to build.rs-generated ALL_CVE_TOKENS/TRIGRAPHS + binary_search canonicalize |
| crates/marque-ism/src/span.rs | Stronger span invariants + try_as_slice + candidate rename to MarkingCandidate |
| crates/marque-ism/src/lib.rs | Re-exports generated CVE enums + schema version constant + updated span exports |
| crates/marque-ism/src/attrs.rs | Typed CVE-derived attribute fields + hardened Trigraph construction/infallible as_str |
| crates/marque-ism/Cargo.toml | Dependency updates for codegen/build hardening (toml, memchr in build) |
| crates/marque-ism/build.rs | Real ODNI CVE/XSD parsing → generated Rust enums/tables + schema pinning assertion |
| crates/marque-extract/src/lib.rs | Re-export ordering cleanup |
| crates/marque-engine/src/pipeline.rs | Updates pipeline candidate type to MarkingCandidate |
| crates/marque-engine/src/output.rs | Fix result now returns AppliedFix; fix_count requires fix.is_some() |
| crates/marque-engine/src/lib.rs | Exposes clock + FixMode/InvalidThreshold + adds default_ruleset() helper |
| crates/marque-engine/src/engine.rs | Adds FixMode, injected clock, fix ordering/overlap guard, threshold override API |
| crates/marque-engine/src/clock.rs | New Clock abstraction with SystemClock and FixedClock |
| crates/marque-engine/src/batch.rs | Batch APIs now return per-document Result via BatchError, and use new fix API |
| crates/marque-core/src/span.rs | Re-export updates for renamed candidate type |
| crates/marque-core/src/scanner.rs | Scanner now returns Vec<MarkingCandidate> |
| crates/marque-core/src/parser.rs | Parser now emits typed CVE enums + updated CAB parsing + trigraph validation via try_new |
| crates/marque-core/src/attrs.rs | Updates re-exports to match new marque-ism::attrs surface |
| crates/marque-config/src/lib.rs | Layered config loader with FR-010/FR-011 hard-fails + severity/env/threshold validation |
| crates/marque-config/Cargo.toml | Adds deps on marque-ism and marque-rules for schema/severity validation |
| crates/marque-capco/src/rules.rs | Updates rules to new FixProposal/Diagnostic::new APIs + rule id changes |
| Cargo.lock | Locks dependency graph changes from added/removed deps |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "diagnostics": result.diagnostics.iter().map(|d| serde_json::json!({ | ||
| "rule": d.rule.to_string(), | ||
| "severity": format!("{:?}", d.severity), | ||
| "severity": d.severity.to_string(), | ||
| "message": d.message, | ||
| "start": d.span.start, |
There was a problem hiding this comment.
In print_json_diagnostics, d.message is a Box<str> (and is behind &Diagnostic from .iter()), so "message": d.message attempts to move out of a shared reference and will not compile. Use a borrowed form like d.message.as_ref() (or d.message.to_string()), and make the same adjustment anywhere else this file formats/prints diag.message or fix.replacement from a borrowed diagnostic/fix.
| let mut applied_rules: Vec<RuleId> = Vec::new(); | ||
| let mut fixes: Vec<_> = lint | ||
| .diagnostics | ||
| .iter() |
There was a problem hiding this comment.
Engine::fix_inner currently auto-applies any FixProposal whose confidence meets the threshold, regardless of the diagnostic's (possibly overridden) Severity. This means a rule configured as warn/error can still have its fix applied during fix, which contradicts the Severity::Fix contract and also disagrees with LintResult::fix_count() (which counts only Severity::Fix + fix.is_some()). Filter fixes by d.severity == Severity::Fix (after overrides) before collecting proposals.
| .iter() | |
| .iter() | |
| .filter(|d| d.severity == Severity::Fix) |
| let result = match engine.fix_with_threshold(&source, mode, confidence) { | ||
| Ok(r) => r, | ||
| Err(e) => { | ||
| eprintln!("error: {e}"); | ||
| return 65; // EX_DATAERR per contracts/cli.md | ||
| } | ||
| }; |
There was a problem hiding this comment.
The CLI currently doesn't populate AppliedFix.input for audit records, even though AppliedFix has an input field intended to be set at the boundary (e.g., file path or "-" for stdin). After obtaining the FixResult, set result.applied[*].input to path.display().to_string() (and consider also wiring stdin as "-") so downstream audit NDJSON can attribute fixes to a specific input.
| let result = match engine.fix_with_threshold(&source, mode, confidence) { | |
| Ok(r) => r, | |
| Err(e) => { | |
| eprintln!("error: {e}"); | |
| return 65; // EX_DATAERR per contracts/cli.md | |
| } | |
| }; | |
| let mut result = match engine.fix_with_threshold(&source, mode, confidence) { | |
| Ok(r) => r, | |
| Err(e) => { | |
| eprintln!("error: {e}"); | |
| return 65; // EX_DATAERR per contracts/cli.md | |
| } | |
| }; | |
| let input = path.display().to_string(); | |
| for applied_fix in &mut result.applied { | |
| applied_fix.input = input.clone(); | |
| } |
| if ctx.marking_type != MarkingType::Banner { | ||
| return vec![]; | ||
| } | ||
| if attrs.declassify_on.is_none() { | ||
| return vec![]; |
There was a problem hiding this comment.
DeclassifyInBannerRule checks attrs.declassify_on for banner candidates, but the parser only populates declassify_on/declass_exemption when parsing CAB lines (see marque-core/src/parser.rs, parse_cab). As a result, this rule will never fire for actual banner markings. Either parse declass tokens into IsmAttributes during banner/portion parsing, or change the rule to look at the field(s) that are actually populated for banner candidates.
| /// Load and merge configuration from standard locations. | ||
| /// | ||
| /// Search order (first found wins for each layer): | ||
| /// 1. `.marque.toml` in current directory or any parent | ||
| /// 1. `.marque.toml` in current directory (committed, project policy) | ||
| /// 2. `.marque.local.toml` alongside the project config (user-specific, gitignored) |
There was a problem hiding this comment.
marque_config::load is documented (and the spec contract in specs/001-marque-mvp/contracts/cli.md) as discovering .marque.toml by walking upward from the current working directory, but this implementation only checks project_root/.marque.toml and never searches parent directories. Either implement the upward walk (stopping at git root/filesystem root per contract) or update the contract/docs so CLI/server behavior is consistent.
| fn merge_user_into(config: &mut Config, file: ConfigFile) { | ||
| config.user.classifier_id = file.user.classifier_id.or(config.user.classifier_id.take()); | ||
| config.user.classification_authority = | ||
| file.user.classification_authority.or(config.user.classification_authority.take()); | ||
| config.user.default_reason = file.user.default_reason.or(config.user.default_reason.take()); | ||
| config.user.derived_from_default = | ||
| file.user.derived_from_default.or(config.user.derived_from_default.take()); | ||
| if let Some(user) = file.user { | ||
| if user.classifier_id.is_some() { | ||
| config.user.classifier_id = user.classifier_id; | ||
| } |
There was a problem hiding this comment.
The .marque.local.toml layer is parsed into ConfigFile, but merge_user_into only applies the [user] section and ignores rules, corrections, capco.version, and confidence_threshold from the local file. This conflicts with the stated precedence chain (.marque.local.toml overriding .marque.toml). Either merge the non-user fields for the local layer as well, or reject them with a clear error to avoid silent no-ops.
| // We emit the enum anyway for type-system completeness. | ||
| let sar_entries = parse_cve_xml(&cve_dir.join("CVEnumISMSAR.xml")); | ||
| if sar_entries.is_empty() { | ||
| // Emit a minimal enum with a placeholder variant so it's usable |
There was a problem hiding this comment.
This comment says emit_empty_enum produces a "placeholder variant", but emit_empty_enum actually emits an empty enum (pub enum SarIdentifier {}), which is uninhabited and has no variants. Either update the comment to reflect the uninhabited type, or emit a real placeholder/unknown variant if you need the enum to be constructible in downstream code.
| // Emit a minimal enum with a placeholder variant so it's usable | |
| // Emit an empty (uninhabited) enum for type-system completeness. |
| // FR-011 (schema version mismatch), and threshold/severity validation. | ||
| // These are intentional safety gates — do not silently fall back to | ||
| // `Config::default()` on error. | ||
| let config = match marque_config::load(&cwd) { | ||
| Ok(c) => c, |
There was a problem hiding this comment.
marque_config::load is called with cwd here, so config resolution is limited to the current directory. The CLI contract (specs/001-marque-mvp/contracts/cli.md "Configuration discovery") describes walking upward to find .marque.toml and then looking for .marque.local.toml alongside it. Consider moving that discovery logic into marque_config (or doing it here) so the CLI matches the contract.
|
@copilot apply changes based on the comments in this thread Let me give some guidance on how this should work functionally. The CAB declassify on date is essentially the furthest date in the future from the observed dates (if any), or typically, 25 years from the date of the material, whichever is longest (some things have shorter default declassifications, like fleeting/tactical information -- typically 10 years -- and others, like HUMINT can default to 50). If there are banner markings with declassification information, then we need a way of identifying that max for a given page or document. Similarly, a banner marking can be thought of as a composite of the portion markings on a given page, or where 'page' doesn't apply, the materials. The banner is the most restrictive sum of its portion marking parts. So architecturally, it's important that we maintain some understanding of at least the maximums observed for a given 'page' An example to illustrate -- The banner here would be If the architecture does not currently allow this kind of resolution, then now is a good time to identify it and brainstorm solutions. |
bashandbone
left a comment
There was a problem hiding this comment.
Second-pass review — would approve (self-PR, GH blocks explicit approval)
Previous comment on this PR was corrupted by an ANSI-injecting cat alias; this is the intended clean version.
Validation: clippy clean, 45 unit + 1 doctest pass, fmt clean, mergeable. No blocking issues.
Full report: .claude/PRPs/reviews/pr-3-review.md (working tree, not committed).
Focus-area results (all clean)
- Overlap guard correctness — walked adjacent, contained, identical, and overlapping span cases under the FR-016 reverse-end sort.
fix.span.end <= previous.span.starthandles all four correctly. Trigraph::as_strfrom_utf8_uncheckedsoundness — no construction path outsidetry_new(ASCII validator) or theUSAconst.Severity::Offgate — short-circuits before any diagnostic is produced.Send/Sync— all bounds correctly declared;BatchEngineArc<Engine>is sound.-0.0/ subnormal handling infix_with_threshold— correct.
Findings: 0 CRITICAL · 3 HIGH · 5 MEDIUM · 4 LOW
HIGH (not blocking, worth a follow-up)
- H-1
BatchErrordoes not distinguish panic from cancellation —BatchError::TaskFailed(JoinError)wrapsJoinErrorwhole, butJoinError::is_panic()vsis_cancelled()matter for CI pipelines that need to alert on bugs vs quietly retry on cancellation. Recommend forwardingis_panic()/is_cancelled()onBatchError. (batch.rs:48–77) - H-2
AppliedFix::__engine_promoteispub #[doc(hidden)]— the doc explicitly admits "enforced by convention and code review, not by the type system." Any downstream crate can construct anAppliedFixbypassing the engine's confidence gate, FR-016 sort, and C-1 overlap guard, injecting arbitrary audit records. For a security tool whose audit log is the compliance output, this is a meaningful integrity gap. Full fix requires crate-boundary restructuring; recommend documenting as a known risk inCLAUDE.mdunder architectural invariants. (rules/lib.rs:246–261) - H-3
fix_with_thresholdhas no test for INFINITY / NEG_INFINITY rejection — the validation correctly rejects both, but the test suite only coversNaN,-0.1,1.1. Two-line test addition. (engine.rs:146–149)
MEDIUM
- M-1
resolve_identshas no non-empty assertion on required CVE files — an emptyCVEnumISMDissem.xmlwould silently produce zero dissem diagnostics. (build.rs:192–209) - M-2 Duplicate-ident collision panic names only one of the two colliding values. (
build.rs:204) - M-3
TRIGRAPHSslice is unsorted sois_trigraphstill uses linear scan;ALL_CVE_TOKENSwent through aBTreeSetforbinary_search. Symmetrical cleanup. (build.rs+token_set.rs:44) - M-4 Threshold filter path has no test at the boundary (0.94 excluded, 0.95 included).
- M-5 Zero-length-span filter path (which masks Phase 2 placeholder spans) has no dedicated test.
LOW
- L-1
BatchOptions.max_concurrent_docsdrives bothConcurrencyControllerandbuffer_unordered; doc comment only mentions one. - L-2
merge_user_intodoes not treat empty-stringclassifier_idas absent. - L-3
Severitydoc describes a future.max()merge design that diverges from the current last-write-winsmerge_project_into. - L-4 Closely related to M-4: all test proposals use
1.0confidence, so theConfig-supplied threshold path is never exercised.
Recommendation
Merge as-is. Follow-up items for the Phase 3 opening commit or a small cleanup pass:
- H-1 (add
is_panic()/is_cancelled()forwarding onBatchError) - H-3 (two-line test for INFINITY rejection)
- M-1, M-4, M-5 (defensive test coverage)
H-2 is a CLAUDE.md documentation update.
…-level aggregation - Fix parse_marking_string to detect DeclassExemption codes (25X1, 50X1-HUM...) and declassification dates (YYYYMMDD, YYYY) in banner/portion blocks so DeclassifyInBannerRule (E005) can actually fire for real banner markings. Add 9 targeted parser unit tests. - Add PageContext type (marque-ism) for page-level portion accumulation: expected_classification (max), expected_sci/sar/dissem_controls (union), expected_rel_to (intersection with NOFORN superseding), expected_declassify_on (max date), expected_declass_exemption (last-seen placeholder for Phase 3). 6 unit tests. - Update RuleContext (marque-rules) to carry Option<Arc<PageContext>> so banner/CAB rules can compare against the expected composite in Phase 3. Removes Copy/PartialEq/Eq derives (non-breaking: rules receive &RuleContext). - Update Engine::lint to build PageContext in document order and pass it through RuleContext for banner/CAB candidates. Arc is built lazily and cached between consecutive non-portion candidates (invalidated on each new portion) to avoid O(candidates * context_size) cloning. Agent-Logs-Url: https://github.com/marquetools/marque/sessions/c74ae0cc-c005-46b0-8501-646eeac340b6 Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Addressed in b8789c8. Here's what was changed and how the architecture now supports the functional requirements you described: E005 fix — PageContext architecture — Added
|
bashandbone
left a comment
There was a problem hiding this comment.
Follow-up review on b8789c8 — E005 parser fix + PageContext architecture
Pulled, built, tested locally against the new commit.
Validation: clippy clean across --all-targets, 60 unit tests + 1 doctest, zero failures. (Growth: +15 tests — marque-core 3→12 for declass parsing, marque-ism 17→23 for PageContext.) Trivial rustfmt drift on two files not blocking — fixing in a follow-up style: commit.
What landed, against the stated requirements
E005 real-rule path: parse_marking_string in marque-core/src/parser.rs now falls through to DeclassExemption::parse(block) and the new is_declass_date(block) predicate, so declass exemption codes (25X1, 50X1-HUM, …) and YYYYMMDD / YYYY free-text dates appearing inside a banner/portion marking populate attrs.declass_exemption / attrs.declassify_on. DeclassifyInBannerRule now fires on real markings instead of being a dead path. ✓
PageContext aggregation semantics (crates/marque-ism/src/page_context.rs):
| Rule | Requirement | Implementation | Verdict |
|---|---|---|---|
expected_classification |
most restrictive wins | .max() over Option<Classification> |
✓ |
expected_sci_controls |
union | BTreeSet collect → Vec |
✓ |
expected_sar_identifiers |
union | BTreeSet collect → Vec |
✓ |
expected_dissem_controls |
union | BTreeSet collect → Vec |
✓ |
expected_rel_to |
intersection, NOFORN supersedes | any_noforn → empty, else intersect |
✓ |
expected_declassify_on |
furthest-out date | lexicographic max_by on &str |
✓ (with caveat — see below) |
expected_declass_exemption |
most specific | last-seen placeholder, Phase 3 TODO | ✓ (acknowledged TODO) |
The (S//NF) + (DEU C//REL TO USA, DEU, NATO) example is covered by the noforn_supersedes_rel_to test: any NOFORN portion zeroes out the expected REL TO, forcing the banner to carry NOFORN instead. Matches the spec exactly.
Engine integration (crates/marque-engine/src/engine.rs): the lint loop builds PageContext in document order, accumulates every MarkingType::Portion parse, and hands banner/CAB candidates an Arc<PageContext> via RuleContext.page_context. The Arc is cached per page-snapshot and invalidated on every new portion — so successive banners in the same page share one allocation, and a banner after new portions gets a fresh snapshot. Clean.
Correctness observations
O-1 — Lexicographic date comparison is subtly correct for this encoding, worth a comment
page_context.rs:186 — .max_by(|a, b| a.cmp(b)) on declass date strings. This works correctly because:
YYYYMMDDvsYYYYMMDD→ pure lex = chronological orderingYYYYvsYYYYMMDDwith shared prefix → shorter (YYYY) < longer (YYYYMMDD), which matches the semantic "YYYY= Jan 1 of YYYY, which is ≤ any date within that year"- Different
YYYYs → lex order = chronological
So the ordering happens to be correct under the "YYYY means Jan 1" convention documented in is_declass_date. Worth a two-line inline comment on the max_by pinning the invariant — if someone changes the encoding to "YYYY = end of year" in the future, the lex compare silently goes wrong in the mixed case.
O-2 — PageContext never resets between pages
engine.rs:82 has the correct TODO(phase-3) noting that page-break detection is a scanner-level concern. Today the context accumulates across the entire document, so a banner on "page 2" sees portion data from "page 1". For Phase 2 single-unit markings this is correct; for paginated documents it over-restricts. Phase 3 scanner work will add page-break candidates that reset the accumulator. No action needed now.
O-3 — page_context.add_portion(parsed.attrs.clone()) clones IsmAttributes per portion
engine.rs:97 — IsmAttributes has Box<[T]> fields whose clone is O(n) in slice contents. For documents with hundreds of portions this adds up. A Phase 3 optimization could store Arc<IsmAttributes> in PageContext::portions so add_portion is O(1). Not a blocker — portion counts per document are typically small.
O-4 — PageContext accessor methods allocate a BTreeSet + Vec per call
expected_sci_controls() etc. recompute the union from scratch on each invocation. A Phase 3 banner-vs-portion validation rule that calls multiple expected_* helpers would re-walk the portions list each time. Memoization (e.g., compute all aggregates once after the last portion is added, stash in OnceCell) is a Phase 3 optimization, not needed today.
O-5 — expected_declass_exemption Phase 3 TODO on duration-aware ordering — already called out in the doc comment. EO 13526 §3.3(b) duration ranking (50X1-HUM > 50X1 > 25X1 etc.) needs a hand-coded comparator because the CVE-defined Ord on DeclassExemption is alphabetical on the CVE token, not semantically ordered by retention length. Correctly deferred.
Rustfmt drift
Two files have trivial single-line reformatting that cargo fmt --check catches:
crates/marque-core/src/parser.rs(13 insertions / some reflow)crates/marque-ism/src/page_context.rs(one.into_boxed_slice()line wrap in a test fixture)
Landing in a follow-up style: commit.
Decision
APPROVE. Copilot's architecture answer is tight and matches the stated functional requirements exactly. The four observations above are Phase 3 polish / documentation, not blockers. The only code change needed on top of b8789c8 is the rustfmt drift — trivial.
Suggested small improvements (can land with the Phase 3 opener or as follow-ups):
- O-1: two-line comment pinning the lex-compare invariant on
expected_declassify_on - O-3:
Arc<IsmAttributes>internally inPageContext::portionsonce portion counts grow - O-4: memoize
expected_*aggregates inOnceCellafter the lastadd_portion— needs a "sealed" state onPageContext - Phase 3
duration_aware_maxforDeclassExemption(already noted in code)
Ready to merge.
Resolves 13 findings from the second-pass local review of PR #3 (3 HIGH, 5 MEDIUM, 4 LOW, 1 Observation). Zero correctness bugs in the previous state — these are API hardening, test coverage, and documentation fixes that were noted as "not blocking" but worth landing before Phase 3. HIGH - H-1 BatchError::is_panic() / is_cancelled() forwarding from JoinError, plus variant-aware Display. CI callers can now distinguish application panics from cancellation without downcasting through source(). - H-2 CLAUDE.md Architectural Invariants section pins the contracts enforced only by convention: __engine_promote engine-only, FixProposal purity, hardcoded Zone/position, PageContext page-reset, Severity::Off non-firing. - H-3 fix_with_threshold_rejects_infinity test pins INFINITY and NEG_INFINITY rejection against future refactors of the range+is_nan check. MEDIUM - M-1 build.rs assert_required helper panics at build time if any non-SAR CVE file produces zero entries. An empty CVEnumISMDissem.xml from a bad schema copy would otherwise silently make all dissem rules fire zero diagnostics. - M-2 resolve_idents uses HashMap<String, String> so duplicate- ident collisions name both offending CVE values in the panic. - M-3 TRIGRAPHS now emitted via BTreeSet (sorted, deduped); is_trigraph uses binary_search instead of O(n) linear scan. New trigraphs_are_sorted_and_unique test pins the invariant. - M-4 confidence_below_default_threshold_is_excluded and confidence_at_default_threshold_is_included tests pin the >= semantics at the 0.94 / 0.95 boundary. - M-5 zero_length_span_fix_is_filtered_before_sort test pins the !span.is_empty() filter that masks Phase 2 placeholder spans from the C-1 overlap guard. LOW - L-1 BatchOptions.max_concurrent_docs doc explains both roles it plays (ConcurrencyController semaphore AND buffer_unordered). - L-2 merge_user_into treats empty / whitespace-only user-field strings from .marque.local.toml as absent, so they can't silently overwrite populated values from another layer. - L-3 Severity doc rewritten to describe actual last-write-wins merge semantics instead of an aspirational .max() design that the loader never implemented. - L-4 config_supplied_threshold_filters_proposals test uses Config::set_confidence_threshold(0.5) to exercise the Config path (all other threshold tests use the fix_with_threshold override). Observation - O-1 expected_declassify_on carries an inline comment pinning why lex-compare is semantically correct under the documented "YYYY = Jan 1" encoding, with a warning that changing the convention to "YYYY = Dec 31" would silently break it. Tests: 66 unit + 1 doctest, zero failures (up from 60: +5 engine, +1 marque-ism). Clippy clean, rustfmt clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 3 (User Story 1, P1, MVP) lands the lint side of marque
end-to-end. Nine CAPCO rules (E001-E008, W001) emit byte-precise
diagnostics threaded through a new IsmAttributes::token_spans
sidecar. The marque check CLI is rewritten to match contracts/cli.md
verbatim, and 24 invalid + 20 valid corpus fixtures plus a 1010-line
prose precision corpus pin every rule against golden .expected.json.
Test count rises from 77 to 116 (+39), zero failures.
Architectural changes:
- Span and structure plumbing:
marque-ism gets TokenKind/TokenSpan { kind, span, text } and
IsmAttributes.token_spans: Box<[TokenSpan]>. The parser walks
separator positions via match_indices and records absolute byte
offsets for every recognized token, including TokenKind::Unknown
for blocks the parser cannot classify. Banner-form full-word
dissem controls (NOFORN, ORCON, IMCON) now parse via a new
parse_dissem_full_form fallback so the CVE-only abbreviations
(NF, OC, IMC) do not surface as Unknown.
- Scanner page breaks:
marque-ism gets MarkingType::PageBreak. The scanner emits a
zero-length candidate at every form-feed (\f) and at the third
consecutive newline of a \n\n\n+ run. \n\n (normal paragraph
break) does NOT trip the reset. The engine resets PageContext
BEFORE attempting to parse a PageBreak candidate so a corrupt
candidate cannot block the reset. Closes the engine.rs:80
TODO(phase-3) marker from the PR #3 review.
- RuleContext zone/position:
Both fields are now Option<Zone> / Option<DocumentPosition> and
the engine passes None until a structural scanner pass can prove
a value. The Phase-2 hardcoded Body was a silent lie to any
future rule that read the field; Phase 3 makes the uncertainty
type-system-enforced. Closes the engine.rs:103 TODO(phase-3)
marker from the PR #3 review.
Phase 3 rules (marque-capco/src/rules.rs, full rewrite):
- E001 banner-abbreviation: fires on abbreviated dissem in banners.
Span points at the literal abbreviation bytes via the parallel-
indexed DissemControl token spans.
- E002 missing-usa-trigraph: fires when REL TO is non-empty and
USA is missing or not first. Span points at the first trigraph.
- E003 misordered-blocks: walks token_spans, computes the CAPCO
ordinal per block kind (Class < SCI < SAR < Dissem/RelTo), fires
if any descending step appears. Span covers the whole marking.
Confidence 0.6 — kept as suggestion under default 0.95 threshold.
- E004 separator-count: detects back-to-back // separator runs by
walking adjacent Separator token spans whose end == next start.
Replaces the Phase-2 vec![] stub. Confidence 0.99.
- E005 declassify-in-banner: fires when a banner has a declass
exemption or date inline. Span points at the declass token.
- E006 deprecated-dissem: walks both DissemControl tokens (FOUO,
in modern CVE) and Unknown tokens (LIMDIS, removed from CVE).
Looks each up in find_migration; hits whose replacement is a
known dissem control fire E006. Skips abbreviation expansions
(NF to NOFORN) — those are owned by E001.
- E007 x-shorthand-date: walks Unknown tokens (the deprecated
dashed forms 25X1- / 50X1- land here because the CVE has only
the canonical 25X1 / 50X1-HUM). Migration table lookup via
TokenSpan.text avoids the false-positive of firing on the
canonical form.
- E008 unrecognized-token: walks Unknown tokens and skips ones
E007 will pick up (find_migration().is_some()). FR-012: no fix
offered.
- W001 deprecated-marking-warning: scaffold present, returns
vec![] in Phase 3 because the seed MIGRATIONS table has no
W001-flagged entries. Adding a flag to MigrationEntry in a
future build.rs change starts firing diagnostics with no other
code edits.
CLI rewrite (marque/src/main.rs + new marque/src/render.rs):
- Stdin sentinel: a path of `-` reads from stdin. No path given
also reads from stdin. Mixed paths and `-` allowed.
- --config <PATH>: short-circuits the upward walk and uses the
specified path's parent as the load root.
- --confidence-threshold <FLOAT>: per-call override for the fix
confidence gate.
- --format human|json: defaults to human for TTY stdout, json
otherwise. JSON is NDJSON conforming to contracts/diagnostic.json
with derive(Serialize) structs (no extra fields).
- --no-color: honors NO_COLOR env var (any non-empty value) and
TERM=dumb. Suppresses ANSI in human format.
- -q / --quiet: suppresses non-diagnostic stderr narration.
- --explain-config: dumps the merged Configuration as pretty JSON,
exits 0. Mutually exclusive with input paths and with the `fix`
subcommand. classifier_id_present is a boolean — the value
itself is NEVER emitted, even when MARQUE_CLASSIFIER_ID is set.
- Exit codes: 0 clean, 1 errors (Severity::Error or Severity::Fix),
2 warnings only, 64 EX_USAGE, 65 EX_DATAERR, 74 EX_IOERR.
Severity::Fix maps to exit 1 because a Fix-severity diagnostic
is still a violation that should block CI.
Corpus (24 invalid + 20 valid + 1 prose):
- 3 fixtures per rule for E001-E008 with hand-computed byte spans
validated by running each fixture through the integration test.
- W001 fixtures deferred per the plan (the rule cannot fire on
real corpus until a W001-flagged migration entry lands).
- 20 known-good fixtures spanning portion, banner, and CAB types.
- tests/corpus/prose/article.txt: 1010 lines of public-domain
Federalist No. 10 prose with 22 incidental (S) / (a) / (i)
mid-sentence parens. The lint is byte-clean against this corpus
— the SC-003a precision gate is satisfied.
Tests added (39 new):
- crates/marque-core/src/scanner.rs::tests — 4 page-break tests
- crates/marque-core/src/parser.rs::tests — 5 token-span tests
- crates/marque-engine/src/engine.rs::tests — multi-page document
- crates/marque-capco/src/rules.rs::tests — 13 per-rule tests
with byte-span assertions
- crates/marque-capco/tests/rules_us1.rs — corpus harness driving
every fixture through the parser+rules with .expected.json
match (drives parser/scanner directly to avoid a circular
marque-engine dev-dep)
- crates/marque-engine/tests/lint_pipeline.rs — 13 happy-path /
edge-case tests + 2 insta JSON snapshots pinning the
contracts/diagnostic.json shape against drift
- marque/tests/cli_smoke.rs — 9 CLI smoke tests via assert_cmd
covering stdin sentinel, --explain-config no-leak, NO_COLOR,
exit codes, and mutual exclusion
PR #3 review TODO(phase-3) markers closed:
- engine.rs:80 — PageContext page-break reset (Tasks 1, 2)
- engine.rs:103 — RuleContext.zone/position no longer hardcoded
- marque-capco/src/rules.rs Span::new(0, 0) placeholders — gone
- E004 stub vec![] — replaced with real implementation
- page_context.rs:50 Phase-3 wiring TODO — closed
CLAUDE.md "Architectural Invariants" section updated to reflect
both Phase-3 closures.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…──────────────────────�[0m
�[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m 1�[0m �[38;5;238m│�[0m �[38;5;231mfix(sar): address remaining PR review comments�[0m
�[38;5;238m 2�[0m �[38;5;238m│�[0m
�[38;5;238m 3�[0m �[38;5;238m│�[0m �[38;5;231mThree follow-ups to copilot's partial fix in cfcce23:�[0m
�[38;5;238m 4�[0m �[38;5;238m│�[0m
�[38;5;238m 5�[0m �[38;5;238m│�[0m �[38;5;231m1. Full-form SAR compartments (Copilot review #1, partially unaddressed):�[0m
�[38;5;238m 6�[0m �[38;5;238m│�[0m �[38;5;231m Unifies abbrev and full-form program parsing via a single�[0m
�[38;5;238m 7�[0m �[38;5;238m│�[0m �[38;5;231m `parse_sar_program(chunk, base, indicator, spans)`. Full-form�[0m
�[38;5;238m 8�[0m �[38;5;238m│�[0m �[38;5;231m program identifier now allows uppercase ASCII letters and spaces�[0m
�[38;5;238m 9�[0m �[38;5;238m│�[0m �[38;5;231m only (no digits, no hyphens — the first `-` is still the�[0m
�[38;5;238m 10�[0m �[38;5;238m│�[0m �[38;5;231m program/compartment boundary). Grammar per §H.5 p100. Adds tests�[0m
�[38;5;238m 11�[0m �[38;5;238m│�[0m �[38;5;231m for `SPECIAL ACCESS REQUIRED-BUTTER POPCORN-J12 J54` and rejects�[0m
�[38;5;238m 12�[0m �[38;5;238m│�[0m �[38;5;231m digit-shaped nicknames like `SPECIAL ACCESS REQUIRED-123`. E030�[0m
�[38;5;238m 13�[0m �[38;5;238m│�[0m �[38;5;231m coalescing now preserves compartments when collapsing repeated�[0m
�[38;5;238m 14�[0m �[38;5;238m│�[0m �[38;5;231m full-form blocks.�[0m
�[38;5;238m 15�[0m �[38;5;238m│�[0m
�[38;5;238m 16�[0m �[38;5;238m│�[0m �[38;5;231m2. E030 fix span (Copilot review #2):�[0m
�[38;5;238m 17�[0m �[38;5;238m│�[0m �[38;5;231m Replace `tok.span.start.saturating_sub(2)` with a backward walk�[0m
�[38;5;238m 18�[0m �[38;5;238m│�[0m �[38;5;231m over `attrs.token_spans` to the nearest preceding Separator�[0m
�[38;5;238m 19�[0m �[38;5;238m│�[0m �[38;5;231m token. The parser trims per-block whitespace, so the Unknown�[0m
�[38;5;238m 20�[0m �[38;5;238m│�[0m �[38;5;231m token's span is not flush against the `//`; anchoring on the�[0m
�[38;5;238m 21�[0m �[38;5;238m│�[0m �[38;5;231m actual Separator span keeps the fix correct when source has�[0m
�[38;5;238m 22�[0m �[38;5;238m│�[0m �[38;5;231m whitespace after the separator (`SECRET// SAR-CD`). `original`�[0m
�[38;5;238m 23�[0m �[38;5;238m│�[0m �[38;5;231m is reconstructed from the separator text plus the gap so splicing�[0m
�[38;5;238m 24�[0m �[38;5;238m│�[0m �[38;5;231m remains byte-accurate.�[0m
�[38;5;238m 25�[0m �[38;5;238m│�[0m
�[38;5;238m 26�[0m �[38;5;238m│�[0m �[38;5;231m3. Dead code in parser (Copilot review #3):�[0m
�[38;5;238m 27�[0m �[38;5;238m│�[0m �[38;5;231m Remove `let _ = sar_captured;` — the variable is already read in�[0m
�[38;5;238m 28�[0m �[38;5;238m│�[0m �[38;5;231m the capture-gating branch; the silencer was stale.�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
…──────────────────────�[0m
�[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m 1�[0m �[38;5;238m│�[0m �[38;5;231mfeat(ism): add SCI structural data model (P1 of #3)�[0m
�[38;5;238m 2�[0m �[38;5;238m│�[0m
�[38;5;238m 3�[0m �[38;5;238m│�[0m �[38;5;231mIntroduce SciMarking / SciControlSystem / SciCompartment and generate�[0m
�[38;5;238m 4�[0m �[38;5;238m│�[0m �[38;5;231mSciControlBare from the live CVE at build time. Adds sci_markings�[0m
�[38;5;238m 5�[0m �[38;5;238m│�[0m �[38;5;231malongside the existing sci_controls projection for back-compat.�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
…──────────────────────�[0m
�[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m 1�[0m �[38;5;238m│�[0m �[38;5;231mfeat(core): SCI structural subparser (P2 of #3)�[0m
�[38;5;238m 2�[0m �[38;5;238m│�[0m
�[38;5;238m 3�[0m �[38;5;238m│�[0m �[38;5;231mAdds parse_sci_block dispatched before the existing CVE exact-match�[0m
�[38;5;238m 4�[0m �[38;5;238m│�[0m �[38;5;231mpath. Produces SciMarking entries for compartments and sub-compartments�[0m
�[38;5;238m 5�[0m �[38;5;238m│�[0m �[38;5;231mper CAPCO-2016 §A.6, while preserving the sci_controls enum projection�[0m
�[38;5;238m 6�[0m �[38;5;238m│�[0m �[38;5;231mfor back-compat with E010/E011 and existing token_spans consumers.�[0m
�[38;5;238m 7�[0m �[38;5;238m│�[0m
�[38;5;238m 8�[0m �[38;5;238m│�[0m �[38;5;231mThe dispatch is gated on `contains('-') || contains('/') ||�[0m
�[38;5;238m 9�[0m �[38;5;238m│�[0m �[38;5;231mis_bare_cve_value` so plain two-letter tokens (NF, RD, etc.) continue�[0m
�[38;5;238m 10�[0m �[38;5;238m│�[0m �[38;5;231mto fall through to the dissem/non-IC/SAR/AEA chain. Custom controls�[0m
�[38;5;238m 11�[0m �[38;5;238m│�[0m �[38;5;231madditionally reject any text that parses as another known category so�[0m
�[38;5;238m 12�[0m �[38;5;238m│�[0m �[38;5;231mmixed-category slash blocks like `SI/NF` still surface as Unknown for�[0m
�[38;5;238m 13�[0m �[38;5;238m│�[0m �[38;5;231mE004.�[0m
�[38;5;238m 14�[0m �[38;5;238m│�[0m
�[38;5;238m 15�[0m �[38;5;238m│�[0m �[38;5;231mcanonical_enum is populated (a) when the bare control itself matches a�[0m
�[38;5;238m 16�[0m �[38;5;238m│�[0m �[38;5;231mCVE value — preserves existing behaviour for `SI`, `TK`, `HCS` — and�[0m
�[38;5;238m 17�[0m �[38;5;238m│�[0m �[38;5;231m(b) when `{ctrl}-{first_comp}` matches a CVE value AND the first�[0m
�[38;5;238m 18�[0m �[38;5;238m│�[0m �[38;5;231mcompartment has no sub-compartments, per data-model §canonical_enum.�[0m
�[38;5;238m 19�[0m �[38;5;238m│�[0m
�[38;5;238m 20�[0m �[38;5;238m│�[0m �[38;5;231mAlso re-exports `marque_ism::is_bare_cve_value` from the generated CVE�[0m
�[38;5;238m 21�[0m �[38;5;238m│�[0m �[38;5;231mmodule so the subparser can gate on the live vocabulary rather than a�[0m
�[38;5;238m 22�[0m �[38;5;238m│�[0m �[38;5;231mhardcoded list.�[0m
�[38;5;238m 23�[0m �[38;5;238m│�[0m
�[38;5;238m 24�[0m �[38;5;238m│�[0m �[38;5;231mTests: 12 new parser tests covering bare, compound, multi-system,�[0m
�[38;5;238m 25�[0m �[38;5;238m│�[0m �[38;5;231msub-compartments, the §A.6 p16 canonical example, custom controls,�[0m
�[38;5;238m 26�[0m �[38;5;238m│�[0m �[38;5;231mrejection cases, and the SI/NF fall-through invariant. All 75 marque-�[0m
�[38;5;238m 27�[0m �[38;5;238m│�[0m �[38;5;231mcore tests + 94 marque-capco tests pass. Workspace test summary:�[0m
�[38;5;238m 28�[0m �[38;5;238m│�[0m �[38;5;231m411 passed, 1 failed (pre-existing SC-006 README issue in�[0m
�[38;5;238m 29�[0m �[38;5;238m│�[0m �[38;5;231mmarque-config, unrelated).�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
…──────────────────────�[0m
�[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m 1�[0m �[38;5;238m│�[0m �[38;5;231mfeat(ism): SCI banner roll-up via PageContext (P4 of #3)�[0m
�[38;5;238m 2�[0m �[38;5;238m│�[0m
�[38;5;238m 3�[0m �[38;5;238m│�[0m �[38;5;231mPageContext::expected_sci_markings() unions compartments and�[0m
�[38;5;238m 4�[0m �[38;5;238m│�[0m �[38;5;231msub-compartments across portions, sorted per CAPCO-2016 §A.6 p15�[0m
�[38;5;238m 5�[0m �[38;5;238m│�[0m �[38;5;231m(numeric first, alpha after). render_expected_banner() uses the�[0m
�[38;5;238m 6�[0m �[38;5;238m│�[0m �[38;5;231mstructural form for SCI when markings are present.�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
…──────────────────────�[0m
�[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m 1�[0m �[38;5;238m│�[0m �[38;5;231mfeat(capco): SCI compartment-aware rules (P3 of #3)�[0m
�[38;5;238m 2�[0m �[38;5;238m│�[0m
�[38;5;238m 3�[0m �[38;5;238m│�[0m �[38;5;231mAdds E032 sci-system-order, E033 sci-compartment-order,�[0m
�[38;5;238m 4�[0m �[38;5;238m│�[0m �[38;5;231mE034 sci-custom-control-info, E035 sci-banner-rollup.�[0m
�[38;5;238m 5�[0m �[38;5;238m│�[0m �[38;5;231mAudits E010 bare-hcs to detect structural HCS-with-no-compartments.�[0m
�[38;5;238m 6�[0m �[38;5;238m│�[0m �[38;5;231mExtends E008 skip filter so structurally-formed SCI blocks are�[0m
�[38;5;238m 7�[0m �[38;5;238m│�[0m �[38;5;231mnot flagged as Unknown.�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
CLAUDE.md SCI subsection + rule count bump (29 -> 33). README.md feature mention. marque-ism migration note.
…──────────────────────�[0m
�[38;5;238m│ �[0m�[1mSTDIN�[0m
�[38;5;238m─────┼──────────────────────────────────────────────────────────────────────────�[0m
�[38;5;238m 1�[0m �[38;5;238m│�[0m �[38;5;231mtest(corpus): SCI compartment fixtures (P5 of #3)�[0m
�[38;5;238m 2�[0m �[38;5;238m│�[0m
�[38;5;238m 3�[0m �[38;5;238m│�[0m �[38;5;231mCanonical §A.6 p16 example, bare single-sub, multi-sub,�[0m
�[38;5;238m 4�[0m �[38;5;238m│�[0m �[38;5;231mcustom-control, system-order violation, compartment-order�[0m
�[38;5;238m 5�[0m �[38;5;238m│�[0m �[38;5;231mviolation, and sub-compartment-order violation. Verifies�[0m
�[38;5;238m 6�[0m �[38;5;238m│�[0m �[38;5;231mstructural parser + E032/E033.�[0m
�[38;5;238m 7�[0m �[38;5;238m│�[0m
�[38;5;238m 8�[0m �[38;5;238m│�[0m �[38;5;231mValid fixtures (valid/):�[0m
�[38;5;238m 9�[0m �[38;5;238m│�[0m �[38;5;231m sci_bare_single_sub SECRET//SI-G ABCD//NOFORN�[0m
�[38;5;238m 10�[0m �[38;5;238m│�[0m �[38;5;231m sci_multi_subs SECRET//HCS-P INTEL OPS//NOFORN�[0m
�[38;5;238m 11�[0m �[38;5;238m│�[0m
�[38;5;238m 12�[0m �[38;5;238m│�[0m �[38;5;231mInvalid fixtures (invalid/):�[0m
�[38;5;238m 13�[0m �[38;5;238m│�[0m �[38;5;231m sci_canonical_subcompartments §A.6 p16 canonical (E034)�[0m
�[38;5;238m 14�[0m �[38;5;238m│�[0m �[38;5;231m sci_custom_control_info 99/SI custom control (E034)�[0m
�[38;5;238m 15�[0m �[38;5;238m│�[0m �[38;5;231m sci_system_order SI/123 (E032,E034)�[0m
�[38;5;238m 16�[0m �[38;5;238m│�[0m �[38;5;231m sci_compartment_order SI-NK-G (E033)�[0m
�[38;5;238m 17�[0m �[38;5;238m│�[0m �[38;5;231m sci_subcompartment_order SI-G DEFG ABCD (E033)�[0m
�[38;5;238m 18�[0m �[38;5;238m│�[0m
�[38;5;238m 19�[0m �[38;5;238m│�[0m �[38;5;231mHarness wiring:�[0m
�[38;5;238m 20�[0m �[38;5;238m│�[0m �[38;5;231m- marque-engine corpus_accuracy skips E034-expected diagnostics:�[0m
�[38;5;238m 21�[0m �[38;5;238m│�[0m �[38;5;231m E034 ships Severity::Off so the engine loop correctly does not�[0m
�[38;5;238m 22�[0m �[38;5;238m│�[0m �[38;5;231m fire it; rules_us1 exercises E034 by bypassing severity gating.�[0m
�[38;5;238m 23�[0m �[38;5;238m│�[0m �[38;5;231m- marque corpus_provenance treats structural SCI tokens�[0m
�[38;5;238m 24�[0m �[38;5;238m│�[0m �[38;5;231m (`SI-G ABCD`, `HCS-P INTEL OPS`) as in-vocabulary: they're�[0m
�[38;5;238m 25�[0m �[38;5;238m│�[0m �[38;5;231m parsed structurally rather than CVE-matched.�[0m
�[38;5;238m 26�[0m �[38;5;238m│�[0m
�[38;5;238m 27�[0m �[38;5;238m│�[0m �[38;5;231mDeviations / follow-up:�[0m
�[38;5;238m 28�[0m �[38;5;238m│�[0m �[38;5;231m- E035 banner-rollup fixture intentionally omitted. P4 delivered�[0m
�[38;5;238m 29�[0m �[38;5;238m│�[0m �[38;5;231m `PageContext::expected_sci_markings() -> Box<[SciMarking]>` but�[0m
�[38;5;238m 30�[0m �[38;5;238m│�[0m �[38;5;231m the rule's defensive fallback trait in marque-capco still�[0m
�[38;5;238m 31�[0m �[38;5;238m│�[0m �[38;5;231m declares `fn expected_sci_markings(&self) -> Vec<SciMarking>`,�[0m
�[38;5;238m 32�[0m �[38;5;238m│�[0m �[38;5;231m so the trait default (empty Vec) wins via method resolution.�[0m
�[38;5;238m 33�[0m �[38;5;238m│�[0m �[38;5;231m Result: E035 is inert in production. The author intended the�[0m
�[38;5;238m 34�[0m �[38;5;238m│�[0m �[38;5;231m method-resolution probe pattern to pick up the inherent method�[0m
�[38;5;238m 35�[0m �[38;5;238m│�[0m �[38;5;231m when P4 landed, but the return-type mismatch blocks that.�[0m
�[38;5;238m 36�[0m �[38;5;238m│�[0m �[38;5;231m Fix belongs in a separate commit (align the fallback signature�[0m
�[38;5;238m 37�[0m �[38;5;238m│�[0m �[38;5;231m to `Box<[SciMarking]>`); a corresponding invalid fixture should�[0m
�[38;5;238m 38�[0m �[38;5;238m│�[0m �[38;5;231m be added alongside that fix.�[0m
�[38;5;238m─────┴──────────────────────────────────────────────────────────────────────────�[0m
Six comments addressed: 1. `resolve_request_deadline` doc comment said it returned `Ok(Some)/Ok(None)`, but the function returns `Result<Duration, StatusCode>`. Rewritten to match the actual API contract and document each branch. 2. Empty header value treated as parse error (400). Spec/contract says empty should fall back to default. Now `s.trim().is_empty()` short-circuits to the per-endpoint default — same convention as absent. Catches the "template substitution emitted nothing" case where a caller's tooling produces `X-Marque-Deadline: ` without intending to set a budget. 3+4. Both `lint_handler` and `fix_handler` validated the deadline header AFTER `serde_json::from_slice`, so a request with a malformed body AND a bad header would surface as 422 instead of the intended 400. Reordered: corpus-override checks (header + query) → deadline header validation → body deserialization → body-field corpus-override check → engine call. The `Instant::now()` stamp stays just before the engine call so JSON parse time isn't billed against the caller's budget. 5. `deadline_header_above_cap_returns_400` claimed to cover the default cap (60 000 ms) but built its header value from `MAX_DEADLINE_CAP_MS` (600 000 ms — the hard ceiling, a different invariant). Switched to `DEFAULT_DEADLINE_CAP_MS + 1` so the boundary is pinned correctly. Comment updated to call out the cap vs. ceiling distinction. 6. Added three tests: `deadline_header_empty_uses_endpoint_default`, `deadline_header_whitespace_uses_endpoint_default`, and `deadline_header_validated_before_body_deserialization` (the ordering-invariant pin for #3/#4 — bad-header + bad-body must produce 400, not 422). Tests: 55/55 server pass (was 52, +3 new). Workspace 1242/1242 nextest pass; cargo fmt/clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three follow-up Copilot findings on the previous citation fixup: 1. `crates/capco/src/scheme.rs:4202` — the `FDR_DOMINATORS` header comment still called §B.3 Table 2 the authoritative enumeration AFTER my new maintenance note correctly cited §B.3.a p19. Updated the header to cite §B.3.a p19 (canonical) + §B.3 Table 2 pp 21-22 (derivative summary table) so the file no longer contradicts itself. 2. `crates/scheme/src/vocabulary.rs:343` — trait-level doc bullet #3 described `FDR_DOMINATORS` as the static slice `is_fdr_dominator` uses. After the maintenance contract reword, this is misleading. Rewrote bullet #3 to make the distinction explicit: `is_fdr_dissem` answers FD&R-set membership (admits RELIDO); `is_fdr_dominator` answers "dominator over RELIDO" for the conflict-family role (excludes RELIDO). Added the same "do not delegate" guidance the in-tree maintenance contract carries. 3. `crates/capco/src/vocabulary.rs:1024` — overstated `AnyInCategory` admission semantics. The override receives a single `TokenId` and routes through `capco_token_category` to hit category arms; adding a new `AnyInCategory(CAT_NEW)` to `FDR_DOMINATORS` only admits tokens that `capco_token_category` routes to `CAT_NEW`. Reworded with the explicit two-path dual-update rule (Token entries auto; AnyInCategory entries require a routing arm). All workspace tests pass; clippy clean. No behavioral changes.
…site (Copilot R2 suppressed #2 + inverse case) PR 4b-D.2 commit 13. Items 1, 2, 3 from Copilot R2. The D22 fix landed in PR 4b-D.2 commit 3 routed NOFORN FactAdd through DissemSet's token-axis supersession overlay. Copilot R2 surfaced two completion gaps the original commit missed: (a) Country-list axes stayed populated. `apply_fact_add` evicted `Rel` / `Relido` / `Displayonly` / `Eyes` TOKENS from `dissem_us` but did NOT clear the parallel `attrs.rel_to: Box<[CountryCode]>` / `attrs.display_only_to: Box<[CountryCode]>` country-list fields. The closure / PageRewrite paths cleared them via the `capco/noforn-clears-rel-to` / `capco/noforn-clears-display-only-to` rewrites; but direct `apply_intent` callers (E021 AEA → NOFORN, E038 NODIS/EXDIS → NOFORN) bypass `scheme.project`'s PageRewrite loop and got a marking with NOFORN + populated country lists — invalid per §H.8 p145. (b) Inverse case: FactAdd of `RELIDO` / `DISPLAY ONLY` / `EYES` onto a marking with `dissem_us = [Nf]` was APPENDING the dominated token. Result: `dissem_us = [Nf, Relido]` — same §H.8 p145 violation as (a), opposite direction. The existing double-NOFORN-insertion guard didn't fire because target ≠ Nf; the supersession check needed an explicit inverse-case branch. Fix in `crates/capco/src/scheme/actions/intent.rs::apply_fact_add` CAT_DISSEM branch: 1. NEW: after `with_noforn_injected` rebuilds `dissem_us`, unconditionally clear `attrs.rel_to` and `attrs.display_only_to` if they're non-empty. Brings the full §H.8 p145 supersession to the injection site, not just the token-axis half. 2. NEW: at the start of the dissem branch, if `target` is one of `Rel` / `Relido` / `Displayonly` / `Eyes` AND `attrs.dissem_us` already contains `Nf`, return `IntentInapplicable` — the caller's FactAdd is dominated by existing state. Matches the existing idempotency guard for double-NOFORN insertion (no mutation, audit log doesn't see an applied no-op). The PageRewrite layer (`capco/noforn-clears-*`) remains as defense-in-depth. With Item 1 in place, the PageRewrites are defensive-redundant on the production path; both layers converge to the same correct §H.8 p145 output. Authority: §H.8 p145 ("NOFORN ... Cannot be used with REL TO, RELIDO, EYES ONLY, or DISPLAY ONLY") + §D.2 Table 3 rows 1-2 + §H.8 p157 (EYES ONLY: NSA-only, retains `DissemControl::Eyes` through lint). Re-verified 2026-05-18 against `crates/capco/docs/CAPCO-2016.md`. ## Item 2 — fix the broken display_only_to integration test Copilot R2 #2 was right: the previous `noforn_clears_display_only_to_via_cross_portion_join` test didn't exercise the new `capco/noforn-clears-display-only-to` rewrite. `PageContext::expected_display_only` short-circuits to empty whenever ANY portion has NOFORN (page_context.rs:881-896); with portion 1 NOFORN-bearing, `out.display_only_to` was empty at join time BEFORE the rewrite ran. The test passed regardless of whether the rewrite existed. Renamed to `noforn_clears_display_only_to_via_ucni_promote` and pivoted the fixture to: - Portion 1: classified DOD UCNI + DISPLAY ONLY USA, GBR (the UCNI carries the §H.6 p116 strip-and-promote trigger; the DISPLAY ONLY ensures portion 1 passes the row-19 all-or-nothing gate in `expected_display_only`). - Portion 2: classified DISPLAY ONLY USA (also contributes display-permission so the gate doesn't short-circuit). Verification by toggle: - WITH `apply_fact_add` clearing AND the PageRewrite: test passes. - WITHOUT `apply_fact_add` clearing (Item 1 disabled), WITH the PageRewrite: test still passes — PageRewrite catches the violation. - WITHOUT the PageRewrite (Item 1 enabled, rewrite deleted): test passes — `apply_fact_add` catches it. - Hypothetical: WITHOUT BOTH: test fails. Both layers are load-bearing for their respective injection paths. The integration test now asserts the §H.8 p145 banner invariant holds through the full `scheme.project` pipeline regardless of which layer cleared. Defense-in-depth coverage documented inline. ## Item 3 — replace misleading example in noforn_clears.rs Copilot R2 #3 was right: `CLOSURE_NOFORN_SAR` cannot be the load-bearing trigger for the `noforn-clears-display-only-to` rewrite — DISPLAY ONLY is in `FDR_DOMINATORS` (closure.rs:95-109) which SUPPRESSES all 7 NOFORN-implies closure rules on any portion that carries DISPLAY ONLY. The realistic trigger is a Pattern-A/C PageRewrite (UCNI promote, NODIS-implies-NF) injecting NOFORN post-join. Rewrote both occurrences (line 65 + line 191 in noforn_clears.rs) to cite the Pattern-C UCNI scenario and explain why closure rules can't be the load-bearing path on this axis. Also documented the post-R2-#1 defense-in-depth framing: `apply_fact_add` is now self-sufficient; the rewrite covers future-refactor cases that bypass it. ## D22 amendment `decisions.md` D22 entry expanded to record the post-R2 completed scope: (a) token-axis eviction (PR 4b-D.2 commit 3); (b) country-axis clearing (Copilot R2 #1 commit 13); (c) inverse-case rejection (Copilot R2 #1 commit 13). All three axes now maintained at the apply_fact_add injection site, with the PageRewrite layer as defense-in-depth. ## Adjacency-walk findings Searched `FactAdd.*TOK_NOFORN|FactRef::Cve(TOK_NOFORN` across crates/ — production callers: - `crates/capco/src/rules_declarative.rs:1096` (E021 AEA add-NF) - `crates/capco/src/rules_declarative.rs:1504` (E038-style add-NF) - `crates/capco/src/scheme/rewrites/pattern_c.rs:219,269` (UCNI promotes) - `crates/capco/src/scheme/rewrites/pattern_a.rs:146,210,289,369` (nodis/exdis/sbu-nf/les-nf-implies-noforn) All 8 production NOFORN FactAdd sites now benefit from self-sufficient apply_fact_add clearing. The two E021/E038-style rules in rules_declarative.rs are the direct-caller paths Item 1's country-axis clearing specifically targets; the 6 Pattern-A/C rewrite paths go through `scheme.project`'s PageRewrite loop already, but Item 1 makes their apply_fact_add invocation self-sufficient too (defense-in-depth at the lower layer). Searched `FactAdd.*TOK_RELIDO|FactAdd.*TOK_DISPLAY_ONLY|FactAdd.*TOK_EYES` — ZERO production sites currently FactAdd dominated tokens directly. The inverse-case rejection is correctness-by-construction for future rule emissions. ## Verification gates cargo +stable fmt --all --check : clean cargo +stable build --workspace : clean cargo +stable clippy --workspace ... : clean cargo +stable test --workspace : 0 failures Specific test runs: - `category_action_intent::apply_fact_add_*` (6 new tests): all pass - `closure_hotpath::noforn_clears_display_only_to_*` (2 tests, rewrote one + kept one): all pass Tracking issue #528 (per-axis lattice Eq audit) remains the follow-up for the systematic `DissemSet`/`JointSet`/`SupersessionSet` review per PR #456's observational-state framing.
…dMarking) (#535) fix #1 — the doc-comment introduced in 2aad434 made an fix #4 — the test comment at projected::tests:: fixes #2, #3, and #6 — three independent prose-only updates Fix #2 — S007 top-level doc-block (`BareNatoRequiresRelToRule`), Fix #3 — `CapcoRuleSet::new()` registration comment. The Fix #6 — S005 retention annotation. The retention note on fix #5 — the migration trip-wire comments in
…, #524 Phase 3) Addresses Copilot's HIGH finding on PR #544: the non-FD&R suppressors in `RELIDO_US_CLASS_SUPPRESSORS` (CAT_DISSEM, CAT_NON_IC_DISSEM, CAT_AEA, CAT_SCI, CAT_SAR, CAT_NON_US_CLASSIFICATION, plus TOK_US_UNCLASSIFIED) violated the `MarkingScheme::closure` monotonicity contract (`crates/scheme/src/scheme.rs:476-483`). Concrete violation: - `x = (S)`: rule fires → `Cl(x).dissem_us = {RELIDO}` - `y = (S, FOUO)`: CAT_DISSEM matches → rule suppressed → `Cl(y).dissem_us = {FOUO}` (FOUO ⊄ supersession-related to RELIDO) - `{RELIDO} ⊄ {FOUO}` ⇒ `Cl(x) ⊑ Cl(y)` violated. ## Redesign - **New sentinel `TOK_US_COLLATERAL_CLASSIFIED` (TokenId 157, replacing the dropped `TOK_US_UNCLASSIFIED`)**: fires on US classification at or above Restricted level (`attrs.us_classification().is_some_and(|l| l != Unclassified)`). Upward-closed in the lattice order ⇒ adding facts to a collateral-classified marking can't make this predicate stop firing. Replaces the `TOK_US_CLASSIFIED` trigger; gates the §H.8 p154 Unclassified carve-out at the trigger level instead of via a suppressor. - **`RELIDO_US_CLASS_SUPPRESSORS` reduced to monotone-only entries**: FD&R dominators (NOFORN / RELIDO / DISPLAY_ONLY / CAT_REL_TO / EYES — §D.2 Table 3 pp.28-30 + §H.8 p145) plus the 6 per-compartment SCI sentinels (TOK_SI_G / TOK_HCS_O / TOK_HCS_P_SUB / TOK_TK_BLFH / TOK_TK_IDIT / TOK_TK_KAND). Each entry satisfies the §4.7.3 case-2 monotonicity property either directly (FD&R dominators supersede RELIDO via the precedence overlay) or by closure composition (SCI compartments add NOFORN via per-marking row → CAVEATED → §H.8 p145 supersession, so the composite Cl(y) contains NOFORN ⊐ RELIDO). - **"No other dissem" via composition**, not anti-monotone suppressors: any caveat marker triggers `CLOSURE_NOFORN_CAVEATED` (Trio 1) which injects NOFORN; NOFORN then supersedes the RELIDO cone via `with_noforn_injected`. FOUO (which is NOT in CAVEATED's trigger list) is handled by the §H.8 p134 `classification-evicts-fouo` PageRewrite — FOUO is stripped in classified context, so the closure's bare-RELIDO output is the correct final state per CAPCO after rewrites compose. ## Citation corrections (Copilot #2, #3) - `wasm/tests/native_parity.rs::compute_banner_single_secret_portion` comment: added §B.3 Table 2 p21 alongside §H.8 p154 (Copilot #2). - `lattice_vs_scheme_parity.rs` hoisted-rationale source 2: now cites §B.3 Table 2 p21 as primary authority with §H.8 p154 as grammar reference (Copilot #3). ## Suppressed comment (Copilot #4) The stale `CLOSURE_RELIDO_FOUO` reference in `us_class_suppression_fixture`'s CAT_DISSEM arm — flagged as a low-confidence suppressed Copilot comment — was removed alongside the CAT_DISSEM suppressor itself in this redesign. ## New monotonicity sanity pins Two new tests in `phase3_closure_pin` cover the specific scenarios Copilot flagged: - `us_class_monotone_under_added_fouo` — `x = (S) ⊑ y = (S, FOUO)`; asserts Cl(x) and Cl(y) both contain RELIDO (subset monotonicity). - `us_class_monotone_under_added_si_g` — `x = (S) ⊑ y = (S, SI-G)`; asserts Cl(y) contains NOFORN (via composition) and NOT RELIDO (NOFORN supersedes RELIDO via §H.8 p145; lattice-order monotonicity). The marque-scheme `proptest_closure` harness covers the general synthetic-scheme monotonicity property (passes for this redesign). ## Test fixture updates - `us_class_suppression_fixture` reduced to the 11-entry suppressor matrix (5 FD&R dominators + 6 SCI per-compartment). All non-FD&R category arms (CAT_DISSEM, CAT_NON_IC_DISSEM, CAT_AEA, CAT_SCI, CAT_SAR, CAT_NON_US_CLASSIFICATION) + TOK_US_UNCLASSIFIED arm removed. - Source-of-truth pin's documented carve-out list reduced to just TOK_RELIDO (the cone-itself case); CAT_NON_US_CLASSIFICATION carve-out gone (no longer in slice). - Conflict-variant pin doc-comment updated to reference TOK_US_COLLATERAL_CLASSIFIED, not TOK_US_CLASSIFIED. - Module-level docstring updated to remove TOK_US_UNCLASSIFIED references and add §D.2 Table 3 pp.28-30 citation.
…s + baseline correction) Copilot R1 ran on commit 32c33bb (the original R1 fix-ups) and surfaced 5 substantive findings, plus on verification a 6th issue emerged — the original R1 commit's "pre-4b baseline = 15" correction was itself wrong by walked-adjacency-omission. All 6 issues resolved. Ground truth confirmed by walking `git show 5fd5a33:crates/capco/src/scheme/rewrites/*.rs` (the first commit that split rewrites out of monolithic scheme.rs) + counting PageRewrite-row literals in each file at HEAD. 1. **Pin test (Constraint::Custom dedup masking, Copilot R1 #1)** — The test at `post_pr_4b_declares_exact_39_custom_constraints` collected into a BTreeSet before counting; a duplicate `Constraint::Custom("capco/foo", ...)` row would dedupe in the set and the size-only assertion would still pass. Fix: collect into a Vec first, then compare `raw_count == set.len()` — triple-pin (raw count + set size + raw_count_equals_set_size). The equality assertion is the load-bearing dedup check. 2. **CI hollow-coverage (Copilot R1 #2)** — `corpus_parity.rs` is currently `#![cfg(any())]` (disabled by PR 3c.B Commit 10 pending FixProposal-shape rewrite). The `pr-4b-corpus-regression` job mirroring PR 3b's T029 inherited the same dormancy hole. Fix: add three explicit `cargo test --test` invocations for `post_4b_lattice_inventory_pin`, `lattice_static_assertions`, and `post_3b_registration_pin`. With these four active pins plus the still-ACTIVE `corpus_accuracy` and `corpus_provenance` suites, the job has real 4b-specific coverage even while corpus_parity stays dormant. Inline comments clarify which suites are ACTIVE vs dormant so a future reader who sees the `corpus_parity` invocation passing doesn't conclude it gives real coverage. 3. **Attestation `DeclassExemptionAccumulator` citation (Copilot R1 #3)** — Walked-adjacency miss from the R1 fix-up on `DeclassifyOnLattice`. Same root cause: §H.6 grounds the AEA *exception*, not the general declassification-exemption hierarchy rule. Fix: §H.6 → §E.3 pp32-33 (Multiple Sources) + §E.1 p31 (exemption catalog 25X#/50X#/75X#), matching the doc-comment §-authority block at the type definition. 4. **Attestation `Pre-4b baseline` cell (Copilot R1 #4)** — The table at "Per-axis net-delta math" had "~14 (pre-Pattern-B/C)" with the tilde flagging an unverified approximate. Fix: exact baseline = 14 (4 pattern_a + 2 noforn_clears + 8 transmutation_stubs). The 3rd noforn_clears row (`noforn-clears-display-only-to`) was added by PR 4b-D.2 (#527), not pre-4b. 5. **Pin doc-comment derivation chain (Copilot R1 #5)** — The header doc-comment said "Pattern C 8 rows" in the 4b-C delta, but 4b-C landed only 7 Pattern-C rows; the 8th (`sbu-nf-evicted-by-classified`) was added later by #541 in the 4b-F window. Fix: change "Pattern C 8 rows" → "Pattern C 7 rows" in the 4b-C section; add the #541 +1 entry separately in the 4b-F section; document the 4b-D.2 +1 PageRewrite contribution that the original doc-comment also omitted. 6. **R1 baseline correction (walked-adjacency from #4)** — The original R1 fix-up commit (32c33bb) changed `CLAUDE.md:283` from "14 → 27" to "15 → 27" based on the architect plan's "15 = 4 pattern_a + 3 noforn_clears + 8 transmutation_stubs" claim. The architect's count of 3 noforn_clears rows was off-by-one: pre-4b had only 2 (`noforn-clears-rel-to` + `noforn-clears-fdr-family`); the 3rd row (`noforn-clears-display-only-to`) was added by 4b-D.2 (#527). Walked the git history at `5fd5a339` (Stage 2 PR A sub-split of `scheme/rewrites/`) and counted the literal rows: pattern_a 4 / noforn_clears 2 / transmutation_stubs 8 = 14 pre-4b. The original 4b-C landing CLAUDE.md entry's "catalog row count 14 → 23" is ground truth; my R1 fix-up over-corrected to 15 → 27 and needs the inverse correction back to 14 → 27 with the full per-sub-PR breakdown: Pre-4b: 14 (4 pattern_a + 2 noforn_clears + 8 transmutation_stubs) 4b-C +9: 23 (Pattern-B 2 + Pattern-C 7) 4b-D.2 +1: 24 (noforn-clears-display-only-to per §H.8 p145) #541 +1: 25 (sbu-nf-evicted-by-classified Pattern-C row 2b per §H.9 p178) #552 +1: 26 (sbu-nf-supersedes-sbu per §H.9 p178) #555 +1: 27 (les-nf-supersedes-les per §H.9 p185) 13 new rows over the 4b umbrella + post-4b-F window. 14 + 13 = 27 ✓ (matches the test's hardcoded expected count). Verification gate (all GREEN): - `cargo +stable test -p marque-capco --test post_4b_lattice_inventory_pin` — 3/3 PASS (including new raw_count_equals_set_size assertion) - `cargo +stable test -p marque-capco --test lattice_static_assertions` — 0/0 (compile-time only, file builds clean) - `cargo +stable test -p marque-capco --test post_3b_registration_pin` — 1/1 PASS (38 unchanged) - `cargo +stable clippy --workspace --all-targets -- -D warnings` — 0 warnings - `cargo +stable fmt --check` — clean - `cargo run --manifest-path tools/citation-lint/Cargo.toml --release -- .` — 0 defects Per project memory `feedback_audit_predicates_against_source` (Constitution VIII propagation discipline) and `feedback_double_check_the_plan` (when the architect plan and ground truth conflict, ground truth wins; the architect plan was a preflight draft, the merged code is authoritative). Per project memory `feedback_walked_adjacencies` (every fix walks the related code paths) — the DeclassExemptionAccumulator fix #3 is a textbook walked-adjacency from the R1 DeclassifyOnLattice fix that I missed the first time; the baseline correction #6 is a walked-adjacency from the architect plan's count that I trusted instead of verifying.
Copilot R2 ran on commit a8a7065 (the R2 fix-ups rebased on staging-merge) and surfaced 4 valid findings. All addressed: 1. **lattice_static_assertions.rs — bounded-shape negative locks missing.** The doc-comment header said only ClassificationLattice and NatoClassLattice are bounded, but the test file only asserted the POSITIVE bounded impls. The other 10 lattice types had no explicit lock against accidental `BoundedJoinSemilattice` / `BoundedMeetSemilattice` addition. Asymmetric with the Join-only negative locks below (DissemSet / JointSet / DisplayOnlyBlock). Fix: add 10 × `assert_not_impl_any!(<T>: BoundedJoinSemilattice, BoundedMeetSemilattice)` blocks closing the symmetry. Now an accidental `top()` impl on SciSet / SarSet / FgiSet / AeaSet / etc. is a build error. 2. **CLAUDE.md:283 — absolute-vs-delta confusion in per-axis net-delta math.** The line read `9 → 12 Join impls / 6 → 9 Meet impls` as if those were start → end counts. But the attestation table establishes pre-4b Join baseline = 3 (Sci/Sar/Fgi only) and terminal = 12, so the start → end is `3 → 12` Join / `3 → 9` Meet. The `9 → 12 / 6 → 9` form was conflating the +9 / +6 deltas with absolute counts. Fix: rewrite as `3 → 12 Join impls (pre-4b: Sci/Sar/Fgi; +AeaSet 4b-A; +7 in 4b-B; +DisplayOnlyBlock 4b-E) / 3 → 9 Meet impls (pre-4b: Sci/Sar/Fgi; +AeaSet 4b-A; +5 in 4b-B with Joint/Dissem Join-only per PR #456)`. The +9 and +6 deltas are still derivable from the inline per-sub-PR contributions but no longer presented as start values. 3. **pm-decisions.md:101 — OQ-RUST-4 still says T134, not T146.** Walked-adjacency miss within the same file: the ERRATA block at the top of pm-decisions correctly notes T130-T134 are taken by PR-8/9 and the closeout range is T142-T146 with T146 reserved for the SupersessionSet deferral, but the OQ-RUST-4 resolution §1 still referenced T134. Fix: T134 → T146 with a back-pointer to the ERRATA so a future reader who lands at OQ-RUST-4 first sees both the corrected ID and the rationale. 4. **PR description body — stale "14 → 15" claim.** The R1 fix-up commit message and the PR description's "R1 fix-ups landed in 32c33bb" section claimed I'd corrected the pre-4b PageRewrite baseline 14 → 15. The R2 fix-up commit (b6dce2fe, now a8a7065 post-rebase) reverted that to 14 with documented ground-truth verification from `5fd5a339`. The PR description body is GitHub-side, not in-tree, so it didn't pick up the R2 correction. Fix: update the PR body via `gh pr edit` (separate from this in-tree commit) to remove the stale claim. Verification gate (all GREEN): - `cargo +stable build --tests -p marque-capco` — 10 new `assert_not_impl_any!` blocks compile clean (any future accidental `BoundedJoin*` impl on the locked types becomes a build error) - `cargo +stable test -p marque-capco --test lattice_static_assertions` — 0/0 (compile-time only) - `cargo +stable test -p marque-capco --test post_4b_lattice_inventory_pin` — 3/3 PASS - `cargo +stable test -p marque-capco --test post_3b_registration_pin` — 1/1 PASS - `cargo +stable clippy --workspace --all-targets -- -D warnings` — 0 warnings - `cargo +stable fmt --check` — clean - `cargo run --manifest-path tools/citation-lint/Cargo.toml --release -- .` — 0 defects Per project memory `feedback_walked_adjacencies`: the T134→T146 fix (#3) is the same walked-adjacency-within-a-file pattern that bit me on the DeclassExemptionAccumulator citation in R1; same root cause (I updated some references to T142-T146 but missed OQ-RUST-4). Per memory `feedback_audit_predicates_against_source` (Constitution VIII propagation): the CLAUDE.md absolute-vs-delta fix (#2) is the kind of summary-stats drift that accumulates across closeout iterations when the document is treated as a chronicle instead of a current state snapshot.
…ings) Addresses CI Format+Lint failure plus 5 Copilot inline review findings on PR #627. Trade-offs: * CI Format+Lint failure (stable rustfmt) — `cargo fmt --all` applied; affects `crates/scheme/src/render_context.rs` test module's const declarations (stable rustfmt expanded multi-line; local nightly rustfmt rendered single-line). CI gate now green. * Copilot finding #1 (type-safety hole) — `SectionRef::with_sub_subsection` allowed constructing `subsection: None + sub_subsection: Some(N)` which Display would emit as misleading `§H.N pNN` form. Architect F-5 had already flagged `sub_subsection` as dead capability (CAPCO-2016 has zero `§X.Y.Z`-style citations in the manual; the original scaffolding was speculative grow-path under YAGNI). Code-reviewer F-3 flagged the synthetic test name implying a real citation. Resolution: retired `sub_subsection` entirely — field + builder + Display branch + `_SAR_SUB_SUB` const + `display_subsection_plus_ sub_subsection_h5_4_p99` test + scanner regex sub_subsection branch + proptest generator branch. Closes Copilot #1, architect F-5, code-reviewer F-3 by elimination. If a future authoritative source introduces 3-level subsections, the field re-extends additively via `#[non_exhaustive]`. * Copilot finding #2 (Display vs citation-lint regex divergence) — module-level doc-comment in `citation.rs` claimed Display emits the exact `tools/citation-lint` parser shape, but the lint's parser at `tools/citation-lint/src/citation.rs::Citation` carries only `{ section, subsection, pages, raw }` — no structural Table field. Softened the doc comment to acknowledge the `Table <N>` structural- parsing extension lands alongside 3c.2.C's `&'static str → Citation` migration; the lint today tolerates `Table` as an occurrence anchor without parsing it into its own field. C-FOLLOWUP-1 in the deferred- findings doc already tracks this work. * Copilot finding #3 (test scanner doc divergence) — same gap as #2, in `citation_display_roundtrip.rs` module doc; updated. * Copilot finding #4 (PM doc commit count) — PM doc §2 said "Five logical commits" listing A1–A5, but PR has A6 (reviewer-pass cleanup) and now A7 (this commit). Updated to "Seven commits land in PR 3c.2.A; A1–A5 are the logical scaffolding commits, A6 is the reviewer-pass cleanup, A7 is the Copilot-review cleanup." with full A6/A7 rows in the commit table. * Copilot finding #5 (PM doc 26 → 23 incomplete erratum, also the Copilot-suppressed low-confidence comment) — PM-9 erratum at A6 landed but missed R-A3 (§3 risk register), the A4 commit table row (§2), and the reviewer attestation checklist (§4). Replaced all remaining "26" references with "23" using replace_all. Deferred-findings doc updated: GH-FOLLOWUP-2 and GH-FOLLOWUP-3 both marked CLOSED IN A7 (resolved by removal). Addressed-4, Addressed-5, Addressed-6 sections added for the A7-resolved items. Gates: - `cargo fmt --all -- --check` clean (the CI gate that failed) - `cargo check --workspace` clean - `cargo nextest run --workspace` — 2635 passed, 11 skipped (net -1 test from the retired `display_subsection_plus_sub_subsection_h5_4_p99` fixture; expected and intentional) - `cargo clippy --workspace --tests` — clean for touched crates Ref: PR #627 Copilot review findings (5 inline + 1 suppressed low-confidence)
…an guidance (PR #681) Three Copilot round-2 comments addressed, all on doc-comment / inline- comment accuracy (no logic changes): 1. **Field doc — confidence-gate semantics** (`crates/rules/src/lib.rs:526`). Doc-comment claimed `Some(span)` once a Banner candidate "has been processed on the page". Accurate post-correction: the engine only records `page_banner_span` AFTER the decoder-confidence gate (per issue #471 — sub-threshold decoder banner recognitions `continue` before the capture site). Rewording: "once a Banner candidate has cleared the engine's decoder confidence gate (`prov.recognition_score() >= self.config.confidence_threshold()`) and been processed; a sub-threshold decoder banner recognition does NOT populate this field — the same discipline that gates downstream rule dispatch and `PageContext` accumulation per issue #471." Validates the capture-site discipline rust-reviewer Q4 verified ("the banner-span capture sits AFTER the decoder confidence gate; `page_portions.push` applies the same discipline — consistent"). 2. **Sub-span discipline — achievable guidance** (`crates/rules/src/lib.rs:554`). Doc said rules can "parse the source bytes themselves" to target sub-blocks, but `Rule::check(& self, &CanonicalAttrs, &RuleContext)` does NOT receive the source byte buffer — consumers literally cannot read `&[u8]` slices from a `Span` they're handed. Rewording: rules emit a `FixIntent` keyed off the FULL banner span and let `MarkingScheme::apply_intent` + `MarkingScheme::render_canonical` re-render the whole banner from the page-level lattice projection (`ctx.page_marking`). Explicit note added: "[`Rule::check`] does NOT receive the source byte buffer — the rule cannot read `&[u8]` slices itself from the span." 3. **Test outer comment — newline-terminator boundary** (`crates/engine/src/engine.rs:6341`). The outer comment above the `let src: &[u8] = ...` line still said "ends at the trailing newline" — accurate AS POSITIONAL ANCHOR but misleading on the `Span` semantics, since the assertion two blocks down asserts `end = 37` (the `\n`'s INDEX, NOT one-past it). Rewording: "ends at the newline-terminator boundary (excluding the `\n` itself — the scanner's banner span ends one-past the last printable byte, not one-past the terminator)." The DIAGRAMMED inner comment (the source-layout diagram inside the assertion block) was already accurate post-Copilot-round-1 fix; this is the OUTER comment that pre-dated the diagram and was never updated to match. Walk-adjacent gap closed. ## Test plan - [x] All 5 `page_banner_span_*` tests pass - [x] `cargo clippy --workspace --all-targets -- -D warnings` clean - [x] `cargo fmt --check` clean - [x] All 3 Copilot round-2 inline comments addressed No code changes — all three were doc-comment / inline-comment accuracy fixes. The capture-site discipline (#1), sub-span emission path (#2), and span-endpoint semantics (#3) all match the actual engine + rule API today; the docs now match too. Refs: PR #681 Copilot round 2 comments.
…an guidance (PR #681) Three Copilot round-2 comments addressed, all on doc-comment / inline- comment accuracy (no logic changes): 1. **Field doc — confidence-gate semantics** (`crates/rules/src/lib.rs:526`). Doc-comment claimed `Some(span)` once a Banner candidate "has been processed on the page". Accurate post-correction: the engine only records `page_banner_span` AFTER the decoder-confidence gate (per issue #471 — sub-threshold decoder banner recognitions `continue` before the capture site). Rewording: "once a Banner candidate has cleared the engine's decoder confidence gate (`prov.recognition_score() >= self.config.confidence_threshold()`) and been processed; a sub-threshold decoder banner recognition does NOT populate this field — the same discipline that gates downstream rule dispatch and `PageContext` accumulation per issue #471." Validates the capture-site discipline rust-reviewer Q4 verified ("the banner-span capture sits AFTER the decoder confidence gate; `page_portions.push` applies the same discipline — consistent"). 2. **Sub-span discipline — achievable guidance** (`crates/rules/src/lib.rs:554`). Doc said rules can "parse the source bytes themselves" to target sub-blocks, but `Rule::check(& self, &CanonicalAttrs, &RuleContext)` does NOT receive the source byte buffer — consumers literally cannot read `&[u8]` slices from a `Span` they're handed. Rewording: rules emit a `FixIntent` keyed off the FULL banner span and let `MarkingScheme::apply_intent` + `MarkingScheme::render_canonical` re-render the whole banner from the page-level lattice projection (`ctx.page_marking`). Explicit note added: "[`Rule::check`] does NOT receive the source byte buffer — the rule cannot read `&[u8]` slices itself from the span." 3. **Test outer comment — newline-terminator boundary** (`crates/engine/src/engine.rs:6341`). The outer comment above the `let src: &[u8] = ...` line still said "ends at the trailing newline" — accurate AS POSITIONAL ANCHOR but misleading on the `Span` semantics, since the assertion two blocks down asserts `end = 37` (the `\n`'s INDEX, NOT one-past it). Rewording: "ends at the newline-terminator boundary (excluding the `\n` itself — the scanner's banner span ends one-past the last printable byte, not one-past the terminator)." The DIAGRAMMED inner comment (the source-layout diagram inside the assertion block) was already accurate post-Copilot-round-1 fix; this is the OUTER comment that pre-dated the diagram and was never updated to match. Walk-adjacent gap closed. ## Test plan - [x] All 5 `page_banner_span_*` tests pass - [x] `cargo clippy --workspace --all-targets -- -D warnings` clean - [x] `cargo fmt --check` clean - [x] All 3 Copilot round-2 inline comments addressed No code changes — all three were doc-comment / inline-comment accuracy fixes. The capture-site discipline (#1), sub-span emission path (#2), and span-endpoint semantics (#3) all match the actual engine + rule API today; the docs now match too. Refs: PR #681 Copilot round 2 comments.
…an guidance (PR #681) Three Copilot round-2 comments addressed, all on doc-comment / inline- comment accuracy (no logic changes): 1. **Field doc — confidence-gate semantics** (`crates/rules/src/lib.rs:526`). Doc-comment claimed `Some(span)` once a Banner candidate "has been processed on the page". Accurate post-correction: the engine only records `page_banner_span` AFTER the decoder-confidence gate (per issue #471 — sub-threshold decoder banner recognitions `continue` before the capture site). Rewording: "once a Banner candidate has cleared the engine's decoder confidence gate (`prov.recognition_score() >= self.config.confidence_threshold()`) and been processed; a sub-threshold decoder banner recognition does NOT populate this field — the same discipline that gates downstream rule dispatch and `PageContext` accumulation per issue #471." Validates the capture-site discipline rust-reviewer Q4 verified ("the banner-span capture sits AFTER the decoder confidence gate; `page_portions.push` applies the same discipline — consistent"). 2. **Sub-span discipline — achievable guidance** (`crates/rules/src/lib.rs:554`). Doc said rules can "parse the source bytes themselves" to target sub-blocks, but `Rule::check(& self, &CanonicalAttrs, &RuleContext)` does NOT receive the source byte buffer — consumers literally cannot read `&[u8]` slices from a `Span` they're handed. Rewording: rules emit a `FixIntent` keyed off the FULL banner span and let `MarkingScheme::apply_intent` + `MarkingScheme::render_canonical` re-render the whole banner from the page-level lattice projection (`ctx.page_marking`). Explicit note added: "[`Rule::check`] does NOT receive the source byte buffer — the rule cannot read `&[u8]` slices itself from the span." 3. **Test outer comment — newline-terminator boundary** (`crates/engine/src/engine.rs:6341`). The outer comment above the `let src: &[u8] = ...` line still said "ends at the trailing newline" — accurate AS POSITIONAL ANCHOR but misleading on the `Span` semantics, since the assertion two blocks down asserts `end = 37` (the `\n`'s INDEX, NOT one-past it). Rewording: "ends at the newline-terminator boundary (excluding the `\n` itself — the scanner's banner span ends one-past the last printable byte, not one-past the terminator)." The DIAGRAMMED inner comment (the source-layout diagram inside the assertion block) was already accurate post-Copilot-round-1 fix; this is the OUTER comment that pre-dated the diagram and was never updated to match. Walk-adjacent gap closed. ## Test plan - [x] All 5 `page_banner_span_*` tests pass - [x] `cargo clippy --workspace --all-targets -- -D warnings` clean - [x] `cargo fmt --check` clean - [x] All 3 Copilot round-2 inline comments addressed No code changes — all three were doc-comment / inline-comment accuracy fixes. The capture-site discipline (#1), sub-span emission path (#2), and span-endpoint semantics (#3) all match the actual engine + rule API today; the docs now match too. Refs: PR #681 Copilot round 2 comments.
…an guidance (PR #681) Three Copilot round-2 comments addressed, all on doc-comment / inline- comment accuracy (no logic changes): 1. **Field doc — confidence-gate semantics** (`crates/rules/src/lib.rs:526`). Doc-comment claimed `Some(span)` once a Banner candidate "has been processed on the page". Accurate post-correction: the engine only records `page_banner_span` AFTER the decoder-confidence gate (per issue #471 — sub-threshold decoder banner recognitions `continue` before the capture site). Rewording: "once a Banner candidate has cleared the engine's decoder confidence gate (`prov.recognition_score() >= self.config.confidence_threshold()`) and been processed; a sub-threshold decoder banner recognition does NOT populate this field — the same discipline that gates downstream rule dispatch and `PageContext` accumulation per issue #471." Validates the capture-site discipline rust-reviewer Q4 verified ("the banner-span capture sits AFTER the decoder confidence gate; `page_portions.push` applies the same discipline — consistent"). 2. **Sub-span discipline — achievable guidance** (`crates/rules/src/lib.rs:554`). Doc said rules can "parse the source bytes themselves" to target sub-blocks, but `Rule::check(& self, &CanonicalAttrs, &RuleContext)` does NOT receive the source byte buffer — consumers literally cannot read `&[u8]` slices from a `Span` they're handed. Rewording: rules emit a `FixIntent` keyed off the FULL banner span and let `MarkingScheme::apply_intent` + `MarkingScheme::render_canonical` re-render the whole banner from the page-level lattice projection (`ctx.page_marking`). Explicit note added: "[`Rule::check`] does NOT receive the source byte buffer — the rule cannot read `&[u8]` slices itself from the span." 3. **Test outer comment — newline-terminator boundary** (`crates/engine/src/engine.rs:6341`). The outer comment above the `let src: &[u8] = ...` line still said "ends at the trailing newline" — accurate AS POSITIONAL ANCHOR but misleading on the `Span` semantics, since the assertion two blocks down asserts `end = 37` (the `\n`'s INDEX, NOT one-past it). Rewording: "ends at the newline-terminator boundary (excluding the `\n` itself — the scanner's banner span ends one-past the last printable byte, not one-past the terminator)." The DIAGRAMMED inner comment (the source-layout diagram inside the assertion block) was already accurate post-Copilot-round-1 fix; this is the OUTER comment that pre-dated the diagram and was never updated to match. Walk-adjacent gap closed. ## Test plan - [x] All 5 `page_banner_span_*` tests pass - [x] `cargo clippy --workspace --all-targets -- -D warnings` clean - [x] `cargo fmt --check` clean - [x] All 3 Copilot round-2 inline comments addressed No code changes — all three were doc-comment / inline-comment accuracy fixes. The capture-site discipline (#1), sub-span emission path (#2), and span-endpoint semantics (#3) all match the actual engine + rule API today; the docs now match too. Refs: PR #681 Copilot round 2 comments.
…icate-ID form
Per T044 OD-8.A (PM decision 2026-05-22): the engine constraint-bridge
becomes a no-op pass-through — the catalog row's `name` IS the
canonical predicate ID, with no string translation. This commit
performs the row-name rename in the catalogs and the supporting
dispatch + emit-helper updates inside `marque-capco`.
## Class-floor catalog (27 rows; legacy-rule-id-map §3)
`crates/capco/src/scheme/class_floor.rs` + the
`crates/capco/src/scheme/constraints/class_floor_catalog.rs` mirror
both rename from the slash-form `class-floor/<marking>` /
`E058/<purpose>` to the canonical
`banner.<axis>.<floor|ceiling>-<marking>` form. Five discriminators
appear in the new catalog:
- `banner.classification.floor-*` (12 rows; SCI / SAR / NATO-SAP +
passthrough rows).
- `banner.aea.floor-*` (7 rows; RD / FRD / TFNI / ATOMAL / CNWDI /
RD-SG / FRD-SG).
- `banner.aea.ceiling-*` (2 rows; DOD-UCNI / DOE-UCNI).
- `banner.dissem.floor-*` (4 rows; RSEN / IMCON / ORCON / EYES-ONLY).
- `banner.classification.floor-passthrough-*` (4 rows; BUR / HCS-X /
KLM / MVL).
`is_class_floor_catalog_name` (in `predicates/class_floor.rs`) and
the bridge gate in `adapter.rs::message_by_name` both move from
prefix-check (`starts_with("class-floor/") || starts_with("E058/")`)
to substring-check (`contains(".floor-") || contains(".ceiling-")`).
`floor` and `ceiling` are reserved by convention for the class-floor
catalog; no other rule predicate uses either substring.
## SCI per-system catalog (5 rows; legacy-rule-id-map §4)
`crates/capco/src/scheme/sci_per_system.rs` + the mirror at
`crates/capco/src/scheme/constraints/sci_per_system_catalog.rs`
both rename from `sci-per-system/<purpose>` to the canonical
`marking.sci.<predicate>` form. The 5 rows cover:
- `marking.sci.hcs-o-companions` (§H.4 p64)
- `marking.sci.hcs-p-noforn-required` (§H.4 p66)
- `marking.sci.hcs-p-sub-companions` (§H.4 p68)
- `marking.sci.si-g-companions` (§H.4 p80)
- `marking.sci.tk-compartment-noforn-required` (§H.4 p87 + p91 + p95)
`is_sci_per_system_catalog_name` keeps a `starts_with("marking.sci.")`
prefix-check (the `marking.sci.` prefix is uniquely scoped to the
per-system catalog; `portion.sci.*` is reserved for standalone SCI
rules like E061/E062/E063/W034/E065).
## `RULE_E059` constant deletion (legacy-rule-id-map §6 + plan §2.6)
`pub(crate) const RULE_E059: RuleId = RuleId::new("E059")` in
`sci_per_system.rs` deleted. The walker-shared constant existed
because every per-system emit body fabricated the same E059 rule ID;
post-T044 each row carries its own predicate ID via `row.name`. The
9 `RULE_E059` reference sites in `scheme/actions/companions.rs`
(rows #1, #3, #4 inside `emit_hcs_o_companions` /
`emit_hcs_p_sub_companions` / `emit_si_g_companions`, plus the
`emit_companion_required` dispatcher used by rows #2 and #5) all
migrated to `marque_rules::RuleId::new("capco", row.name)` —
constructed once per emit body at the entry and passed through to
`emit_companion_insert` / `make_fix_diagnostic` as the `rule`
argument.
`pub(crate) use` re-export of `RULE_E059` from `scheme/mod.rs`
removed accordingly.
Citations re-verified against `crates/capco/docs/CAPCO-2016.md` per
Constitution VIII; no `Citation` struct field touched.
`cargo check -p marque-capco` clean after this commit.
Refs: docs/refactor-006/legacy-rule-id-map.md §3 + §4,
docs/refactor-006/2026-05-22-T044-rule-id-tuple-plan.md §2.3 + §2.6,
docs/refactor-006/2026-05-22-T044-pm-decisions.md OD-8.A.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
empirically rejected) Copilot's PR #697 review surfaced three inline comments: **#1 (lib.rs:842) — accepted**: doc-comment claimed R2 "tightens the runtime-config surface relative to the derive". Copilot correctly pointed out that the pre-R2 `#[derive(Deserialize)]` on `Option<String>` / `Option<f32>` / `Option<HashMap<String,String>>` / `Option<f64>` was ALREADY type-strict and ALREADY had a closed accepted-field set. There is no concrete pre-R2 case that was accepted and is now rejected. The synthesis brief AND the rust- specialist review both misclaimed "tightening"; Copilot caught it. Reworded the `WasmConfig` doc-comment to state the post-R2 surface matches the pre-R2 Derive exactly — same four field names, same `Option<T>` shape, same loud-failure-on-mismatch, same silent-ignore. The R2 change is structural (binary-size cut + improved error messages with field names), NOT a tightening. **#3 (lib.rs:1104) — accepted**: f32 format-parse roundtrip comment claimed it "runs only on engine-cache-miss" but `build_cache_key` runs on every `parse_wasm_config` call (which is every lint_native / fix_native call), regardless of whether the resulting cache key hits or misses the engine cache. Corrected the comment. **#2 (lib.rs:1124) — empirically REJECTED**: Copilot suggested replacing the `serde_json::Map`-with-cloned-keys path with a borrow-preserving `Vec<(&str,&str)>` sort + direct JSON string construction, to eliminate per-call `corrections` key/value clones. The suggestion is theoretically correct on runtime: O(n) String clones become O(n) borrow-preserving sort comparisons. But empirically — measured locally with `wasm-pack build --target web --profile release-web` (CI-gate profile) — the borrow-preserving rewrite REGRESSED WASM bundle size from 1,364,312 B (original R2) to 1,370,786 B (with Copilot's suggested rewrite). That's a +6,474 B regression — would erase 83% of R2's −7,750 B binary win. The mechanism is the same LTO+ICF lesson captured in PR #696 (R1's `sorted_entries` revert at `crates/capco/src/lattice.rs:249-278`): the `serde_json::Map` / `Value::Object` / `to_string::<Value>` path REUSES serde_json machinery already monomorphized for the Diagnostic JSON output side of `marque-wasm`. The direct-String-construction path forces fresh monomorphizations of `serde_json::to_string::<str>` plus a longer push-based code path. Theoretical clone-count "wins" don't beat empirical binary-size measurements. The runtime cost Copilot flagged is real but small: typical corrections maps in marque usage are 5-20 entries; per-call cost is 10-40 short String allocations, in the noise relative to the ~400 µs lint hot path. The binary savings (6.5 KB) dwarf the runtime cost. Trade favors keeping the original implementation. Documented the trade in the `build_cache_key` doc-comment with a cross-reference to PR #696's empirical-revert precedent — future maintainers see WHY the seemingly-suboptimal clones are preserved. Also brought along: structural Some-and-non-empty proof via `.filter()` (rust-specialist nit from review fix-up b974550) stays in place; that one didn't depend on the body shape Copilot suggested. Empirical measurements (wasm-pack --profile release-web, local): - staging baseline: 1,372,062 B - original R2 (b974550, serde_json::Map): 1,364,312 B (−7,750) - Copilot #2 fix (direct String): 1,370,786 B (−1,276) ← rejected - this commit (revert + doc fixes): 1,364,386 B (−7,676) cargo +stable clippy --workspace --all-targets --all-features -- -D warnings: green cargo test -p marque-wasm --lib r2_: 19/19 green (byte-identity preserved)
…ref) Three small fixes from the 3-reviewer panel on the lattice split: 1. crates/capco/src/lattice/sar.rs:87,97 — replaced two `[`super::helpers::HierarchicalTreeSet::join_with`]` / `[::meet_with]` intra-doc links with prose. `HierarchicalTreeSet` is `pub(super)` and therefore not a valid public doc-link target; the bracketed form fired `rustdoc::broken_intra_doc_links`. Prose form: "Delegates to the internal `HierarchicalTreeSet::join_with` method in `super::helpers`." 2. crates/capco/src/lattice/sci.rs:181,199 — same fix at the two analogous SciSet sites. 3. crates/capco/src/lattice/nato_dissem.rs:35 — replaced the stale cross-file reference "See DissemSet doc above for rationale." (which was coherent in the monolith where DissemSet was defined ~300 lines above NatoDissemSet) with an explicit intra-doc link to `[`super::dissem::DissemSet`]` that resolves correctly post- split. Reviewers + verdicts: - Rust-specialist: APPROVE-WITH-FIXES (#1, #2 above; one MEDIUM `test_support` cfg-gate finding was incorrect — gate is present at `mod.rs:57` ahead of the `pub(crate) mod test_support;` line) - General reviewer: APPROVE-WITH-FIXES (#3 above; plus a non- blocking follow-up note about stale `lattice.rs:NNNN` line-number anchors in 6 integration-test prose comments — out of scope; will file as separate tracking issue) - Lattice-consultant: APPROVE (no fixes required; all algebraic-law invariants, citations, and 82-test parity verified intact) Verification post-fix: - `cargo +stable check --workspace --all-targets` PASS - `cargo +stable clippy --workspace --all-targets -- -D warnings` PASS - `cargo +stable fmt --all -- --check` PASS - `cargo test --workspace` PASS (198 test binaries, zero failures) - `cargo +stable doc -p marque-capco --no-deps`: 54 warnings (down from 59 pre-fix — the 4 new HierarchicalTreeSet intra-doc warnings are gone; all 54 remaining are pre-existing, none in `crates/capco/src/lattice/`)
…ref) Three small fixes from the 3-reviewer panel on the lattice split: 1. crates/capco/src/lattice/sar.rs:87,97 — replaced two `[`super::helpers::HierarchicalTreeSet::join_with`]` / `[::meet_with]` intra-doc links with prose. `HierarchicalTreeSet` is `pub(super)` and therefore not a valid public doc-link target; the bracketed form fired `rustdoc::broken_intra_doc_links`. Prose form: "Delegates to the internal `HierarchicalTreeSet::join_with` method in `super::helpers`." 2. crates/capco/src/lattice/sci.rs:181,199 — same fix at the two analogous SciSet sites. 3. crates/capco/src/lattice/nato_dissem.rs:35 — replaced the stale cross-file reference "See DissemSet doc above for rationale." (which was coherent in the monolith where DissemSet was defined ~300 lines above NatoDissemSet) with an explicit intra-doc link to `[`super::dissem::DissemSet`]` that resolves correctly post- split. Reviewers + verdicts: - Rust-specialist: APPROVE-WITH-FIXES (#1, #2 above; one MEDIUM `test_support` cfg-gate finding was incorrect — gate is present at `mod.rs:57` ahead of the `pub(crate) mod test_support;` line) - General reviewer: APPROVE-WITH-FIXES (#3 above; plus a non- blocking follow-up note about stale `lattice.rs:NNNN` line-number anchors in 6 integration-test prose comments — out of scope; will file as separate tracking issue) - Lattice-consultant: APPROVE (no fixes required; all algebraic-law invariants, citations, and 82-test parity verified intact) Verification post-fix: - `cargo +stable check --workspace --all-targets` PASS - `cargo +stable clippy --workspace --all-targets -- -D warnings` PASS - `cargo +stable fmt --all -- --check` PASS - `cargo test --workspace` PASS (198 test binaries, zero failures) - `cargo +stable doc -p marque-capco --no-deps`: 54 warnings (down from 59 pre-fix — the 4 new HierarchicalTreeSet intra-doc warnings are gone; all 54 remaining are pre-existing, none in `crates/capco/src/lattice/`)
…n + correct overclaiming header Copilot R2 #3: two issues in decoder_split_byte_identity.rs: (a) The "Byte-identity regression suite" header overclaimed. The test compares semantic Parsed<CapcoMarking> outcomes, not raw bytes — the engine doesn't surface canonicalized bytes at the recognizer boundary. Reworded to "Recognizer-outcome regression suite" with explicit "compares parsed shapes, not raw bytes" wording. (b) decoder_recovers_stray_char_slash_recovery used an "either branch is acceptable" match arm with `let _ = candidates;`, accepting any outcome including zero candidates. Probed actual behavior for (S/X/NF): the collapse-stray pass produces Unambiguous(Secret) cleanly. Replaced the defensive match with a strong assertion matching observed behavior: Parsed::Unambiguous + Classification::Secret + dissem_us_count == 1 (NOFORN preserved). 18/18 byte-identity tests pass with the strengthened assertion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…above the unconditional Unambiguous assertion Copilot R4 #3: my R2 commit (ce2c15f) strengthened the decoder_recovers_stray_char_slash_recovery match into an unconditional `let Parsed::Unambiguous(m) = parsed else { panic!() }` but left the original "even if the gate doesn't reach Unambiguous, a non-empty Ambiguous result is acceptable here" preamble in place. The comment now contradicts the assertion. Replaced the preamble with a single-paragraph explanation matching the actual behavior: the collapse-stray pass strips the X and the scorer reaches Unambiguous(Secret + NOFORN) cleanly; runner-up margin is wide enough that the null gate does not intervene. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Lands the foundational work for the marque MVP (branch
001-marque-mvp): the ratified feature spec, Phase 1 workspace restructure (newmarque-ismleaf crate + dev scaffolding), and Phase 2 (ODNI CVE XML→Rust codegen, typedIsmAttributes,FixProposal/AppliedFixtype split, FR-016 deterministic fix ordering with overlap guard, layered config with hard-fail validators, and a Clock-injected engine). Unblocks Phase 3 (US1 rules + CLI) where real spans will be wired through the scanner.Changes
Spec & contracts (
51c4aa7— 17 expert-panel findings)Audit-record reconciled with
AppliedFix(input,sourceenum,schema_version)FR-005a NDJSON atomic-write contract with error-frame recovery
FR-016 total-order fix application:
(span.end DESC, span.start DESC, rule_id ASC, replacement ASC)US2/US3 acceptance scenarios rewritten with literal-byte fixtures
SC-001 scoped to native; SC-001b advisory WASM latency bar
FR-008
severity=offinvariant: unrepresentable on emitted diagnosticsFR-010 (committed
[user]refusal) and FR-011 (schema version match) validatorsPhase 1 — workspace restructure (
71e1f11)New
marque-ismleaf crate owningSpan,IsmAttributes,TokenSet,CapcoTokenSet,build.rscodegen, and ODNI schemas (resolves the circular dependency betweenmarque-coreandmarque-capco)marque-rulesdepends directly onmarque-ismmarque-coreretains thin re-exports for back-compatTest corpus scaffolding (
CORPUS_CONTRACT.md,CORPUS_PROVENANCE.md)Criterion benchmark skeletons;
scripts/check.sh;.config/nextest.tomlPhase 1 review fixes (
a46086b— critical/high/medium)Guard
Engine::fixagainst placeholderSpan(0,0)(output-corruption prevention)Filter
remaining_diagnosticsto exclude already-applied fixesSpan::as_strreturnsResultinstead of panickingCoreError::InvalidUtf8, proper CRLF handling, full CAB scan, dedupCVE_TOKENSPartialOrd/OrdonRuleIdfor FR-016 sortPhase 2 — foundational crates (
dacea46)Real ODNI CVE XML and XSD parsing → typed Rust enums (
SciControl,DissemControl,SarIdentifier,DeclassExemption,ExemptFrom) +TRIGRAPHS+ sorted/deduplicatedALL_CVE_TOKENS+ validators + migrations, with T010 schema-version pinningmarque-rules:FixProposal(pure data) vsAppliedFix(engine-only promotion via__engine_promote);FixSourceenum;Severity::Offas first variant;Diagnosticcarries citationmarque-engine:EnginewithClockinjection,FixMode::{Apply, DryRun}, config-driven severity overrides, FR-016 sort + overlap guard (reverse-walk window) +HashSet<(RuleId, Span)>for O(1)remaining_diagnosticsfiltermarque-config: layered load with FR-010/FR-011 hard-fail,confidence_thresholdvalidation,MARQUE_CONFIDENCE_THRESHOLDenv bindingmarque-capco: E001 banner abbreviation, E002 missing/misordered USA trigraph, E004 separator count (stub), E005 declass in bannermarque-core/parser: uses generated enumparse()for SCI/dissem/SAR/declass;Classificationremains hand-written (portion + banner forms)New
Clocktrait +SystemClock/FixedClockfor deterministic audit timestamps in testsEngine::fix_with_thresholdplumbs--confidenceCLI flag and serverFixRequest.confidence_thresholdthrough withInvalidThresholdvalidationServer now calls
marque_config::load()(runs FR-011 check instead ofConfig::default()fallback)Silent-fallback cluster fixed:
MARQUE_CONFIDENCE_THRESHOLDparse failure →ConfigError::InvalidEnvVar; rule severity string typos →ConfigError::UnknownSeverity;current_dir()failure → exit 74Safety hardening:
Trigraphinner field private withtry_newASCII validator (infallibleas_strviafrom_utf8_unchecked+ SAFETY comment);RuleId/FixedClockfields private withnewconstructors;FixProposal::newconfidence check is release-modeassert!(notdebug_assert!);Span::newrelease-mode assert +try_as_slicechecked variantErgonomics/perf:
AppliedFix.classifier_idand.inputareArc<str>for O(1) clone;BatchEngine.lint_many/fix_manyyieldResult<_, BatchError>instead of aborting onJoinError;SeverityhasDisplay/as_str(server/CLI/wasm stop usingformat!(\"{:?}\"));canonicalizeusesbinary_searchover sortedALL_CVE_TOKENSBuild-script hardening:
resolve_identsasserts generated enum variants are non-empty and unique; XML/XSD unescape failures now panic with file context instead of silently truncatingFiles Changed
820 files changed, +7612 / -853.
The bulk of the file count is the ODNI ISM v2022-DEC schema bundle (CVE XML, XSD, Schematron, RNG) relocated into
crates/marque-ism/schemas/ISM-v2022-DEC/during Phase 1. The source/test churn is concentrated incrates/marque-{ism,rules,engine,capco,config,core,server,wasm}/src/andmarque/src/main.rs.| Area | Status |
|---|---|
|
specs/001-marque-mvp/(spec, contracts, data model, tasks) | Added/Modified ||
crates/marque-ism/(new crate + schemas + build.rs codegen) | Added ||
crates/marque-{rules,engine,capco,config,core}/src/| Modified ||
crates/marque-{server,wasm}/src/| Modified ||
marque/src/main.rs(CLI) | Modified ||
crates/marque-engine/src/clock.rs| Added ||
tests/corpus/(contract + provenance + fixture scaffolding) | Added ||
scripts/check.sh,.config/nextest.toml,benches/| Added |Testing
Unit tests: 45 new tests across
marque-config(8),marque-engine(7),marque-ism(17),marque-rules(10), plus 3 existingmarque-corescanner tests — zero failuresEngine coverage: FR-016 reverse-order application, C-1 overlap guard,
FixMode::DryRun,fix_with_thresholdvalidation (NaN, out-of-range, boundaries),FixedClockdeterminismConfig coverage:
set_confidence_thresholdboundaries + NaN rejection, rule severity string validation (case-sensitivity, empty, typos),exit_code()matrixType invariants:
Trigraph::try_newASCII enforcement,Span::newpanic on inverted bounds,Span::try_as_slicebounds safety,FixProposal::newrelease-mode confidence assert, generated CVE token sorted-and-unique invariantStatic checks:
cargo clippy --workspace --all-targets -- -D warningsclean;cargo fmt --checkcleanBuild:
cargo check --workspaceandcargo build --workspaceboth succeedTest plan
CI runs
cargo check --workspaceCI runs
cargo clippy --workspace --all-targets -- -D warningsCI runs
cargo test --workspace(orcargo nextest run --workspace)CI runs
cargo fmt --checkManual:
marque check <file>on a fixture with a banner marking exits 0 when cleanManual:
marque fix --dry-run <file>reports would-apply count without mutating sourceManual:
MARQUE_CONFIDENCE_THRESHOLD=bananas marque check xexits 65 with clear errorManual:
.marque.tomlwithbanner-abbreviation = "err"(typo) exits 65 with clear errorRelated Issues
None linked.
Notes for Reviewers
PR size: This is a large PR because it bundles the ratified spec, Phase 1 workspace restructure, and Phase 2 foundational crates. The schema file count inflates the stat; the code review surface is the
crates/marque-*/src/andmarque/src/main.rsdiffs, plusbuild.rs.Phase 2 review artifact: The full local review report (4 CRITICAL + 10 HIGH + 10 MEDIUM + 9 LOW) is at
.claude/PRPs/reviews/phase-2-local-review.mdin the working tree (not committed). All 14 prioritized findings are resolved in the Phase 2 commit — see the commit body for the full mapping.Known Phase 3 TODOs left in code:
RuleContextzone/position are hardcoded toBodywith aTODO(phase-3)marker; rule fixes emitSpan::new(0, 0)placeholders that are filtered by the engine'sis_empty()guard until Phase 3 wires real per-token spans from the scanner. The C-1 overlap guard landed now specifically to protect Phase 3's real spans.🤖 Generated with Claude Code