Phase 5 PR-3: trait-surface completion (T078 + T079 + T089b) by bashandbone · Pull Request #146 · marquetools/marque

bashandbone · 2026-04-25T15:29:47Z

Summary

Completes Phase 5 (US3) with three trait-surface-completion tests:

T078 — Codec<S> compile-test asserting the trait exists + is object-safe + every CodecError variant displays
T079 — Migration-audit URN provenance test using the Vocabulary surface to recover URNs from an applied NF → NOFORN fix
T089b — StubScheme readiness test exercising every Phase-E trait without any engine-side import (SC-010 pre-verification)

Closes Phase 5; Phase 6 polish unblocked.

T078 — `crates/scheme/tests/codec_surface.rs`

Local MockScheme (no marque-capco dependency) so the compile test stands alone. Two generic acceptors prove Codec<S> is namable as &dyn Codec<S> and <C: Codec<S>>. Runtime sanity check exercises every CodecError::Display variant.

T079 — `crates/engine/tests/audit.rs::migration_audit_has_both_urns`

Runs E001 (NF → NOFORN) on b\"SECRET//NF\", captures the AppliedFix, and verifies URN provenance is recoverable through Vocabulary<CapcoScheme>::metadata(&TOK_NOFORN):

metadata.urn non-empty, ODNI-prefixed (urn:us:gov:ic:cvenum:)
metadata.canonical == \"NF\" + metadata.banner_form == \"NOFORN\" pinned
metadata.authority.urn == metadata.urn (single-source-of-truth invariant from PR-2 round-1 review)
Vacuity guard: panics with applied-rule list if E001 doesn't fire

Architectural decision: URN provenance is recoverable, not embedded

URN provenance lives in the audit record + public Vocabulary surface — NOT as a separate field on AppliedFix. Adding fields would either bump MARQUE_AUDIT_SCHEMA to marque-mvp-3 (full schema revision) or require non-back-compat optional-field extensions on the v2 emitter. Both are out of scope for trait-surface completion.

The recoverability approach matches Constitution V's "audit records contain enough information for an auditor to reconstruct provenance" principle: the auditor composes AppliedFix.proposal.{original, replacement} with the public Vocabulary trait to recover URNs at consumption time.

If a future PR genuinely needs URNs as JSON fields, the right move is an audit-schema bump to marque-mvp-3 with a contracts/audit-record-v3.md spec landing first.

T089b — `crates/scheme/tests/adoption_readiness.rs::second_scheme_builds_without_engine_edits`

Minimal StubScheme implementing every Phase-E trait:

MarkingScheme (10 required methods)
Vocabulary<StubScheme> (8 &'static accessors)
Codec<StubScheme> (encode/decode + CodecError::SchemaMismatch exercise)
Recognizer<StubScheme> (zero-candidate Parsed::Ambiguous safe answer)
One Constraint::Conflicts declarative entry
One PageRewrite::declarative rewrite

Imports cover only marque_scheme::* + std::* — Constitution VII verification at the import level. const _: fn() = || { ... } binding declares each Phase-E trait is namable generically.

SC-010 pre-verification: a future trait change that requires engine-internal types fails to compile here BEFORE Phase F adoption begins. The "shallow adapter" promise from Constitution IV becomes a build-time gate.

Test plan

cargo test --workspace — 1184 passed, 0 failed (was 1180; +4 new tests)
cargo test -p marque-scheme --test codec_surface — 2 passed
cargo test -p marque-scheme --test adoption_readiness — 1 passed
cargo test -p marque-engine --test audit — 10 passed (was 9; +1)
cargo clippy --workspace --all-targets -- -D warnings clean

Phase 5 status

Task	Status	PR
T071–T077	✅	#143
T077a (coverage)	✅	#143
T080–T082 (build.rs)	✅	#141
T083 (FOUO removal)	✅	(Phase E doc-comment)
T084 (Vocabulary impl)	✅	#143
T085 (corpus byte-identical)	✅	#143
T078 (Codec compile-test)	✅ this PR	#145
T079 (audit URN provenance)	✅ this PR	#145
T089b (StubScheme readiness)	✅ this PR	#145

Phase 5 (US3) complete. Phase 6 polish unblocked.

🤖 Generated with Claude Code

Final Phase 5 PR landing the three trait-surface-completion tests that complete Phase 5 (US3) ahead of Phase 6 polish. ## T078 — Codec surface compile test `crates/scheme/tests/codec_surface.rs`. Asserts `Codec<S>` and `CodecError` compile cleanly with no concrete impls. Defines a local `MockScheme` (no dependency on `marque-capco`); declares two generic acceptors (`accepts_codec_by_trait_object`, `accepts_codec_generically`) so a refactor that breaks object- safety or drops the trait fails compile here. Companion test pins `Parsed::Ambiguous { candidates: vec![] }` as the canonical zero-candidate decode contract per `Codec::decode` doc-comment. ## T079 — migration-audit URN provenance `crates/engine/tests/audit.rs::migration_audit_has_both_urns`. Runs an E001 portion-mark-in-banner fix on `b"SECRET//NF"`, captures the `AppliedFix`, and verifies URN provenance is recoverable through `Vocabulary<CapcoScheme>::metadata(&TOK_NOFORN)`. Asserts: - `original` = "NF", `replacement` = "NOFORN" (canonical form pair). - `metadata.urn` is non-empty and starts with `urn:us:gov:ic:cvenum:`. - `metadata.canonical == "NF"` and `metadata.banner_form == "NOFORN"` (pin against future ODNI rename). - `metadata.authority.urn == metadata.urn` (single-source-of-truth invariant from PR-2 review round 1). - `metadata.authority.schema_version == "ISM-v2022-DEC"`. Vacuity guard panics with rule list if E001 doesn't fire on the banner-shaped input. **Architectural decision**: URN provenance is *recoverable* from the audit record + public Vocabulary surface, NOT a separate field on `AppliedFix`. Adding fields would either bump `MARQUE_AUDIT_SCHEMA` to `marque-mvp-3` (full schema rev) or require non-back-compat `skip_serializing_if` extensions on the v2 emitter. Both are out of scope for this PR. The reconstructibility approach matches Constitution V's "audit records contain enough information to reconstruct provenance" principle: the audit consumer composes `AppliedFix.proposal.{original, replacement}` with the public `Vocabulary` trait to recover URNs at audit-consumption time. ## T089b — StubScheme readiness `crates/scheme/tests/adoption_readiness.rs::second_scheme_builds_without_engine_edits`. Defines a minimal `StubScheme` that implements every Phase-E trait: - `MarkingScheme` (10 required methods) - `Vocabulary<StubScheme>` (8 accessors with `&'static` returns) - `Codec<StubScheme>` (encode/decode + `CodecError::SchemaMismatch` exercise) - `Recognizer<StubScheme>` (zero-candidate Ambiguous safe answer) - One `Constraint::Conflicts` declarative entry - One `PageRewrite::declarative` rewrite (with `&'static [CategoryId]` axes) Imports cover only `marque_scheme::*` + `std::*` — Constitution VII verification at the import level. A `const _: fn() = || { ... }` binding declares each Phase-E trait can be named generically. **SC-010 pre-verification**: a future trait change that requires engine-internal types fails to compile here BEFORE Phase F adoption begins. The "shallow adapter" promise from Constitution IV becomes a build-time gate, not a hope. ## What this PR does NOT do T079's "URNs in the audit record" interpretation deliberately avoids modifying: - `AppliedFix` data model (would bump audit schema) - `FixSource::MigrationTable` payload (would ripple to every emitter) - `marque-engine` ↔ `marque-capco` dependency edge (Constitution VII forbids; engine remains generic over `MarkingScheme`) If a future PR genuinely needs URNs as JSON fields, the right move is an audit-schema bump to `marque-mvp-3` with a corresponding contracts/audit-record-v3.md spec landing first. ## Verification - `cargo test --workspace` — 1184 passed, 0 failed (was 1180; +4) - `cargo test -p marque-scheme --test codec_surface` — 2 passed - `cargo test -p marque-scheme --test adoption_readiness` — 1 passed - `cargo test -p marque-engine --test audit` — 10 passed (was 9; +1) - `cargo clippy --workspace --all-targets -- -D warnings` clean ## Phase 5 status T078, T079, T084, T085, T089b all landed. T080–T083 + T071–T077a landed in PRs 141–143. Phase 5 (US3) is complete — Phase 6 polish unblocked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-25T15:29:56Z

🤖 Hi @bashandbone, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

github-actions · 2026-04-25T15:30:10Z

🤖 I'm sorry @bashandbone, but I was unable to process your request. Please see the logs for more details.

1. **codec_surface.rs:138 — inaccurate "no impls anywhere in marque-scheme"** The claim was wrong: T089b's `tests/adoption_readiness.rs` declares a `StubCodec` impl. Reworded to scope the claim to the `marque_scheme` library (`src/`), with an explicit note that test-side fixture scaffolding is intentional and outside the library boundary. 2. **audit.rs:817 — duplicate `source_urn`/`replacement_urn` from same source** Both URN variables were set to `metadata.urn` literally, so the downstream non-empty/prefix assertions couldn't catch a divergence between the canonical and banner-form lookups. Rewrote the test to derive the two URNs through INDEPENDENT recovery paths: - **Path 1** (source): `lookup_token_metadata("NF")` — direct string-keyed lookup of the canonical CVE value, returning the URN via `cve_file.urn`. - **Path 2** (replacement): `marking_forms::banner_to_portion("NOFORN")` → "NF" → `lookup_token_metadata("NF")` → URN via `cve_file.urn`. This is the audit-consumer's full recovery path: banner-form strings aren't in TOKEN_METADATA, so consumers must round-trip through `marking_forms` first. Then asserts: - Both URNs trace to ODNI's `urn:us:gov:ic:cvenum:` prefix. - `source_urn == replacement_urn` (NF and NOFORN are forms of the same CVE entry, so URNs match by construction). - The Vocabulary trait surface agrees with the string-keyed lookup (typed and untyped paths converge). A future ODNI release that splits NOFORN into a separate entry now makes this test loud — the equality assertion catches the divergence the previous version couldn't. ## Verification - `cargo test -p marque-engine --test audit migration_audit_has_both_urns` — 1 passed - `cargo test -p marque-scheme --test codec_surface` — 2 passed - `cargo test --workspace` — 1184 passed, 0 failed - `cargo clippy --workspace --all-targets -- -D warnings` clean Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1. **audit.rs:908 — hardcoded schema version** `metadata.authority.schema_version == "ISM-v2022-DEC"` baked the schema version literal into the test. Switched to `marque_ism::SCHEMA_VERSION` so a future schema-package bump updates the assertion automatically while still enforcing the invariant that the authority schema version matches the active bundle. 2. **adoption_readiness.rs:128 — collapsed `name`/`label` fields** `Constraint::Conflicts` carries two semantically distinct fields: `name` is the stable identifier (rule-style id), `label` is the authoritative-source citation (e.g. "CAPCO-2016 §H.4"). The stub had both set to "stub/conflicts" — Phase F adopters copying this fixture would get the wrong shape by default. Split them into `name: "stub/conflicts"` (id) and `label: "StubScheme readiness fixture (no real citation)"` (citation-shaped placeholder), with a comment documenting the distinction so the discipline is visible to future readers. 3. **audit.rs:743 — misleading T079 header comment** The original header claimed URN provenance is recovered by "composing the audit record with the public `Vocabulary<S>` surface". The actual recovery path runs through `marque_ism::generated::vocabulary::lookup_token_metadata` and `marque_ism::marking_forms::banner_to_portion` — string-keyed `marque-ism` public tables. `Vocabulary<S>` is the TYPED accessor (`TokenId`-keyed), used as a cross-check, not as the primary recovery surface. Why the distinction matters: audit consumers receive serialized records (strings, not typed `TokenId`s), so the recovery path they actually exercise is the string-keyed one. The `Vocabulary<S>` trait is for rule code that already has a typed token in hand. Rewrote the comment to make this honest: separate "Recovery path (string-keyed)" and "Cross-check: Vocabulary<S> agrees" sections. The architectural gate now describes what the test actually verifies. ## Verification - `cargo test -p marque-engine --test audit migration_audit_has_both_urns` — 1 passed - `cargo test -p marque-scheme --test adoption_readiness` — 1 passed - `cargo test --workspace` — 1184 passed, 0 failed - `cargo clippy --workspace --all-targets -- -D warnings` clean Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-04-25T15:54:14Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copilot

Pull request overview

Adds Phase 5 trait-surface completion coverage by introducing compile-/readiness tests in marque-scheme and an audit URN-provenance regression test in marque-engine, plus marking the tasks as complete in the spec tracker.

Changes:

Add Codec<S> surface compile/object-safety coverage and CodecError::Display variant coverage (crates/scheme/tests/codec_surface.rs).
Add a Phase-F “second scheme” readiness fixture implementing Phase-E traits without engine imports (crates/scheme/tests/adoption_readiness.rs).
Add an engine audit test that recovers ODNI URN provenance for the NF → NOFORN applied fix via public vocabulary lookup surfaces (crates/engine/tests/audit.rs).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
specs/004-constraints-decoder-vocab/tasks.md	Marks T078/T079/T089b as landed with implementation notes.
crates/scheme/tests/codec_surface.rs	New integration tests pinning `Codec<S>` trait usability and `CodecError` display coverage.
crates/scheme/tests/adoption_readiness.rs	New readiness test defining a `StubScheme` implementing the Phase-E trait surface without engine dependencies.
crates/engine/tests/audit.rs	New audit regression test validating URN provenance recoverability for E001 (`NF → NOFORN`).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    // The URN provenance also lives on the embedded `Authority`
+    // record; verify it agrees with the top-level URN.
+    assert_eq!(
+        metadata.authority.urn, source_urn,
+        "metadata.authority.urn must match metadata.urn — single source of truth",


Copilot flagged `audit.rs:925`: the assertion `metadata.authority.urn == metadata.urn` was framed as a "single source of truth" invariant, but `Authority::urn` is documented in `marque-scheme` as the URN of the publishing authority (example `urn:us:gov:ic:ism` — system root), which a scheme could legitimately populate at coarser granularity than a token's URN. The equality holds for CAPCO only because CAPCO's adapter chose per-CVE-file granularity (`urn:us:gov:ic:cvenum:ism:dissem`); it is not a universal `Vocabulary<S>` contract. Both fixes Copilot suggested are applied: 1. **(a) Test scoping** — rewrote the assertion's failure message to call out the CAPCO-adapter-specific provenance: per-CVE-file URN granularity makes authority.urn equal token.urn for every CAPCO token, but a future non-CAPCO scheme (CUI, NATO, JOINT) may legitimately have them differ. 2. **(b) Field doc tightening** — updated `Authority::urn` doc in `crates/scheme/src/vocabulary.rs` to acknowledge per-scheme granularity. The doc now explicitly states a scheme MAY use a coarse system-root URN OR a fine per-publishing-file URN, with CAPCO documented as using the latter. Audit consumers told not to assume a relationship between `Authority::urn` and a token's own URN beyond what the scheme's documentation states. The test passes unchanged — the assertion's behavior is unchanged; only the framing (and the underlying field's documentation) is clarified. ## Verification - `cargo test -p marque-engine --test audit migration_audit_has_both_urns` — 1 passed - `cargo test --workspace` — 1184 passed, 0 failed - `cargo clippy --workspace --all-targets -- -D warnings` clean - `cargo fmt --all` applied Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… close PR-4 tasks bookkeeping (#542) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> closed in PR 4a / #422 (not Phase 5 PR-2 / #146 as

bashandbone requested review from a team and Copilot April 25, 2026 15:29

Copilot started reviewing on behalf of bashandbone April 25, 2026 15:30 View session

This comment was marked as resolved.

Sign in to view

bashandbone requested a review from Copilot April 25, 2026 15:39

Copilot started reviewing on behalf of bashandbone April 25, 2026 15:39 View session

This comment was marked as resolved.

Sign in to view

bashandbone and others added 2 commits April 25, 2026 11:46

fix: cargo fmt

ffa936a

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

bashandbone requested a review from Copilot April 25, 2026 15:50

Copilot started reviewing on behalf of bashandbone April 25, 2026 15:51 View session

fix: spelling in adoption_readiness.rs

b3f8550

Copilot AI reviewed Apr 25, 2026

View reviewed changes

bashandbone merged commit 2c80141 into main Apr 25, 2026
12 checks passed

bashandbone deleted the feat/phase5-pr3-trait-surface-completion branch April 25, 2026 15:59

This was referenced Apr 29, 2026

feat(api): vocabulary lookup surface for editor/API integrations #254

Open

feat(docs): per-token authoritative help text (hover-on-token, API consumable) #255

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Phase 5 PR-3: trait-surface completion (T078 + T079 + T089b)#146

Phase 5 PR-3: trait-surface completion (T078 + T079 + T089b)#146
bashandbone merged 6 commits into
mainfrom
feat/phase5-pr3-trait-surface-completion

bashandbone commented Apr 25, 2026

Uh oh!

github-actions Bot commented Apr 25, 2026

Uh oh!

github-actions Bot commented Apr 25, 2026

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

codecov Bot commented Apr 25, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

bashandbone commented Apr 25, 2026

Summary

T078 — crates/scheme/tests/codec_surface.rs

T079 — crates/engine/tests/audit.rs::migration_audit_has_both_urns

Architectural decision: URN provenance is recoverable, not embedded

T089b — crates/scheme/tests/adoption_readiness.rs::second_scheme_builds_without_engine_edits

Test plan

Phase 5 status

Uh oh!

github-actions Bot commented Apr 25, 2026

Uh oh!

github-actions Bot commented Apr 25, 2026

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

codecov Bot commented Apr 25, 2026

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

T078 — `crates/scheme/tests/codec_surface.rs`

T079 — `crates/engine/tests/audit.rs::migration_audit_has_both_urns`

T089b — `crates/scheme/tests/adoption_readiness.rs::second_scheme_builds_without_engine_edits`