Skip to content

Phase 5 PR-3: trait-surface completion (T078 + T079 + T089b)#146

Merged
bashandbone merged 6 commits into
mainfrom
feat/phase5-pr3-trait-surface-completion
Apr 25, 2026
Merged

Phase 5 PR-3: trait-surface completion (T078 + T079 + T089b)#146
bashandbone merged 6 commits into
mainfrom
feat/phase5-pr3-trait-surface-completion

Conversation

@bashandbone
Copy link
Copy Markdown
Collaborator

Summary

Completes Phase 5 (US3) with three trait-surface-completion tests:

  • T078Codec<S> compile-test asserting the trait exists + is object-safe + every CodecError variant displays
  • T079 — Migration-audit URN provenance test using the Vocabulary surface to recover URNs from an applied NF → NOFORN fix
  • T089bStubScheme readiness test exercising every Phase-E trait without any engine-side import (SC-010 pre-verification)

Closes Phase 5; Phase 6 polish unblocked.

T078 — crates/scheme/tests/codec_surface.rs

Local MockScheme (no marque-capco dependency) so the compile test stands alone. Two generic acceptors prove Codec<S> is namable as &dyn Codec<S> and <C: Codec<S>>. Runtime sanity check exercises every CodecError::Display variant.

T079 — crates/engine/tests/audit.rs::migration_audit_has_both_urns

Runs E001 (NF → NOFORN) on b\"SECRET//NF\", captures the AppliedFix, and verifies URN provenance is recoverable through Vocabulary<CapcoScheme>::metadata(&TOK_NOFORN):

  • metadata.urn non-empty, ODNI-prefixed (urn:us:gov:ic:cvenum:)
  • metadata.canonical == \"NF\" + metadata.banner_form == \"NOFORN\" pinned
  • metadata.authority.urn == metadata.urn (single-source-of-truth invariant from PR-2 round-1 review)
  • Vacuity guard: panics with applied-rule list if E001 doesn't fire

Architectural decision: URN provenance is recoverable, not embedded

URN provenance lives in the audit record + public Vocabulary surface — NOT as a separate field on AppliedFix. Adding fields would either bump MARQUE_AUDIT_SCHEMA to marque-mvp-3 (full schema revision) or require non-back-compat optional-field extensions on the v2 emitter. Both are out of scope for trait-surface completion.

The recoverability approach matches Constitution V's "audit records contain enough information for an auditor to reconstruct provenance" principle: the auditor composes AppliedFix.proposal.{original, replacement} with the public Vocabulary trait to recover URNs at consumption time.

If a future PR genuinely needs URNs as JSON fields, the right move is an audit-schema bump to marque-mvp-3 with a contracts/audit-record-v3.md spec landing first.

T089b — crates/scheme/tests/adoption_readiness.rs::second_scheme_builds_without_engine_edits

Minimal StubScheme implementing every Phase-E trait:

  • MarkingScheme (10 required methods)
  • Vocabulary<StubScheme> (8 &'static accessors)
  • Codec<StubScheme> (encode/decode + CodecError::SchemaMismatch exercise)
  • Recognizer<StubScheme> (zero-candidate Parsed::Ambiguous safe answer)
  • One Constraint::Conflicts declarative entry
  • One PageRewrite::declarative rewrite

Imports cover only marque_scheme::* + std::* — Constitution VII verification at the import level. const _: fn() = || { ... } binding declares each Phase-E trait is namable generically.

SC-010 pre-verification: a future trait change that requires engine-internal types fails to compile here BEFORE Phase F adoption begins. The "shallow adapter" promise from Constitution IV becomes a build-time gate.

Test plan

  • cargo test --workspace — 1184 passed, 0 failed (was 1180; +4 new tests)
  • cargo test -p marque-scheme --test codec_surface — 2 passed
  • cargo test -p marque-scheme --test adoption_readiness — 1 passed
  • cargo test -p marque-engine --test audit — 10 passed (was 9; +1)
  • cargo clippy --workspace --all-targets -- -D warnings clean

Phase 5 status

Task Status PR
T071–T077 #143
T077a (coverage) #143
T080–T082 (build.rs) #141
T083 (FOUO removal) (Phase E doc-comment)
T084 (Vocabulary impl) #143
T085 (corpus byte-identical) #143
T078 (Codec compile-test) ✅ this PR #145
T079 (audit URN provenance) ✅ this PR #145
T089b (StubScheme readiness) ✅ this PR #145

Phase 5 (US3) complete. Phase 6 polish unblocked.

🤖 Generated with Claude Code

Final Phase 5 PR landing the three trait-surface-completion tests
that complete Phase 5 (US3) ahead of Phase 6 polish.

## T078 — Codec surface compile test

`crates/scheme/tests/codec_surface.rs`. Asserts `Codec<S>` and
`CodecError` compile cleanly with no concrete impls. Defines a
local `MockScheme` (no dependency on `marque-capco`); declares two
generic acceptors (`accepts_codec_by_trait_object`,
`accepts_codec_generically`) so a refactor that breaks object-
safety or drops the trait fails compile here. Companion test
pins `Parsed::Ambiguous { candidates: vec![] }` as the canonical
zero-candidate decode contract per `Codec::decode` doc-comment.

## T079 — migration-audit URN provenance

`crates/engine/tests/audit.rs::migration_audit_has_both_urns`.
Runs an E001 portion-mark-in-banner fix on `b"SECRET//NF"`,
captures the `AppliedFix`, and verifies URN provenance is
recoverable through `Vocabulary<CapcoScheme>::metadata(&TOK_NOFORN)`.

Asserts:
- `original` = "NF", `replacement` = "NOFORN" (canonical form pair).
- `metadata.urn` is non-empty and starts with `urn:us:gov:ic:cvenum:`.
- `metadata.canonical == "NF"` and `metadata.banner_form == "NOFORN"`
  (pin against future ODNI rename).
- `metadata.authority.urn == metadata.urn` (single-source-of-truth
  invariant from PR-2 review round 1).
- `metadata.authority.schema_version == "ISM-v2022-DEC"`.

Vacuity guard panics with rule list if E001 doesn't fire on the
banner-shaped input.

**Architectural decision**: URN provenance is *recoverable* from the
audit record + public Vocabulary surface, NOT a separate field on
`AppliedFix`. Adding fields would either bump `MARQUE_AUDIT_SCHEMA`
to `marque-mvp-3` (full schema rev) or require non-back-compat
`skip_serializing_if` extensions on the v2 emitter. Both are out
of scope for this PR. The reconstructibility approach matches
Constitution V's "audit records contain enough information to
reconstruct provenance" principle: the audit consumer composes
`AppliedFix.proposal.{original, replacement}` with the public
`Vocabulary` trait to recover URNs at audit-consumption time.

## T089b — StubScheme readiness

`crates/scheme/tests/adoption_readiness.rs::second_scheme_builds_without_engine_edits`.

Defines a minimal `StubScheme` that implements every Phase-E trait:
- `MarkingScheme` (10 required methods)
- `Vocabulary<StubScheme>` (8 accessors with `&'static` returns)
- `Codec<StubScheme>` (encode/decode + `CodecError::SchemaMismatch`
  exercise)
- `Recognizer<StubScheme>` (zero-candidate Ambiguous safe answer)
- One `Constraint::Conflicts` declarative entry
- One `PageRewrite::declarative` rewrite (with `&'static [CategoryId]`
  axes)

Imports cover only `marque_scheme::*` + `std::*` — Constitution VII
verification at the import level. A `const _: fn() = ||  { ... }`
binding declares each Phase-E trait can be named generically.

**SC-010 pre-verification**: a future trait change that requires
engine-internal types fails to compile here BEFORE Phase F adoption
begins. The "shallow adapter" promise from Constitution IV becomes
a build-time gate, not a hope.

## What this PR does NOT do

T079's "URNs in the audit record" interpretation deliberately avoids
modifying:

- `AppliedFix` data model (would bump audit schema)
- `FixSource::MigrationTable` payload (would ripple to every
  emitter)
- `marque-engine` ↔ `marque-capco` dependency edge (Constitution VII
  forbids; engine remains generic over `MarkingScheme`)

If a future PR genuinely needs URNs as JSON fields, the right move
is an audit-schema bump to `marque-mvp-3` with a corresponding
contracts/audit-record-v3.md spec landing first.

## Verification

- `cargo test --workspace` — 1184 passed, 0 failed (was 1180; +4)
- `cargo test -p marque-scheme --test codec_surface` — 2 passed
- `cargo test -p marque-scheme --test adoption_readiness` — 1 passed
- `cargo test -p marque-engine --test audit` — 10 passed (was 9; +1)
- `cargo clippy --workspace --all-targets -- -D warnings` clean

## Phase 5 status

T078, T079, T084, T085, T089b all landed. T080–T083 + T071–T077a
landed in PRs 141–143. Phase 5 (US3) is complete — Phase 6 polish
unblocked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bashandbone bashandbone requested review from a team and Copilot April 25, 2026 15:29
@github-actions
Copy link
Copy Markdown

🤖 Hi @bashandbone, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

@github-actions
Copy link
Copy Markdown

🤖 I'm sorry @bashandbone, but I was unable to process your request. Please see the logs for more details.

This comment was marked as resolved.

1. **codec_surface.rs:138 — inaccurate "no impls anywhere in marque-scheme"**
   The claim was wrong: T089b's `tests/adoption_readiness.rs` declares a
   `StubCodec` impl. Reworded to scope the claim to the `marque_scheme`
   library (`src/`), with an explicit note that test-side fixture
   scaffolding is intentional and outside the library boundary.

2. **audit.rs:817 — duplicate `source_urn`/`replacement_urn` from same source**
   Both URN variables were set to `metadata.urn` literally, so the
   downstream non-empty/prefix assertions couldn't catch a divergence
   between the canonical and banner-form lookups. Rewrote the test to
   derive the two URNs through INDEPENDENT recovery paths:

   - **Path 1** (source): `lookup_token_metadata("NF")` — direct
     string-keyed lookup of the canonical CVE value, returning the
     URN via `cve_file.urn`.
   - **Path 2** (replacement): `marking_forms::banner_to_portion("NOFORN")`
     → "NF" → `lookup_token_metadata("NF")` → URN via `cve_file.urn`.
     This is the audit-consumer's full recovery path: banner-form
     strings aren't in TOKEN_METADATA, so consumers must round-trip
     through `marking_forms` first.

   Then asserts:
   - Both URNs trace to ODNI's `urn:us:gov:ic:cvenum:` prefix.
   - `source_urn == replacement_urn` (NF and NOFORN are forms of the
     same CVE entry, so URNs match by construction).
   - The Vocabulary trait surface agrees with the string-keyed lookup
     (typed and untyped paths converge).

   A future ODNI release that splits NOFORN into a separate entry now
   makes this test loud — the equality assertion catches the
   divergence the previous version couldn't.

## Verification

- `cargo test -p marque-engine --test audit migration_audit_has_both_urns` — 1 passed
- `cargo test -p marque-scheme --test codec_surface` — 2 passed
- `cargo test --workspace` — 1184 passed, 0 failed
- `cargo clippy --workspace --all-targets -- -D warnings` clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

This comment was marked as resolved.

bashandbone and others added 2 commits April 25, 2026 11:46
1. **audit.rs:908 — hardcoded schema version**
   `metadata.authority.schema_version == "ISM-v2022-DEC"` baked the
   schema version literal into the test. Switched to
   `marque_ism::SCHEMA_VERSION` so a future schema-package bump
   updates the assertion automatically while still enforcing the
   invariant that the authority schema version matches the active
   bundle.

2. **adoption_readiness.rs:128 — collapsed `name`/`label` fields**
   `Constraint::Conflicts` carries two semantically distinct
   fields: `name` is the stable identifier (rule-style id), `label`
   is the authoritative-source citation (e.g. "CAPCO-2016 §H.4").
   The stub had both set to "stub/conflicts" — Phase F adopters
   copying this fixture would get the wrong shape by default. Split
   them into `name: "stub/conflicts"` (id) and
   `label: "StubScheme readiness fixture (no real citation)"`
   (citation-shaped placeholder), with a comment documenting the
   distinction so the discipline is visible to future readers.

3. **audit.rs:743 — misleading T079 header comment**
   The original header claimed URN provenance is recovered by
   "composing the audit record with the public `Vocabulary<S>`
   surface". The actual recovery path runs through
   `marque_ism::generated::vocabulary::lookup_token_metadata` and
   `marque_ism::marking_forms::banner_to_portion` — string-keyed
   `marque-ism` public tables. `Vocabulary<S>` is the TYPED
   accessor (`TokenId`-keyed), used as a cross-check, not as the
   primary recovery surface.

   Why the distinction matters: audit consumers receive serialized
   records (strings, not typed `TokenId`s), so the recovery path
   they actually exercise is the string-keyed one. The
   `Vocabulary<S>` trait is for rule code that already has a typed
   token in hand.

   Rewrote the comment to make this honest: separate "Recovery
   path (string-keyed)" and "Cross-check: Vocabulary<S> agrees"
   sections. The architectural gate now describes what the test
   actually verifies.

## Verification

- `cargo test -p marque-engine --test audit migration_audit_has_both_urns` — 1 passed
- `cargo test -p marque-scheme --test adoption_readiness` — 1 passed
- `cargo test --workspace` — 1184 passed, 0 failed
- `cargo clippy --workspace --all-targets -- -D warnings` clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Phase 5 trait-surface completion coverage by introducing compile-/readiness tests in marque-scheme and an audit URN-provenance regression test in marque-engine, plus marking the tasks as complete in the spec tracker.

Changes:

  • Add Codec<S> surface compile/object-safety coverage and CodecError::Display variant coverage (crates/scheme/tests/codec_surface.rs).
  • Add a Phase-F “second scheme” readiness fixture implementing Phase-E traits without engine imports (crates/scheme/tests/adoption_readiness.rs).
  • Add an engine audit test that recovers ODNI URN provenance for the NF → NOFORN applied fix via public vocabulary lookup surfaces (crates/engine/tests/audit.rs).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
specs/004-constraints-decoder-vocab/tasks.md Marks T078/T079/T089b as landed with implementation notes.
crates/scheme/tests/codec_surface.rs New integration tests pinning Codec<S> trait usability and CodecError display coverage.
crates/scheme/tests/adoption_readiness.rs New readiness test defining a StubScheme implementing the Phase-E trait surface without engine dependencies.
crates/engine/tests/audit.rs New audit regression test validating URN provenance recoverability for E001 (NF → NOFORN).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/engine/tests/audit.rs Outdated
Comment on lines +921 to +925
// The URN provenance also lives on the embedded `Authority`
// record; verify it agrees with the top-level URN.
assert_eq!(
metadata.authority.urn, source_urn,
"metadata.authority.urn must match metadata.urn — single source of truth",
Copilot flagged `audit.rs:925`: the assertion
`metadata.authority.urn == metadata.urn` was framed as a "single
source of truth" invariant, but `Authority::urn` is documented in
`marque-scheme` as the URN of the publishing authority (example
`urn:us:gov:ic:ism` — system root), which a scheme could
legitimately populate at coarser granularity than a token's URN.
The equality holds for CAPCO only because CAPCO's adapter chose
per-CVE-file granularity (`urn:us:gov:ic:cvenum:ism:dissem`); it
is not a universal `Vocabulary<S>` contract.

Both fixes Copilot suggested are applied:

1. **(a) Test scoping** — rewrote the assertion's failure message
   to call out the CAPCO-adapter-specific provenance: per-CVE-file
   URN granularity makes authority.urn equal token.urn for every
   CAPCO token, but a future non-CAPCO scheme (CUI, NATO, JOINT)
   may legitimately have them differ.

2. **(b) Field doc tightening** — updated `Authority::urn` doc in
   `crates/scheme/src/vocabulary.rs` to acknowledge per-scheme
   granularity. The doc now explicitly states a scheme MAY use a
   coarse system-root URN OR a fine per-publishing-file URN, with
   CAPCO documented as using the latter. Audit consumers told not
   to assume a relationship between `Authority::urn` and a token's
   own URN beyond what the scheme's documentation states.

The test passes unchanged — the assertion's behavior is unchanged;
only the framing (and the underlying field's documentation) is
clarified.

## Verification

- `cargo test -p marque-engine --test audit migration_audit_has_both_urns` — 1 passed
- `cargo test --workspace` — 1184 passed, 0 failed
- `cargo clippy --workspace --all-targets -- -D warnings` clean
- `cargo fmt --all` applied

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bashandbone bashandbone merged commit 2c80141 into main Apr 25, 2026
12 checks passed
@bashandbone bashandbone deleted the feat/phase5-pr3-trait-surface-completion branch April 25, 2026 15:59
bashandbone added a commit that referenced this pull request May 18, 2026
… close PR-4 tasks bookkeeping (#542)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
closed in PR 4a / #422 (not Phase 5 PR-2 / #146 as
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants