RFC: CAPCO/ISM coupling in domain-neutral infrastructure blocks second grammar adoption #641

@bashandbone

Summary

Marque's stated architecture is grammar-agnostic: marque-scheme, marque-rules, and marque-engine carry no CAPCO-specific logic. A systematic audit of all seven crates (engine, core, scheme, rules, WASM, CLI, server) finds this claim is partially false. CAPCO/ISM concerns have infiltrated domain-neutral infrastructure at multiple severity levels, ranging from type-system blockers that make a second grammar impossible to register, through API naming that carries CAPCO semantics into generic interfaces, to config schema coupling that would confuse a non-CAPCO consumer at startup.

This issue documents every coupling point found, organized by severity, and proposes concrete fixes for each tier. It is intended as both a correctness report and a roadmap.

Crucially, the goal is not merely "add a second grammar as a plug-in." Each grammar must be a first-class citizen with equal architectural standing. Where two grammars share semantic coherence — overlapping concepts that can be translated between systems — Marque must support running both grammars concurrently and translating interpretations across them. A DOE deployment should be able to receive an ISM-marked document and emit DOE AEA-appropriate guidance. An ISM deployment should be able to do the reverse. The same principle applies across any grammar pair with coherent overlaps.

This is not a future roadmap item. The CUI implementation (coming soon) requires real-time translation between NARA CUI markings and ISM/CAPCO classification markings. The coupling described in this issue must be resolved for that work to land cleanly.

The healthcare billing code hypothetical uses a HealthcareScheme (ICD-10, CPT, DRG, CCI edits, insurer-specific codes) as a concrete stress test for each coupling point.

Relationship to #640: Issue #640 covers corpus/tooling grammar-agnosticism and explicitly scopes out marque-scheme, marque-rules, and marque-engine as "already grammar-agnostic." This issue shows that scope exclusion is premature — the Rust crate infrastructure has its own coupling layer that must be addressed before any second grammar can be registered.


Multi-Grammar Concurrency and Cross-Grammar Translation

Before cataloguing the specific coupling points, this section articulates the architectural requirement they block.

First-class grammar citizenship

Each grammar is a peer. There is no "primary grammar" and no "grammar that ships with Marque." CAPCO is Marque's first grammar; it is not Marque's grammar. The infrastructure must treat CAPCO, CUI, NATO, FGI-standalone, AEA, and future grammars identically — same registration path, same rule surface, same diagnostic representation.

Multi-grammar concurrency

A single Engine instance should be capable of holding multiple grammars concurrently and dispatching candidates to each. When a document contains markings from two grammar systems that can coexist on the same document (e.g., a classified document that carries both ISM classification markings and NARA CUI control markings), the engine evaluates both independently and then runs cross-grammar coherence rules over the joint result.

Cross-grammar translation surface

Where two grammars have semantic coherence — a concept in grammar A that maps to a concept in grammar B — Marque should provide a first-class translation path. Translation is not the same as rule evaluation; it is the mapping of one grammar's interpretation onto another grammar's representation.

Concrete examples:

  • ISM classification level → CUI category: TOP SECRET implies a superset of CUI categories that must appear; SECRET//NOFORN implies certain CUI dissemination controls are required.
  • NARA CUI control → ISM dissemination control: CUI//SP-ITAR has an ISM analog in dissemination controls; a translator can propose the ISM equivalent.
  • DOE AEA → ISM AEA axis: RD, FRD, TFNI, ATOMAL are already in ISM; a DOE deployment may prefer DOE-specific guidance, citations, and fix actions over ISM ones.
  • Healthcare (analogous): A hospital's internal procedure codes map onto CPT codes; an insurer's claim adjudication codes map onto DRG groupings. An engine processing both grammars concurrently can validate consistency across the translation boundary and propose corrections in either vocabulary.

Deployment-mode configuration

When two grammars overlap, a deployment configures which grammar's interpretation is favored for diagnostic messaging, fix actions, and citations. A DOE deployment favors DOE AEA guidance; an IC deployment favors CAPCO. But both grammars remain active for validation — the deployment-mode choice affects presentation, not correctness evaluation.

Proposed type surface

The translation requirement implies a new trait surface in marque-scheme:

/// A directional translation from grammar A to grammar B.
/// Implemented by a grammar crate (or a standalone interop crate) that
/// understands the coherence mapping between two grammars.
pub trait Translate<A: MarkingScheme, B: MarkingScheme>: Send + Sync {
    /// Given a marking in grammar A's canonical form, produce a translation
    /// proposal in grammar B's canonical form (or None if no mapping exists).
    fn translate(&self, from: &A::Canonical) -> Option<TranslationProposal<B>>;

    /// Validate that an existing B marking is coherent with an A marking.
    /// Returns coherence diagnostics if they diverge.
    fn coherence_check(
        &self,
        a: &A::Canonical,
        b: &B::Canonical,
        ctx: &CoherenceContext<A, B>,
    ) -> Vec<CoherenceDiagnostic<A, B>>;
}

/// A cross-grammar Engine that holds multiple grammar instances and an
/// optional set of translators between them.
pub struct MultiGrammarEngine<Grammars> { ... }

CoherenceDiagnostic<A, B> carries the offending fields in both grammars' representations, which grammar's interpretation was favored, and proposed translations in each direction. TranslationProposal<B> is a FixIntent-like struct naming the proposed marking in B's terms.

This surface does not exist today. None of the seven audited crates contains any structure for grammar-pair registration, translation, or coherence checking.
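As a rough illustration of how a grammar pair might implement the proposed trait, here is a minimal, self-contained sketch. Everything in it is a stand-in: `MarkingScheme` is reduced to a single associated type, `TranslationProposal` carries only a proposed value and a rationale, and `HospitalScheme`, `CptScheme`, and `HospitalToCpt` are hypothetical toy grammars borrowed from the healthcare example later in this issue. It is not Marque's actual API.

```rust
/// Trimmed-down stand-in for the real trait: only the associated type
/// needed by the translation surface.
pub trait MarkingScheme {
    type Canonical;
}

/// Hypothetical minimal proposal type: the marking expressed in B's terms.
pub struct TranslationProposal<B: MarkingScheme> {
    pub proposed: B::Canonical,
    pub rationale: &'static str,
}

/// Directional translation from grammar A to grammar B.
pub trait Translate<A: MarkingScheme, B: MarkingScheme>: Send + Sync {
    fn translate(&self, from: &A::Canonical) -> Option<TranslationProposal<B>>;
}

// Two toy grammars (assumptions, for illustration only).
pub struct HospitalScheme;
pub struct CptScheme;

impl MarkingScheme for HospitalScheme {
    type Canonical = String; // internal procedure code
}
impl MarkingScheme for CptScheme {
    type Canonical = u32; // numeric CPT code
}

/// An interop type that knows the mapping between the two grammars.
pub struct HospitalToCpt;

impl Translate<HospitalScheme, CptScheme> for HospitalToCpt {
    fn translate(&self, from: &String) -> Option<TranslationProposal<CptScheme>> {
        match from.as_str() {
            "HOSP-0042-LAPCHOL" => Some(TranslationProposal {
                proposed: 47562,
                rationale: "laparoscopic cholecystectomy",
            }),
            // No known mapping: the translator declines rather than guessing.
            _ => None,
        }
    }
}
```

Returning `Option` keeps "no mapping exists" distinct from "mapping exists but is incoherent", which is the job of `coherence_check`.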


Tier 1 — Architecture Blockers

These are type-system violations that make it impossible to implement Rule<OtherScheme> or wire a second grammar into Engine without code changes in the core crates.

T1-1: Rule::check takes &CanonicalAttrs (ISM concrete type), not &S::Canonical

File: crates/rules/src/lib.rs ~line 1388

fn check(&self, attrs: &CanonicalAttrs, ctx: &RuleContext<'_>) -> Vec<Diagnostic<S>>;

CanonicalAttrs is marque_ism's owned representation of a parsed marking, carrying fields like sci_markings, sar_markings, dissem_controls, rel_to, etc. The type parameter S: MarkingScheme governs the output (Diagnostic<S>) but NOT the input. A healthcare rule receives a parsed claim, not a CanonicalAttrs. Implementing Rule<HealthcareScheme> is impossible without either faking CanonicalAttrs fields or changing the trait signature.

In the multi-grammar context, this is doubly blocking: a CoherenceRule<A, B> must receive both &A::Canonical and &B::Canonical. The current signature cannot accommodate this even after generification to &S::Canonical.

Proposed fix: Introduce MarkingScheme::Canonical as an associated type; change the signature to:

fn check(&self, attrs: &S::Canonical, ctx: &RuleContext<'_, S>) -> Vec<Diagnostic<S>>;

For cross-grammar coherence rules, introduce a separate CoherenceRule<A, B> trait:

pub trait CoherenceRule<A: MarkingScheme, B: MarkingScheme>: Send + Sync {
    fn check_coherence(
        &self,
        a_attrs: &A::Canonical,
        b_attrs: &B::Canonical,
        ctx: &CoherenceContext<A, B>,
    ) -> Vec<CoherenceDiagnostic<A, B>>;
}
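To make the effect of the proposed signature concrete, the following sketch shows a rule written against a non-ISM canonical type. It assumes a simplified `Rule<S>` without `RuleContext`, a non-generic `Diagnostic`, and a hypothetical `HealthcareScheme` whose canonical form is a `Claim`; the rule ID `B001` is invented for the example.

```rust
/// Stand-in for the real trait, reduced to the associated type under discussion.
pub trait MarkingScheme {
    type Canonical;
}

/// Simplified diagnostic (the real one is generic over the scheme).
#[derive(Debug, PartialEq)]
pub struct Diagnostic {
    pub rule_id: &'static str,
    pub message: String,
}

/// The rule input is now the scheme's own canonical type.
pub trait Rule<S: MarkingScheme>: Send + Sync {
    fn check(&self, attrs: &S::Canonical) -> Vec<Diagnostic>;
}

// Hypothetical second grammar.
pub struct HealthcareScheme;

pub struct Claim {
    pub procedure_codes: Vec<String>,
    pub diagnosis_codes: Vec<String>,
}

impl MarkingScheme for HealthcareScheme {
    type Canonical = Claim;
}

/// Toy medical-necessity rule: a billed procedure needs at least one diagnosis.
pub struct RequiresDiagnosis;

impl Rule<HealthcareScheme> for RequiresDiagnosis {
    fn check(&self, claim: &Claim) -> Vec<Diagnostic> {
        if !claim.procedure_codes.is_empty() && claim.diagnosis_codes.is_empty() {
            vec![Diagnostic {
                rule_id: "B001",
                message: "procedure billed without a qualifying diagnosis".to_owned(),
            }]
        } else {
            Vec::new()
        }
    }
}
```

With the current `&CanonicalAttrs` parameter, `RequiresDiagnosis` could not be written at all, because `Claim` has no relationship to the ISM attribute struct.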

T1-2: RuleContext fields are ISM-native types

File: crates/rules/src/lib.rs

| Line | Field | ISM type |
|------|-------|----------|
| ~443 | marking_type: MarkingType | marque_ism::MarkingType (Portion/Banner/Cab/PageBreak) |
| ~446 | zone: Option<Zone> | marque_ism::Zone |
| ~448 | position: Option<DocumentPosition> | marque_ism::DocumentPosition |
| ~483 | page_portions: Option<Arc<Box<[CanonicalAttrs]>>> | ISM page accumulator |
| ~512 | page_marking: Option<Arc<marque_ism::ProjectedMarking>> | ISM lattice roll-up |
| ~540 | pre_pass_1_attrs: Option<&'a CanonicalAttrs> | ISM borrow |

These are further compounded by crates/rules/src/lib.rs re-exporting ISM types into the marque_rules public namespace:

pub use marque_ism::{DocumentPosition, MarkingType, Zone};

A HealthcareScheme rule context would need ClaimLineType, ClaimZone, ClaimPosition — these fields would always be None or carry structurally meaningless ISM values.

Proposed fix: Make RuleContext generic — RuleContext<S: MarkingScheme> — with page_portions: Option<Arc<Box<[S::Canonical]>>>, page_marking: Option<Arc<S::Projected>>, and a S::MarkingKind associated type to replace MarkingType. The pub use marque_ism::... re-exports are removed from marque-rules.
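A minimal sketch of what a generic context could look like follows. It assumes three associated types on `MarkingScheme` (`Canonical`, `Projected`, `MarkingKind`), omits the borrow lifetime and the remaining fields of the real `RuleContext`, and uses hypothetical healthcare types (`ClaimLine`, `ClaimSummary`, `ClaimLineType`) purely to show a second instantiation.

```rust
use std::sync::Arc;

/// Stand-in trait carrying only the associated types the context needs.
pub trait MarkingScheme {
    type Canonical;
    type Projected;
    type MarkingKind: Copy + PartialEq + std::fmt::Debug;
}

/// Generic context: every formerly ISM-native field is scheme-defined.
pub struct RuleContext<S: MarkingScheme> {
    pub marking_kind: S::MarkingKind,
    pub page_portions: Option<Arc<Box<[S::Canonical]>>>,
    pub page_marking: Option<Arc<S::Projected>>,
}

impl<S: MarkingScheme> RuleContext<S> {
    /// Number of accumulated items on the current page (0 if none recorded).
    pub fn portion_count(&self) -> usize {
        self.page_portions.as_ref().map_or(0, |p| p.len())
    }
}

// Hypothetical healthcare instantiation.
pub struct HealthcareScheme;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ClaimLineType {
    Procedure,
    Diagnosis,
}

pub struct ClaimLine {
    pub code: String,
}

pub struct ClaimSummary {
    pub total_lines: usize,
}

impl MarkingScheme for HealthcareScheme {
    type Canonical = ClaimLine;
    type Projected = ClaimSummary;
    type MarkingKind = ClaimLineType;
}
```

Under this shape a healthcare rule sees `ClaimLineType` where a CAPCO rule sees the ISM marking type, and neither carries the other's fields as dead weight.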


T1-3: Engine struct fields hardcoded to CapcoScheme, no multi-grammar support

File: crates/engine/src/engine.rs ~lines 222, 261

pub struct Engine {
    rule_sets: Vec<Box<dyn RuleSet<CapcoScheme>>>,
    scheme: CapcoScheme,
    // ...
}

Engine::new / Engine::with_clock accept Vec<Box<dyn RuleSet<CapcoScheme>>> (not a generic S). A Box<dyn RuleSet<HealthcareScheme>> cannot be stored here. The constructors carry a misleading <S: MarkingScheme> type parameter — it's used only to call scheme.page_rewrites() then immediately discarded:

File: crates/engine/src/engine.rs ~lines 520, 541–567

let bridge_scheme = CapcoScheme::new();
// ...
drop(scheme);  // ← user-supplied scheme silently discarded
Self::with_clock_prepared(config, rule_sets, clock, bridge_scheme, ...)

The Engine<S> generification tracked since PR 3c.B is necessary but not sufficient for the multi-grammar requirement. A generified Engine<S> handles one grammar at a time; running ISM and NARA CUI concurrently requires holding multiple grammar instances. The design must account for this from the start rather than requiring another breaking change after Engine<S> lands.

Proposed fix (two-stage):

  1. Immediate: Generify to Engine<S> per the existing plan; eliminate the silent drop(scheme).
  2. Near-term (CUI): Introduce MultiGrammarEngine that wraps multiple Engine<S> instances and a Vec<Box<dyn Translate<A, B>>> registry; runs single-grammar rules independently then coherence rules over joint output.

T1-4: bridge_constraint_diagnostic maps hardcoded CAPCO label prefixes to rule IDs

File: crates/engine/src/engine.rs ~lines 2264–2279

let rule_id = if v.constraint_label.starts_with("class-floor/") ... {
    RuleId::new("E058")
} else if v.constraint_label.starts_with("sci-per-system/") ... {
    RuleId::new("E059")
} else if ... v.constraint_label == "capco/noforn-conflicts-rel-to" {
    RuleId::new("E053")
} else {
    RuleId::new("E008")
}

In a multi-grammar engine, constraint labels from any grammar other than CAPCO match none of these prefixes, so they all fall through to the E008 catch-all.

Proposed fix: bridge_constraint_diagnostic should delegate to scheme.constraint_rule_id(label: &str) -> RuleId. In MultiGrammarEngine, each grammar provides its own mapping; the engine routes by constraint namespace prefix.
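One way the delegation could be shaped is sketched below. The trait name `ConstraintRuleIds`, the `constraint_prefixes` method, the `cui/` namespace, and the `CUI-001` rule ID are all assumptions made for the example. The CAPCO arm reproduces only the prefix checks visible in the excerpt above and omits the conditions elided there.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RuleId(pub &'static str);

/// Hypothetical per-grammar hook: each grammar declares which constraint
/// namespaces it owns and how labels in them map to rule IDs.
pub trait ConstraintRuleIds {
    fn constraint_prefixes(&self) -> &'static [&'static str];
    fn constraint_rule_id(&self, label: &str) -> RuleId;
}

pub struct Capco;

impl ConstraintRuleIds for Capco {
    fn constraint_prefixes(&self) -> &'static [&'static str] {
        &["class-floor/", "sci-per-system/", "capco/"]
    }
    fn constraint_rule_id(&self, label: &str) -> RuleId {
        if label.starts_with("class-floor/") {
            RuleId("E058")
        } else if label.starts_with("sci-per-system/") {
            RuleId("E059")
        } else if label == "capco/noforn-conflicts-rel-to" {
            RuleId("E053")
        } else {
            RuleId("E008")
        }
    }
}

pub struct Cui;

impl ConstraintRuleIds for Cui {
    fn constraint_prefixes(&self) -> &'static [&'static str] {
        &["cui/"]
    }
    fn constraint_rule_id(&self, _label: &str) -> RuleId {
        RuleId("CUI-001") // invented ID, for illustration
    }
}

/// Engine-side routing: pick the grammar that owns the label's namespace,
/// then let that grammar choose the rule ID. `None` means no grammar claimed it.
pub fn bridge_rule_id(schemes: &[&dyn ConstraintRuleIds], label: &str) -> Option<RuleId> {
    schemes
        .iter()
        .find(|s| s.constraint_prefixes().iter().any(|p| label.starts_with(*p)))
        .map(|s| s.constraint_rule_id(label))
}
```

Returning `None` for an unclaimed label makes a missing registration visible instead of silently collapsing it into a CAPCO catch-all.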


T1-5: Scanner grammar patterns are hardcoded CAPCO

File: crates/core/src/scanner.rs ~lines 131–149, 168

const BANNER_PREFIXES: &[&[u8]] = &[
    b"TOP SECRET", b"COSMIC TOP SECRET", b"TS//",
    b"SECRET", b"S//", b"CONFIDENTIAL", b"C//",
    b"RESTRICTED", b"UNCLASSIFIED", b"U//", b"//", b"NATO ",
];
const CAB_LABEL: &[u8] = b"Classified By:";

Scanner has no strategy injection point. The NARA CUI scanner would look for CUI// and CUI CONTROLLED header strings; a healthcare scanner would look for ICD-10:, CPT:, DX:. In a multi-grammar scan, the scanner must be able to run all registered grammar strategies over the same input and union candidates.

Proposed fix: Introduce a ScanStrategy trait injectable at Scanner::new. In multi-grammar mode, Scanner accepts a Vec<Box<dyn ScanStrategy>> and tags each candidate with its originating grammar. The CAPCO strategy is unchanged; CUI and healthcare strategies are additive.
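The injection point might look roughly like the sketch below. It is deliberately naive: it only tests line starts against byte prefixes, the `matches_line` method and `Candidate` shape are assumptions, and the prefix lists are abbreviated (the CUI strings are the ones mentioned above). The real scanner's candidate detection is more involved.

```rust
/// A scan hit, tagged with the grammar whose strategy produced it.
#[derive(Debug, PartialEq, Eq)]
pub struct Candidate {
    pub grammar_id: &'static str,
    pub offset: usize, // byte offset of the line start
}

/// Hypothetical strategy hook: one implementation per grammar.
pub trait ScanStrategy: Send + Sync {
    fn grammar_id(&self) -> &'static str;
    fn matches_line(&self, line: &[u8]) -> bool;
}

/// Simple prefix-based strategy, enough for this illustration.
pub struct PrefixStrategy {
    pub grammar_id: &'static str,
    pub prefixes: &'static [&'static [u8]],
}

impl ScanStrategy for PrefixStrategy {
    fn grammar_id(&self) -> &'static str {
        self.grammar_id
    }
    fn matches_line(&self, line: &[u8]) -> bool {
        self.prefixes.iter().any(|&p| line.starts_with(p))
    }
}

pub const CAPCO_PREFIXES: &[&[u8]] = &[b"TOP SECRET", b"SECRET", b"S//"];
pub const CUI_PREFIXES: &[&[u8]] = &[b"CUI//", b"CUI CONTROLLED"];

pub struct Scanner {
    strategies: Vec<Box<dyn ScanStrategy>>,
}

impl Scanner {
    pub fn new(strategies: Vec<Box<dyn ScanStrategy>>) -> Self {
        Scanner { strategies }
    }

    /// Run every registered strategy over each line and union the hits.
    pub fn scan(&self, input: &[u8]) -> Vec<Candidate> {
        let mut out = Vec::new();
        let mut offset = 0;
        for line in input.split(|b| *b == b'\n') {
            for s in &self.strategies {
                if s.matches_line(line) {
                    out.push(Candidate { grammar_id: s.grammar_id(), offset });
                }
            }
            offset += line.len() + 1; // +1 for the newline consumed by split
        }
        out
    }
}
```

Because every candidate carries its `grammar_id`, a later stage can route it to the matching parser without re-detecting the grammar.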


T1-6: Parser output type is ISM-native; no multi-grammar parse dispatch

File: crates/core/src/parser.rs ~lines 27–39

The parser imports 20+ ISM attribute types directly. The output ParsedMarking<'src> wraps ParsedAttrs<'src>, an ISM concrete type. There is no grammar dispatch layer — the parser assumes the input is CAPCO.

In multi-grammar mode, a candidate tagged by the scanner as "possible CUI" must be dispatched to the CUI parser, not the CAPCO parser. This requires a ParseStrategy trait parallel to ScanStrategy.

Proposed fix: Introduce ParseStrategy<S: MarkingScheme> producing S::Canonical. The CAPCO parse strategy is unchanged. A grammar-tagged candidate routes to the matching parse strategy.


T1-7: No cross-grammar translation surface exists anywhere in the type system

This is the blocking gap for CUI. The entire codebase contains no:

  • Translate<A, B> trait or analogous structure
  • Grammar coherence concept (CoherenceRule<A, B>)
  • Grammar-pair registry (no way to express "this engine knows ISM and CUI are coherent")
  • CoherenceContext or cross-grammar RuleContext
  • TranslationProposal or cross-grammar FixIntent
  • MultiGrammarEngine or equivalent

This means when CUI lands, CUI ↔ ISM translation logic will either be:

  1. Hardcoded into the CAPCO rule set (coupling in the wrong direction), or
  2. Implemented as an ad-hoc CLI post-processing pass outside the engine, or
  3. Blocked entirely until this surface is designed

Proposed fix: Introduce the translation surface in marque-scheme as described in the Multi-Grammar Concurrency section above. The CUI crate implements Translate<CapcoScheme, CuiScheme> and Translate<CuiScheme, CapcoScheme>. The engine registers these translators and runs coherence checks after single-grammar evaluation.



T1-8: No InputAdapter<S> protocol — structured and schema-typed documents treated as raw text

Files: crates/engine/src/engine.rs, crates/extract/src/extractor.rs

Problem: Every document enters the engine through a single text-pipeline path (scanner → parser → rules). Three input scenarios that a second grammar immediately needs are unsupported:

  1. Structured field — a marking value already extracted from a form field or API payload; scanner + disambiguation are unnecessary, and treating the value as raw text lowers confidence calibration
  2. Schema document — an ISM XML or JSON envelope where markings live in typed attributes, not inline text; the scanner cannot locate them and no adapter protocol exists for schema-aware extraction
  3. Hybrid — a schema envelope (ISM XML metadata) wrapping a binary or text payload that itself contains inline markings; requires two co-operating recognition passes

Without this, an ISM XML → DoD schema translation pipeline (a concrete near-term need) must be built entirely outside the engine. Any grammar that operates on typed structured documents (healthcare billing: X12 EDI, HL7 FHIR payloads) has no entry point at all.

Why it is a Tier 1 blocker (not Tier 2): A grammar whose primary input form is a schema document — not free text — cannot reach the engine's rule evaluation at all, regardless of how well Rule::check is generified.

Proposed fix: Introduce InputAdapter<S> in marque-scheme as a pluggable first stage:

pub trait InputAdapter<S: MarkingScheme>: Send + Sync {
    fn label(&self) -> &'static str;
    fn adapt<'a>(&self, src: &'a [u8], ctx: &InputContext<'a>) -> Result<AdaptedInput<S>, AdapterError>;
}

pub enum AdaptedInput<S: MarkingScheme> {
    Text(TextInput),                        // existing pipeline, unchanged
    Structured(StructuredDocument<S>),      // schema-extracted markings + repair intents
    Hybrid { text: TextInput, schema: StructuredDocument<S> },
}

StructuredDocument<S> carries pre-extracted S::Canonical per layer (metadata headers, payload) plus RepairKind (attribute mutation, not text span replacement), so schema-to-schema corrections are type-safe and don't require round-tripping through byte spans.

Connection to related work:


Tier 2 — Structural Friction (second grammar can compile, but requires structural changes)

T2-1: LintResult / FixResult hardcode CapcoScheme

File: crates/engine/src/output.rs ~lines 47, 172–175

pub struct LintResult {
    pub diagnostics: Vec<Diagnostic<CapcoScheme>>,
}
pub struct FixResult {
    pub applied: Vec<AppliedFix<CapcoScheme>>,
    pub remaining_diagnostics: Vec<Diagnostic<CapcoScheme>>,
}

In multi-grammar output, LintResult must carry diagnostics from multiple grammar types. The type parameter must either be generic (LintResult<S>) or the diagnostic representation must be grammar-erased. Grammar-erased is probably correct for the multi-grammar case: Diagnostic<dyn MarkingScheme> or a tagged union.
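One possible reading of "grammar-erased" is sketched below: typed diagnostics are flattened into a struct that records the originating grammar as a string tag. This is only one option among those mentioned above. `GRAMMAR_ID`, `ErasedDiagnostic`, and the `CUI-001` rule ID are assumptions, and the real `Diagnostic<S>` carries far more than a rule ID and a message.

```rust
use std::marker::PhantomData;

/// Stand-in trait: each grammar exposes a stable identifier.
pub trait MarkingScheme {
    const GRAMMAR_ID: &'static str;
}

/// Simplified typed diagnostic, as a single-grammar rule would emit it.
pub struct Diagnostic<S: MarkingScheme> {
    pub rule_id: &'static str,
    pub message: String,
    scheme: PhantomData<S>,
}

impl<S: MarkingScheme> Diagnostic<S> {
    pub fn new(rule_id: &'static str, message: &str) -> Self {
        Diagnostic { rule_id, message: message.to_owned(), scheme: PhantomData }
    }
}

/// Grammar-erased form: the type parameter becomes a runtime tag.
#[derive(Debug, PartialEq)]
pub struct ErasedDiagnostic {
    pub grammar_id: &'static str,
    pub rule_id: &'static str,
    pub message: String,
}

impl<S: MarkingScheme> From<Diagnostic<S>> for ErasedDiagnostic {
    fn from(d: Diagnostic<S>) -> Self {
        ErasedDiagnostic { grammar_id: S::GRAMMAR_ID, rule_id: d.rule_id, message: d.message }
    }
}

/// Output type that can hold diagnostics from any number of grammars.
#[derive(Default)]
pub struct LintResult {
    pub diagnostics: Vec<ErasedDiagnostic>,
}

impl LintResult {
    pub fn push<S: MarkingScheme>(&mut self, d: Diagnostic<S>) {
        self.diagnostics.push(d.into());
    }
}

pub struct CapcoScheme;
pub struct CuiScheme;

impl MarkingScheme for CapcoScheme {
    const GRAMMAR_ID: &'static str = "capco";
}
impl MarkingScheme for CuiScheme {
    const GRAMMAR_ID: &'static str = "cui";
}
```

The trade-off is that consumers lose static typing on grammar-specific payloads; a tagged enum with one variant per grammar would keep it at the cost of a closed set.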


T2-2: Sink trait methods not generic; no grammar tag on diagnostics

File: crates/engine/src/pipeline.rs ~lines 43–44

In multi-grammar mode, a Sink must be able to receive diagnostics from multiple grammars and distinguish them. Two options: make the trait generic (Sink<S>) and carry a grammar tag, or add a method such as accept_diagnostic_for_grammar(grammar_id: &str, diag: ...).


T2-3: MessageTemplate is a closed enum with CAPCO-only concepts

File: crates/rules/src/message.rs

All 15 variants are CAPCO-specific, and the enum is not marked #[non_exhaustive]. A CUI rule cannot add MessageTemplate::RequiredCuiCategoryMissing without bumping MARQUE_AUDIT_SCHEMA.

Proposed fix: Mark #[non_exhaustive]. Introduce a per-grammar message sub-type (CapcoMessageTemplate, CuiMessageTemplate) with MessageTemplate::Grammar { grammar_id: &'static str, variant: u32 } as the generic escape.
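The escape variant could be wired up along these lines. Only two of the existing variants are shown (the names are taken from the table later in this issue), and `CuiMessageTemplate` with its discriminant value is a hypothetical per-grammar sub-type.

```rust
/// Generic template enum: open to future variants, with a per-grammar escape.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum MessageTemplate {
    BannerRollupMismatch,
    UnpublishedSciControl,
    /// Opaque reference to a template owned by a grammar crate.
    Grammar { grammar_id: &'static str, variant: u32 },
}

/// Hypothetical CUI-owned template set; discriminants are the wire values.
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CuiMessageTemplate {
    RequiredCuiCategoryMissing = 1,
}

impl From<CuiMessageTemplate> for MessageTemplate {
    fn from(t: CuiMessageTemplate) -> Self {
        MessageTemplate::Grammar { grammar_id: "cui", variant: t as u32 }
    }
}
```

Note that `#[non_exhaustive]` only constrains matches in downstream crates; inside the defining crate the enum can still be matched exhaustively.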


T2-4: FeatureId is a closed enum with CAPCO decoder concepts

File: crates/rules/src/confidence.rs ~lines 234–270

FeatureId::StrictContextClassification, SupersededToken, etc. are CAPCO-specific, and the enum is not marked #[non_exhaustive].

Proposed fix: Mark #[non_exhaustive]. Add FeatureId::Grammar { grammar_id: &'static str, variant: u32 } for per-grammar decoder features.


T2-5: default_ruleset() / default_scheme() in marque-engine expose CAPCO types

File: crates/engine/src/lib.rs ~lines 119–130

These are public API of marque-engine, not marque-capco. In a multi-grammar world, "default" is a deployment-configuration concern, not an engine concern.

Proposed fix: Move to marque-capco as capco_default_ruleset() and capco_default_scheme(). marque-engine exports no grammar defaults. A MultiGrammarEngine builder accepts grammar registrations explicitly.


T2-6: DECODER_CITATION_TYPED uses marque_rules::capco(...) in engine core

File: crates/engine/src/engine.rs ~lines 122–123

The R001 decoder diagnostic always carries CAPCO-2016 §A.6 p15. In a multi-grammar engine, the decoder runs per grammar and each grammar provides its own recognition citation.

Proposed fix: scheme.decoder_citation() — provided by each grammar's scheme implementation.


Tier 3 — API Naming Coupling in marque-scheme

These are cases where the names of types, variants, and methods in the generic infrastructure encode CAPCO concepts. In a multi-grammar context, this friction compounds: both ISM and CUI grammar implementors share the same trait surface, and CAPCO-named methods cause active confusion (a CUI rule author sees render_portion() and render_banner() where they need render_item() and render_summary()).

T3-1: Zone::Cab — Classification Authority Block in a generic enum

File: crates/scheme/src/recognizer.rs ~line 56

A CUI document has no CAB. Neither does a healthcare claim.

Proposed fix: Add Zone::Custom(&'static str) for scheme-specific zones, or make Zone an associated type on MarkingScheme. Short-term: rename Cab to StructuralBlock.


T3-2: ParseContext::classification_floor — name and encoding pinned to ISM

File: crates/scheme/src/recognizer.rs ~lines 95–110

The field name and docs map u8 values to ISM classification levels.

Proposed fix: Rename to rank_floor with grammar-neutral semantics: "minimum rank (scheme-defined ordering) already established in this recognition context."


T3-3: OwnerProducerKind::Nato and ::Fgi — IC ontology without Custom escape

File: crates/scheme/src/vocabulary.rs ~lines 66–79

CMS and AMA don't fit Nato or Fgi. NARA doesn't either.

Proposed fix: Add OwnerProducerKind::Custom(&'static str), mark #[non_exhaustive]. Rename Nato to InternationalBody and Fgi to ForeignGovernment, keeping the CAPCO-specific usage as documented examples.


T3-4: FormSet fields — "portion" and "banner" in the generic vocabulary API

File: crates/scheme/src/vocabulary.rs ~lines 215–226

ICD-10 has full_description and short_description. CUI has a short title and a long marking string. Neither maps to "portion."

Proposed fix: Rename to short_form, long_form, abbreviated_form.


T3-5: FormKind::IsmDescriptionTitle — "ISM" in a public generic enum

File: crates/scheme/src/vocabulary.rs ~line 239

Proposed fix: Rename to StandardDescriptionTitle.


T3-6: Vocabulary::is_fdr_dissem() — US IC Foreign Disclosure and Release on the generic trait

File: crates/scheme/src/vocabulary.rs ~lines 329–385

FD&R is a US IC-specific policy concept. A CUI grammar or healthcare grammar must provide a stub that always returns false. That stub is dead code that cannot be removed without a trait break.

Proposed fix: Move to an IcMarkingVocabulary<S> sub-trait. CapcoScheme implements both; other grammars implement only Vocabulary<S>.
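The sub-trait split can be sketched as follows. The traits are shown without the `<S>` parameter, the `contains` method and the `is_fdr_dissem(token)` signature are assumptions, and the token sets are illustrative rather than complete.

```rust
/// Grammar-neutral vocabulary surface (reduced to one method here).
pub trait Vocabulary {
    fn contains(&self, token: &str) -> bool;
}

/// IC-only extension: FD&R lives here instead of on the generic trait.
pub trait IcMarkingVocabulary: Vocabulary {
    fn is_fdr_dissem(&self, token: &str) -> bool;
}

pub struct CapcoVocabulary;

impl Vocabulary for CapcoVocabulary {
    fn contains(&self, token: &str) -> bool {
        matches!(token, "NOFORN" | "REL TO" | "ORCON")
    }
}

impl IcMarkingVocabulary for CapcoVocabulary {
    fn is_fdr_dissem(&self, token: &str) -> bool {
        matches!(token, "NOFORN" | "REL TO")
    }
}

/// A CUI vocabulary implements only the generic trait: no stub required.
pub struct CuiVocabulary;

impl Vocabulary for CuiVocabulary {
    fn contains(&self, token: &str) -> bool {
        matches!(token, "SP-ITAR")
    }
}

/// Code that needs FD&R asks for the sub-trait explicitly.
pub fn fdr_tokens<'a, V: IcMarkingVocabulary>(vocab: &V, tokens: &[&'a str]) -> Vec<&'a str> {
    tokens.iter().copied().filter(|t| vocab.is_fdr_dissem(t)).collect()
}
```

Calling `fdr_tokens(&CuiVocabulary, ...)` fails to compile, which turns the former always-false stub into a type error at the call site.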


T3-7: EmissionForm::Portion, BannerTitle, BannerAbbreviation

File: crates/scheme/src/render_context.rs ~lines 133–151

Proposed fix: Rename to ShortForm, LongForm, AbbreviatedForm.


T3-8: MarkingScheme::render_portion(), render_banner(), project_banner()

File: crates/scheme/src/scheme.rs ~lines 402–405

In a multi-grammar engine, both ISM and CUI grammars implement this trait. A CUI grammar implementor calling self.render_banner() to render a CUI header string is actively confusing.

Proposed fix: Rename to render_item(), render_summary(), project_summary(). The project_banner back-compat shim is deprecated.


Tier 4 — Entry-Point and Config Wiring

These are isolated changes confined to binary/config crates. Straightforward once the tier-1 and tier-2 work is done.

T4-1: Config.capco: CapcoConfig on the shared config struct

File: crates/config/src/lib.rs ~line 189

A billing grammar consumer or a NARA CUI consumer must supply an ISM schema version string they don't understand. validate_schema_version hard-fails against marque_ism::generated::values::SCHEMA_VERSION.

Proposed fix: Replace the field with a generic Config.grammar_schema. Each registered grammar supplies its own schema-version validation via scheme.validate_schema_version(version: &str) -> Result<()>. The TOML layout becomes a single [grammar] section or per-grammar [grammars.capco] / [grammars.cui] sections, and the error message becomes grammar-specific.
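A sketch of grammar-supplied validation, assuming the per-grammar `[grammars.<id>]` layout is deserialized into a map from grammar ID to version string. The `GrammarSchema` trait, the `Config` shape, and the expected version string are illustrative assumptions; TOML parsing is left out entirely.

```rust
use std::collections::BTreeMap;

/// Hypothetical hook implemented by each registered grammar.
pub trait GrammarSchema {
    fn grammar_id(&self) -> &'static str;
    fn validate_schema_version(&self, version: &str) -> Result<(), String>;
}

/// Simplified shared config: grammar ID -> configured schema version.
pub struct Config {
    pub grammars: BTreeMap<String, String>,
}

pub struct CapcoGrammar;

impl GrammarSchema for CapcoGrammar {
    fn grammar_id(&self) -> &'static str {
        "capco"
    }
    fn validate_schema_version(&self, version: &str) -> Result<(), String> {
        // Illustrative constant; the real value is generated from the ISM schema.
        const EXPECTED: &str = "ISM-v2022-DEC";
        if version == EXPECTED {
            Ok(())
        } else {
            Err(format!("capco: schema version {version:?} does not match {EXPECTED:?}"))
        }
    }
}

/// Validate only the grammars that are actually registered, so a CUI-only
/// deployment is never asked for an ISM version string.
pub fn validate_config(config: &Config, registered: &[&dyn GrammarSchema]) -> Result<(), String> {
    for g in registered {
        let version = config
            .grammars
            .get(g.grammar_id())
            .ok_or_else(|| format!("[grammars.{}] section is missing", g.grammar_id()))?;
        g.validate_schema_version(version)?;
    }
    Ok(())
}
```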


T4-2: CLI and server entry points hardcode capco_rules()

Files: marque/src/main.rs, crates/server/src/main.rs

No mechanism to substitute a different grammar or add a second grammar at build or runtime.

Proposed fix: Grammar registration at startup via build features or a pluggable grammar registry. Short-term: capco_engine(config) -> Engine helper in marque-capco. Longer-term: a grammar registry that can hold multiple grammars and their translators, driven by config.


T4-3: WASM compute_banner / generate_cab exports are CAPCO domain operations

File: crates/wasm/src/lib.rs

CAPCO-specific domain functions (compute_banner, generate_cab with EO 13526 defaults) should not live in the shared marque-wasm target.

Proposed fix: Feature-gate or split. marque-wasm exports grammar-neutral functions only. CAPCO-domain exports move to a marque-capco-wasm build target.


T4-4: Server /v1/health returns marque_capco::SCHEMA_VERSION

File: crates/server/src/lib.rs ~line 677

A CUI or healthcare client receives an ISM schema version identifier.

Proposed fix: engine.grammar_schema_version() returning the registered grammar's version. In multi-grammar mode, schema_versions: { "capco": "ISM-v2022-DEC", "cui": "2024.1" }.


T4-5: Citation type: AuthoritativeSource::Capco2016 is the only real source

File: crates/rules/src/citation.rs

SectionLetter (A–H) is explicitly restricted to the CAPCO-2016 normative range. The capco(), capco_section(), and capco_table() helpers are exported from marque_rules::*. By contrast, a CUI rule cites the NARA CUI Marking Handbook, and a healthcare rule cites the CMS Claims Processing Manual, Chapter 12.

Proposed fix: Mark AuthoritativeSource #[non_exhaustive]; add AuthoritativeSource::Custom { name: &'static str }. Move CAPCO-specific helpers to marque_capco::citation. marque_rules exports only Citation::config() and Citation::engine_internal().
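A small sketch of the opened-up source enum is shown below. The `Citation` struct here, with its free-form `locator` string and `custom` constructor, is a simplification invented for the example and does not reflect the real citation type's fields.

```rust
use std::fmt;

/// Source enum opened to non-CAPCO authorities.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AuthoritativeSource {
    Capco2016,
    Custom { name: &'static str },
}

/// Simplified citation: a source plus a free-form locator.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Citation {
    pub source: AuthoritativeSource,
    pub locator: String,
}

impl Citation {
    /// Hypothetical constructor a non-CAPCO grammar crate would use.
    pub fn custom(name: &'static str, locator: &str) -> Self {
        Citation {
            source: AuthoritativeSource::Custom { name },
            locator: locator.to_owned(),
        }
    }
}

impl fmt::Display for Citation {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match &self.source {
            AuthoritativeSource::Capco2016 => "CAPCO-2016",
            AuthoritativeSource::Custom { name } => *name,
        };
        write!(f, "{} {}", name, self.locator)
    }
}
```

A healthcare rule would then build its citation with `Citation::custom("CMS Claims Processing Manual", "Chapter 12")` without touching any CAPCO helper.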


T4-6: Render functions type-specialized to CapcoScheme

Files: marque/src/render.rs, crates/wasm/src/lib.rs

All render functions take Diagnostic<CapcoScheme>. Grammar-specific FactRef / OpenVocabRef matching must be grammar-specific anyway; the generic frame can be generified.

Proposed fix: Generic render frame (pub fn<S: MarkingScheme + RenderScheme>(...)). Grammar-specific rendering (CapcoOpenVocabRef matching) moves to marque-capco's render module.


Healthcare Billing Code Hypothetical

A HealthcareScheme validating CMS billing claims with two sub-grammars: StandardBillingScheme (ICD-10 / CPT / DRG) and InsurerScheme (insurer-specific procedure codes and modifiers that map onto CPT codes with insurer-specific equivalences). This mirrors the ISM ↔ CUI scenario exactly: two grammars with coherent overlapping concepts that require cross-grammar translation.

What would work with no changes ✅

| Marque abstraction | Healthcare use |
|--------------------|----------------|
| FuzzyVocabMatcher | CPT code fuzzy correction ("99214" vs "99241"), fully reusable |
| Confidence { recognition, rule } | Per-code confidence scoring |
| FixIntent<S> { FactAdd, FactRemove } | Adding/removing procedure codes or modifiers |
| AggregationOp::Union | Rolling up all diagnosis codes on an encounter |
| Constraint::Conflicts | CCI bundling edits (CPT A cannot be billed with CPT B same date) |
| Constraint::Requires | Medical necessity (CPT requires qualifying ICD-10 diagnosis) |
| ClosureRule | "If modifier 25 present, add E&M code requirement" |
| lattice::JoinSemilattice | Service-line → claim rollup |
| Rule ID format (B###, CCI-###) | Fully opaque strings, no code changes needed |

What would fail or be confusing ❌

| Coupling point | Healthcare consequence |
|----------------|------------------------|
| Rule::check(attrs: &CanonicalAttrs) | Cannot implement: CanonicalAttrs has no claim_lines or procedure_codes fields |
| RuleContext::page_portions: Arc<Box<[CanonicalAttrs]>> | Always None: wrong type, dead weight |
| Zone::Cab | No Classification Authority Block on a claim |
| MessageTemplate::BannerRollupMismatch | No banner rollup in billing |
| MessageTemplate::UnpublishedSciControl | No SCI controls in healthcare |
| Vocabulary::is_fdr_dissem() | Must return false always; dead code |
| OwnerProducerKind::Nato | CMS/AMA forced into Organization |
| FormSet::portion | "Portion mark" for CPT 99214 is meaningless |
| EmissionForm::BannerTitle | No banner in a claim |
| render_portion() / render_banner() | Wrong naming for rendering a service line vs. claim summary |
| Config.capco: CapcoConfig | Must supply an ISM schema version string at startup |
| validate_schema_version | Hard-fails unless ISM schema version matches compiled constant |
| compute_banner WASM export | Dead export |
| Server schema_version: marque_capco::SCHEMA_VERSION | Healthcare clients receive ISM version on health poll |
| Citation::capco(...) in marque_rules::* | CAPCO citation constructors appear in IDE for healthcare rule authors |

What the multi-grammar surface enables (the translation scenario)

A hospital billing system uses internal procedure codes (HOSP-0042-LAPCHOL = laparoscopic cholecystectomy). A payer adjudicates on CPT codes (47562). Without translation infrastructure, validating a claim requires custom glue code per hospital-payer pair. With Marque's translation surface:

HospitalScheme::Canonical → Translate<HospitalScheme, CptScheme> → CptScheme::Canonical
    ↓ CoherenceRule detects: HOSP-0042-LAPCHOL maps to CPT 47562
    ↓ ClaimLineType::Procedure matches on both sides
    ↓ InsurerScheme::validate(cpt_canonical) → CoherenceDiagnostic with TranslationProposal

The insurer's rules run against CptScheme::Canonical; the hospital's internal rules run against HospitalScheme::Canonical; cross-grammar coherence rules validate the translation boundary. This is exactly the structure CUI ↔ ISM requires: CUI rules run against CuiScheme::Canonical, ISM rules run against CapcoScheme::Canonical, coherence rules validate that the CUI sensitivity designation is consistent with the ISM classification level.


Severity / Effort Matrix

| Tier | Count | Effort | Can second grammar land without fixing? | CUI blocker? |
|------|-------|--------|------------------------------------------|--------------|
| T1-1 to T1-6: Architecture blockers | 6 | High (trait redesign) | ❌ No | ✅ Some |
| T1-7: No translation surface | 1 | High (new trait design) | ⚠️ Grammar registers, no translation | ❌ Yes: CUI blocks here |
| T2: Structural friction | 6 | Medium (generification) | ❌ No (compile errors) | ✅ Yes |
| T3: Naming coupling | 8 | Low (rename + #[non_exhaustive]) | ⚠️ Yes, but confusing | No |
| T4: Entry-point/config | 6 | Low–Medium (isolated) | ⚠️ Yes, with manual wiring | Partial |

T1-7 is highlighted because it is the CUI near-term blocker. The CUI crate placeholder (crates/cui/) exists in the workspace but has no Cargo.toml and no source precisely because the translation surface it needs doesn't exist. T1-7 must be designed before CUI implementation begins; the other T1 items and T2 items must land before CUI can wire into the engine.


Notes on marque-scheme Structural Soundness

The lattice math, MarkingScheme trait core, Constraint, Category, ClosureRule, Codec, Span are cleanly abstract. marque-scheme does not depend on marque_capco or marque_ism at the import level — this is correct.

The coupling in marque-scheme is naming/conceptual (T3-1 through T3-8), not dependency-graph. However, in a multi-grammar context this matters more: both ISM and CUI grammar implementors share the same MarkingScheme trait surface. A CUI rule author sees render_portion() and render_banner() alongside Zone::Cab and ParseContext::classification_floor in their IDE. The confusion is proportional to the number of grammars implementing the trait. T3 fixes become higher-priority as the grammar count grows.

The translation surface (T1-7) belongs in marque-scheme. It is the correct home for Translate<A, B>, CoherenceRule<A, B>, CoherenceContext, and CoherenceDiagnostic because these are grammar-neutral infrastructure types. The concrete implementations (impl Translate<CapcoScheme, CuiScheme>) live in the grammar crates or a dedicated interop crate (marque-capco-cui-interop).


See Also


Labels: engine (Engine pipeline, scanner/parser, RuleContext/Severity infrastructure, and cross-domain core surface), enhancement (New feature or request), post-refactor (Thing that can wait until after the current big refactor)
