Skip to content

feat(docs): per-token authoritative help text (hover-on-token, API consumable) #255

@bashandbone

Description

@bashandbone

Background

Operators frequently encounter a token they don't recognize — EXDIS, NODIS, ORCON, a SCI control system they don't normally work with — and have to either know the rule from memory or open CAPCO-2016 and search. Editor integrations should be able to surface the authoritative help text inline (hover, tooltip, palette docs).

Phase 5's Vocabulary<S> already exposes the structural metadata (authority, owner/producer, URN, deprecation status). What's missing is the prose: the explanation a human reader actually needs.

Proposed shape

Extend the per-token metadata generated at build time to include extracted help text from crates/capco/docs/CAPCO-2016.md (the authoritative source per Constitution Principle VIII). Each token (and each category, e.g., the SCI control system class as a whole) gets:

  • Short description (one line, ~80 chars) — what it is.
  • Long description (paragraph) — when to use it, constraints, common errors.
  • Citation (§X.Y pNN) — the exact CAPCO-2016 location it was extracted from.
  • Examples (where CAPCO-2016 provides them) — verbatim portion / banner forms.

Build-time extraction:

  • crates/capco/build.rs (or a new tools/capco-doc-extract/) parses CAPCO-2016 sections by anchored heading and pulls the prose adjacent to each token mention.
  • Manual override table (crates/capco/data/help_overrides.toml) for cases where the prose extraction can't cleanly bound a description (CAPCO-2016 isn't always structured as one-token-per-section).
  • Generated OUT_DIR/help_text.rs with a &'static [(TokenId, HelpEntry)] table.
  • Vocabulary<S> gains a help(token: TokenId) -> Option<&HelpEntry> method.

API surface

Build on the previous issue's vocabulary API:

GET /v1/vocab/trigraph/:code         → name + help text
GET /v1/vocab/tetragraph/:code       → members + help text
GET /v1/vocab/sci/:control           → help text + authority
GET /v1/vocab/dissem/:control        → help text + replacement (if deprecated)

WASM consumers expose the same data locally for hover-on-token.

Constitution Principle VIII compliance

This is the one feature where citation discipline matters most — every help text entry is a quotation (paraphrase) of CAPCO-2016. The extraction pipeline:

  1. Build-time verification: every help entry MUST carry a citation that resolves to a real anchor in crates/capco/docs/CAPCO-2016.md. Build fails if a citation doesn't resolve.
  2. No paraphrase drift: the long-description prose should be near-verbatim from CAPCO-2016 (within fair-use bounds for the single-line short description). A future LLM-generated rewrite isn't allowed without explicit human verification per Principle VIII.
  3. Source revisions: when CAPCO-2016 is succeeded (a future revision), the help-text table is regenerated as part of the planned migration, not silently refreshed.
  4. Override audit: the help_overrides.toml file is human-curated; every override must include a justification in a comment naming the section of CAPCO-2016 it derives from.

Acceptance

  • Build-time extraction of help text from crates/capco/docs/CAPCO-2016.md.
  • Per-token short + long description + citation + examples.
  • Manual override table for cases extraction can't handle cleanly.
  • Build fails on unresolvable citations.
  • Vocabulary<S>::help API method.
  • WASM-safe (no runtime I/O).
  • API endpoints in marque-server (composes with #issue for vocabulary lookup API).
  • Coverage target: ≥90% of tokens in marque-capco::vocab have a non-empty help entry. Tokens without one fall through to the structural metadata only.

Open questions

  • Long descriptions can be multi-paragraph; how do we keep the binary size reasonable? Consider lazy-loading via include_str! of pre-extracted snippets, vs constants. Probably constants — &'static str slices into a single embedded blob.
  • Should help entries cite line numbers or section anchors? Memory says "citations use page numbers, not line numbers" (commit b340bec). Use §X.Y pNN.
  • Out-of-scope: localization. CAPCO-2016 is English; help text is English. Future non-US schemes (NATO, etc.) will have their own authoritative sources in their original languages.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    capco/ismCAPCO classification markings, ODNI ISM schema, and the marque-capco / marque-ism rule chaindocumentationImprovements or additions to documentationenhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions