Conversation

@flyingrobots
Owner

No description provided.

This commit addresses 8 specific issues raised during review of the initial ADR-0004 documentation. The changes improve clarity, consistency, and technical precision across the specification.

Fixes include:
- **FEATURES.md**:
  - Rewrote acceptance criteria for F9-US-DEV to be specific and verifiable.
  - Removed the superseded F6 feature to avoid confusion with F9.

- **SPEC.md**:
  - Corrected Mermaid diagram type casing ( -> ) to align with JSON standards.
  - Clarified that private blob decryption during pointer resolution is conditional, not assumed.

- **TECH-SPEC.md**:
  - Removed redundant participant aliases in the projection sequence diagram.
  - Replaced the unconventional  endpoint with a more RESTful  URL.
  - Improved diagram precision by changing 'node' to 'field path'.
  - Specified JWS with a detached payload as the required authentication mechanism for pointer resolution.
Updated the acceptance criteria for F9-US-SEC in FEATURES.md to explicitly mention the  and  commit trailers, improving clarity for developers.
…ional HTTP Message Signatures profile; tighten determinism (RFC 8785 JCS + key ordering); clarify namespaces and error taxonomy; add pointer rotation trailer
schema(privacy): add ciphertext_digest + extensions; bump to draft 2020-12
schema(policy): add optional privacy.classes + rules
examples: add privacy_min and updated opaque_pointer example
scripts: validate privacy schema/examples in CI
make: fix tab indentation for lint/fix targets
chore: add gatos-privacy crate with OpaquePointer type and notes
…onse digest headers reflect response body
schema(privacy): allow ciphertext-only pointers via anyOf; keep kind/algo/location/capability required
docs(ADR-0004): fix colon in class diagram
docs(ADR-0003): PoC storage level to MUST
…add trailer/anchors/notes; schemas + examples; regenerate diagrams

- ADR-0005: comprehensive Shiplog spec
  - Canonicalization: RFC 8785 JCS; Content-Id = blake3(JCS(envelope))
  - Namespaces: refs/gatos/shiplog/<topic>/head; add shiplog-compat (refs/_shiplog)
  - Commit body: headers + '---' + JSON trailer (v1)
  - Anchors & Notes: normative sections (+ refs)
  - Error taxonomy aligned with ledger-kernel (AppendRejected, TemporalOrder, PolicyFail, SigInvalid, DigestMismatch)
  - Mermaid diagrams and CLI examples

- Schemas (+ examples):
  - shiplog/event_envelope, consumer_checkpoint
  - shiplog/deployment_trailer (v1), shiplog/anchor
  - privacy/opaque_pointer examples updated

- SPEC Section 8 and TECH-SPEC Section 7 updated to match ADR-0005
  - Include compat ingestion (jq -S -> JCS), anchors/notes, importer outline

- FEATURES: add F6 — Shiplog Event Stream

- CI/scripts: validate new schemas/examples in scripts/validate_schemas.sh

- Diagrams: regenerate docs/diagrams/generated (no content changes besides new/updated diagrams)

Note: private comparison matrix is in .obsidian/git-tech-matrix.md (gitignored by design).
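The canonicalization rule above (Content-Id = blake3(JCS(envelope))) can be sketched in a few lines. This is illustrative only: `json.dumps` with sorted keys and compact separators approximates RFC 8785 JCS for envelopes limited to ASCII keys, strings, and integers, and `hashlib.sha256` stands in for blake3, which is not in the Python standard library.

```python
import hashlib
import json

def canonicalize(envelope: dict) -> bytes:
    # Approximates RFC 8785 JCS for envelopes limited to ASCII keys,
    # strings, and integers: lexicographic key order, no whitespace.
    return json.dumps(envelope, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def content_id(envelope: dict) -> str:
    # Spec mandates blake3; sha256 stands in since blake3 is not stdlib.
    return "sha256:" + hashlib.sha256(canonicalize(envelope)).hexdigest()

# Key order must not affect the Content-Id.
a = {"ns": "deploy", "ulid": "01H...", "type": "release"}
b = {"type": "release", "ulid": "01H...", "ns": "deploy"}
assert canonicalize(a) == canonicalize(b)
assert content_id(a) == content_id(b)
```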
…chemas to Makefile/schema targets; AJV runner fallback in validator script; decisions index includes ADR-0005
…mples/ADR; add ns==Topic invariant; clean CLI samples; consolidate error taxonomy
…ace duplicated taxonomy with problem+json example
- Change sample ref path to demo/head and label subgraph as (sample).
- Make CLI read example copy-pasteable (valid JSON in-line; ellipsis outside JSON).
- Add CLI invariant: --topic MUST equal envelope.ns and ref segment.
- Note trailer repo_head is top-level only.
- Add numeric discipline MUST to canonicalization section.
- Fix schema refs description to use blake3:<64-hex>.
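The `blake3:<64-hex>` digest discipline in the last bullet can be enforced with a single anchored pattern check. A sketch, assuming lowercase hex is the only accepted form; the regex is illustrative, not the repo's actual schema constraint:

```python
import re

DIGEST_RE = re.compile(r"^blake3:[0-9a-f]{64}$")  # lowercase hex only

def is_valid_digest(value: str) -> bool:
    return DIGEST_RE.fullmatch(value) is not None

assert is_valid_digest("blake3:" + "a" * 64)
assert not is_valid_digest("blake3:" + "A" * 64)   # uppercase rejected
assert not is_valid_digest("sha256:" + "a" * 64)   # wrong algorithm prefix
assert not is_valid_digest("blake3:" + "a" * 63)   # wrong length
```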
…tree; include checker script and fixture; wire into Makefile
…pline; demote AES-256-GCM to non-normative interop profile
…nts, CLI flags, and lowercase digest examples; normalize schema paths
…ad non-duplication, Hashing Law verifier note, anchor signing key guidance, and AAD Components example
@coderabbitai

coderabbitai bot commented Nov 11, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Summary by CodeRabbit

  • New Features

    • Added hybrid privacy model with opaque pointers to protect sensitive data in public projections
    • Introduced Shiplog: append-only, parallel event stream with anchors and consumer checkpoints
    • Added privacy policy framework supporting configurable data redaction and encryption rules
    • Implemented automatic private blob extraction and pointer-based data separation
  • Documentation

    • Added ADR-0004 specification for hybrid privacy architecture
    • Added ADR-0005 specification for event streaming system
    • Expanded privacy use cases and technical specifications

Walkthrough

Adds a new gatos-privacy crate implementing opaque-pointer types and runtime validation, introduces a hybrid privacy model (ADR-0004) and Shiplog (ADR-0005), expands JSON schemas and examples for privacy and shiplog, updates schema validation tooling and Makefile targets, and adds validation/killcheck scripts and tests.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Workspace & Package**<br>`Cargo.toml`, `crates/gatos-privacy/Cargo.toml` | Registers `crates/gatos-privacy` in the workspace and adds the crate manifest with serde, serde_json, blake3, hex, anyhow (workspace = true) and thiserror. |
| **Privacy crate code & tests**<br>`crates/gatos-privacy/src/lib.rs`, `crates/gatos-privacy/tests/verified.rs`, `crates/gatos-privacy/tests/pointer_schema.rs`, `crates/gatos-privacy/README.md` | Adds the `OpaquePointer` type, `Kind`/`Algo` enums, `PointerError`, `validate()` logic enforcing digest/ciphertext rules (esp. low-entropy), a `VerifiedOpaquePointer` wrapper that validates on `Deserialize`, a `Deref` impl, and unit tests plus a crate README. |
| **JSON Schemas (privacy & shiplog)**<br>`schemas/v1/privacy/opaque_pointer.schema.json`, `schemas/v1/policy/governance_policy.schema.json`, `schemas/v1/shiplog/*.schema.json` | New schemas: opaque_pointer with conditional low-entropy rules, a governance-policy privacy block, and shiplog schemas (anchor, consumer_checkpoint, deployment_trailer, event_envelope). |
| **Examples**<br>`examples/v1/privacy/*`, `examples/v1/policy/privacy_min.json`, `examples/v1/shiplog/*` | Adds example opaque pointers (valid/invalid), a privacy policy example, and shiplog example payloads (anchors, checkpoints, events, trailer). |
| **Docs & ADRs**<br>`docs/FEATURES.md`, `docs/SPEC.md`, `docs/TECH-SPEC.md`, `docs/USE-CASES.md`, `docs/decisions/ADR-0003/DECISION.md`, `docs/decisions/ADR-0004/DECISION.md`, `docs/decisions/ADR-0005/DECISION.md`, `docs/decisions/README.md` | Introduces the Hybrid Privacy ADR (ADR-0004) and Shiplog ADR (ADR-0005); rewrites the privacy/opaque-pointer spec, projection/resolution flow, private storage layout, and governance PoC wording. |
| **Validation & CI tooling**<br>`Makefile`, `scripts/validate_schemas.sh` | Switches schema validation to an AJV_RUNNER abstraction (npx or Docker), updates Makefile targets to ajv-cli@5 compile/validate patterns, and adds schema-negative and kill-check targets. |
| **Killcheck & helper scripts**<br>`scripts/killcheck/schema_headers.sh`, `scripts/killcheck/error_casing.sh`, `scripts/killcheck/ulid_reference.sh`, `scripts/shiplog/check_topic_ns.js` | Adds scripts enforcing the docs header policy, error-code casing, the ULID external reference, and a small Node.js topic/namespace check. |
| **Misc tests & CI fixtures**<br>`examples/...`, `schemas/...` (new files) | Numerous added example fixtures and schema files exercising the new privacy and shiplog validation paths. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Echo as Echo (State Engine)
    participant Policy as Policy Service
    participant PrivateStore as PrivateStore (FS/S3)
    participant Ledger as Ledger (Git refs)

    rect rgba(220,245,220,0.25)
    Note over Echo,Ledger: Privacy Projection (ADR-0004)
    end

    Echo->>Policy: Fetch privacy rules
    Policy-->>Echo: Rules (pointerize/redact, class, capability)
    loop For each matched field
        Echo->>Echo: Extract private value
        Echo->>Echo: Encrypt (XChaCha20-Poly1305, nonce/AAD rules)
        Echo->>PrivateStore: Store ciphertext at digest location
        Echo->>Echo: Construct OpaquePointer (ciphertext_digest, location, capability, extensions)
        Echo->>Echo: Replace field with pointer
    end
    Echo->>Ledger: Commit PublicState + trailers
    Ledger-->>Echo: Commit oid / refs
```
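The pointerize loop in the diagram above can be sketched as follows. Everything here is a stand-in: the real AEAD step (XChaCha20-Poly1305) is replaced by a placeholder transform, `hashlib.sha256` stands in for blake3, and the store, URI scheme, and capability string are illustrative, not the project's actual API.

```python
import hashlib
import json

def pointerize(state: dict, field: str, store: dict) -> dict:
    """Replace one matched field with an opaque-pointer stub.

    Encryption (XChaCha20-Poly1305) is elided; the "ciphertext" is a
    placeholder, and sha256 stands in for blake3 (not in the stdlib).
    """
    plaintext = json.dumps(state[field]).encode("utf-8")
    ciphertext = plaintext[::-1]  # placeholder for real AEAD output
    digest = "sha256:" + hashlib.sha256(plaintext).hexdigest()
    ct_digest = "sha256:" + hashlib.sha256(ciphertext).hexdigest()
    store[ct_digest] = ciphertext              # store at digest location
    public = dict(state)
    public[field] = {                          # pointer replaces the value
        "kind": "opaque_pointer",
        "digest": digest,
        "ciphertext_digest": ct_digest,
        "location": f"file:///private/{ct_digest}",
        "capability": "cap:example",
    }
    return public

store: dict = {}
public = pointerize({"user": "alice", "ssn": "123-45-6789"}, "ssn", store)
assert public["ssn"]["kind"] == "opaque_pointer"
assert "123-45-6789" not in json.dumps(public)  # plaintext never leaks
```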
```mermaid
sequenceDiagram
    participant Client
    participant Gatosd as Gatosd (Resolver)
    participant Policy as Policy
    participant PrivateStore as PrivateStore

    rect rgba(245,235,220,0.25)
    Note over Client,Gatosd: Opaque Pointer Resolution (ADR-0004)
    end

    Client->>Gatosd: POST /gatos/private/blobs/resolve (JWT, pointer, who)
    Gatosd->>Gatosd: Verify JWT & auth
    alt auth fail
        Gatosd-->>Client: 401/403
    else auth ok
        Gatosd->>Policy: Validate capability
        Policy-->>Gatosd: capability ok
        Gatosd->>PrivateStore: Fetch ciphertext from pointer.location
        PrivateStore-->>Gatosd: Ciphertext
        Gatosd->>Gatosd: Decrypt, verify blake3(plaintext) == pointer.digest
        alt digest mismatch
            Gatosd-->>Client: 422 DigestMismatch
        else
            Gatosd-->>Client: 200 OK (plaintext)
        end
    end
```
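The digest-mismatch branch in the resolution flow reduces to one comparison after decryption. A minimal sketch: `hashlib.sha256` stands in for the mandated blake3, and the exception name simply mirrors the 422 DigestMismatch response in the diagram.

```python
import hashlib

class DigestMismatch(Exception):
    """Maps to the 422 DigestMismatch response in the diagram."""

def verify_plaintext(plaintext: bytes, pointer_digest: str) -> bytes:
    # Spec mandates blake3; sha256 stands in (blake3 is not stdlib).
    actual = "sha256:" + hashlib.sha256(plaintext).hexdigest()
    if actual != pointer_digest:
        raise DigestMismatch(f"expected {pointer_digest}, got {actual}")
    return plaintext

blob = b"secret"
good = "sha256:" + hashlib.sha256(blob).hexdigest()
assert verify_plaintext(blob, good) == blob
try:
    verify_plaintext(b"tampered", good)
except DigestMismatch:
    pass
else:
    raise AssertionError("tampering must be detected")
```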
```mermaid
sequenceDiagram
    participant Producer
    participant Shiplog as Shiplog Service
    participant Git as Git Storage
    participant Consumer

    rect rgba(220,230,255,0.25)
    Note over Producer,Consumer: Shiplog append & consume (ADR-0005)
    end

    Producer->>Shiplog: append_event(topic, type, payload)
    Shiplog->>Shiplog: Validate/produce ULID (monotonic per-topic)
    Shiplog->>Shiplog: Canonicalize (RFC 8785 JCS) -> blake3(content)
    Shiplog->>Git: Store envelope at refs/gatos/shiplog/{topic}/{ulid}
    Shiplog->>Git: Update trailer/anchor refs
    Shiplog-->>Producer: OK (ulid, commit_oid)

    Consumer->>Shiplog: shiplog.tail(topic)
    Shiplog-->>Consumer: Anchor (head, label, created_at)
    Consumer->>Shiplog: shiplog.read(topic, checkpoint)
    Shiplog->>Git: Fetch events after checkpoint
    Shiplog-->>Consumer: [EventEnvelope,...]
```
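The "Validate/produce ULID (monotonic per-topic)" step can be sketched like this. The tie-breaking strategy (increment the random component within the same millisecond) follows the ULID spec's monotonicity suggestion, but the class and its API are illustrative, not the service's actual code.

```python
import os
import time

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def _encode(value: int, length: int) -> str:
    # Fixed-width Crockford base32; alphabet is ASCII-ascending, so
    # numeric order implies lexicographic order.
    chars = []
    for _ in range(length):
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))

class MonotonicUlid:
    """Per-topic monotonic ULIDs: same-millisecond calls increment randomness."""

    def __init__(self) -> None:
        self.last_ms = -1
        self.last_rand = 0

    def next(self) -> str:
        ms = max(int(time.time() * 1000), self.last_ms)  # clamp clock skew
        if ms == self.last_ms:
            self.last_rand += 1           # stay strictly increasing
        else:
            self.last_ms = ms
            self.last_rand = int.from_bytes(os.urandom(10), "big")
        return _encode(ms, 10) + _encode(self.last_rand, 16)

gen = MonotonicUlid()
ids = [gen.next() for _ in range(5)]
assert ids == sorted(ids) and len(set(ids)) == 5  # lexicographic == temporal
assert all(len(u) == 26 for u in ids)
```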

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45–75 minutes

Areas needing extra scrutiny:

  • ADR-0004 projection determinism, nonce derivation for XChaCha20-Poly1305, and AAD semantics.
  • Parity between Rust OpaquePointer::validate() and JSON Schema conditional rules for low-entropy class.
  • VerifiedOpaquePointer Deserialize error mapping and public-surface invariants.
  • ADR-0005 ULID monotonic ordering guarantees and canonicalization (RFC 8785) consistency across implementations.
  • AJV_RUNNER Docker vs npx behavior and Makefile target correctness.
  • Killcheck regexes (avoid false positives/negatives) and scripts' strict bash options.
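On the last point, a killcheck for error-code casing might look like the sketch below. The canonical names come from the ADR-0005 taxonomy quoted earlier; the tokenization and flagging rules are assumptions for illustration, not the repo's actual `error_casing.sh` regexes.

```python
import re

# Taxonomy names from ADR-0005; flag lowercase/snake_case variants.
CANONICAL = {"AppendRejected", "TemporalOrder", "PolicyFail",
             "SigInvalid", "DigestMismatch"}

def casing_violations(doc: str) -> list:
    lowered = {name.lower(): name for name in CANONICAL}
    found = []
    for token in re.findall(r"[A-Za-z_]+", doc):
        canon = lowered.get(token.replace("_", "").lower())
        if canon and token != canon:
            found.append(token)
    return found

assert casing_violations("returns DigestMismatch on failure") == []
assert casing_violations("returns digest_mismatch on failure") == ["digest_mismatch"]
```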

Poem

A pointer born in secret shade,
wrapped in blake3, by rules well-laid.
ADRs decree the path it takes,
tests and schemas guard the stakes.
Shiplog hums while privacy wakes — 🎉

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, below the required threshold of 80.00%. | Run `@coderabbitai generate docstrings` to improve docstring coverage. |
| Description check | ❓ Inconclusive | No description provided; the title is sufficiently specific and the changeset is clearly related to documented ADR-0005 work, but a description would provide crucial context for reviewers. | Add a pull request description explaining the scope: ADR-0005 introduction, schema files, example payloads, and validation script updates. Include rationale for schema changes and cross-references to related ADRs (0003, 0004). |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately reflects the primary objective: ADR-0005 documentation for Shiplog with schema validation alignment and enforcement checks. |
✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment
    • Commit unit tests in branch adr-0005-prep

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…hema-compliant) — currently fails due to required digest field
…nvariants for low-entropy class; update tests

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 29

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
scripts/validate_schemas.sh (1)

74-96: Node-less hosts still crash the “docker fallback”

Lines 74‑96 call node -e … directly. On machines without Node.js (the whole reason we fall back to Docker), this exits with “node: command not found” before any AJV work happens, so CI and dev shells without Node remain broken. Please gate these helpers the same way you did AJV—e.g. set up a NODE_RUNNER that points to node locally or docker run … node:20 node otherwise—and use it here.

docs/SPEC.md (2)

683-690: Approval expiration field well‑designed; implicit inheritance semantics should be clarified.

The optional Expires-At field allows granular approval validity windows, with the constraint that approvals cannot exceed the proposal expiration. Clear rule.

Minor refinement: Explicitly document the implicit expiration behavior in section 20 validation rules:

An Approval's effective expiration is min(Approval.Expires-At, Proposal.Expire-At).
If Approval.Expires-At is absent, the approval expires at Proposal.Expire-At.

This ensures implementations converge on the same comparison logic.


700-708: Proof‑Of‑Consensus path and trailer validation well‑specified; canonicalization assumption should be explicit.

The PoC envelope storage path (refs/gatos/audit/proofs/governance/<proposal-id>) and trailer integrity check (blake3(envelope_bytes)) are clearly defined.

Minor clarity: Line 708 should explicitly state that envelope_bytes refers to the RFC 8785 JCS-canonicalized, UTF-8-encoded PoC envelope (to align with canonicalization rules at lines 364–366).

Apply this diff:

- PoC envelope MUST be stored canonically under `refs/gatos/audit/proofs/governance/<proposal-id>`; the Grant's `Proof-Of-Consensus` trailer MUST equal `blake3(envelope_bytes)`.
+ PoC envelope MUST be stored canonically under `refs/gatos/audit/proofs/governance/<proposal-id>` as RFC 8785 JCS-canonicalized UTF-8 bytes. The Grant's `Proof-Of-Consensus` trailer MUST equal `blake3(envelope_bytes)` where `envelope_bytes` is the canonical encoding.
♻️ Duplicate comments (1)
examples/v1/policy/privacy_min.json (1)

16-16: Duplicate placeholder URI issue (same as opaque_pointer_min.json).

This example reuses the same malformed 51-character ed25519 key from opaque_pointer_min.json line 7. Refer to that review comment for details.

This duplication suggests copy-paste without validation. Standardize placeholder values across examples.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ffa82b0 and a0b49cc.

⛔ Files ignored due to path filters (56)
  • Cargo.lock is excluded by !**/*.lock
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_10.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_11.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_12.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_13.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_14.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_15.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_16.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_17.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_18.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_19.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_2.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_20.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_21.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_3.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_4.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_5.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_6.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_7.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_8.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_SPEC__0679d036ea__mermaid_9.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_10.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_11.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_12.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_13.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_14.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_15.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_16.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_2.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_3.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_4.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_5.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_6.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_7.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_8.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_TECH-SPEC__15850d53f4__mermaid_9.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0001_DECISION__b1ba19490c__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0001_DECISION__b1ba19490c__mermaid_2.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0001_DECISION__b1ba19490c__mermaid_3.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0002_DECISION__a21cc79f92__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0002_DECISION__a21cc79f92__mermaid_2.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0003_DECISION__3c4445d569__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0003_DECISION__3c4445d569__mermaid_2.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0004_DECISION__12d7f53080__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0004_DECISION__12d7f53080__mermaid_2.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0004_DECISION__12d7f53080__mermaid_3.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0004_DECISION__12d7f53080__mermaid_4.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0005_DECISION__03ffbed862__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_decisions_ADR-0005_DECISION__03ffbed862__mermaid_2.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_diagrams_api_endpoints__578bd81e4d__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_diagrams_api_endpoints__578bd81e4d__mermaid_2.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_diagrams_architecture__105fc24d87__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_diagrams_data_flow__559f0c180d__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
  • docs/diagrams/generated/docs_diagrams_state_management__86fd1ffb65__mermaid_1.svg is excluded by !**/*.svg, !**/generated/**
📒 Files selected for processing (35)
  • Cargo.toml (1 hunks)
  • Makefile (2 hunks)
  • crates/gatos-privacy/Cargo.toml (1 hunks)
  • crates/gatos-privacy/src/lib.rs (1 hunks)
  • crates/gatos-privacy/tests/pointer_schema.rs (1 hunks)
  • docs/FEATURES.md (1 hunks)
  • docs/SPEC.md (6 hunks)
  • docs/TECH-SPEC.md (6 hunks)
  • docs/USE-CASES.md (1 hunks)
  • docs/decisions/ADR-0003/DECISION.md (1 hunks)
  • docs/decisions/ADR-0004/DECISION.md (1 hunks)
  • docs/decisions/ADR-0005/DECISION.md (1 hunks)
  • docs/decisions/README.md (1 hunks)
  • examples/v1/policy/privacy_min.json (1 hunks)
  • examples/v1/privacy/opaque_pointer_min.json (1 hunks)
  • examples/v1/privacy/pointer_low_entropy_invalid.json (1 hunks)
  • examples/v1/privacy/pointer_low_entropy_min.json (1 hunks)
  • examples/v1/shiplog/anchor_min.json (1 hunks)
  • examples/v1/shiplog/checkpoint_commit_only_invalid.json (1 hunks)
  • examples/v1/shiplog/checkpoint_min.json (1 hunks)
  • examples/v1/shiplog/checkpoint_ulid_only_invalid.json (1 hunks)
  • examples/v1/shiplog/event_min.json (1 hunks)
  • examples/v1/shiplog/event_mismatch_ns.json (1 hunks)
  • examples/v1/shiplog/trailer_min.json (1 hunks)
  • schemas/v1/policy/governance_policy.schema.json (1 hunks)
  • schemas/v1/privacy/opaque_pointer.schema.json (1 hunks)
  • schemas/v1/shiplog/anchor.schema.json (1 hunks)
  • schemas/v1/shiplog/consumer_checkpoint.schema.json (1 hunks)
  • schemas/v1/shiplog/deployment_trailer.schema.json (1 hunks)
  • schemas/v1/shiplog/event_envelope.schema.json (1 hunks)
  • scripts/killcheck/error_casing.sh (1 hunks)
  • scripts/killcheck/schema_headers.sh (1 hunks)
  • scripts/killcheck/ulid_reference.sh (1 hunks)
  • scripts/shiplog/check_topic_ns.js (1 hunks)
  • scripts/validate_schemas.sh (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
scripts/validate_schemas.sh (1)
crates/gatos-privacy/src/lib.rs (1)
  • validate (46-72)
🪛 LanguageTool
docs/decisions/ADR-0005/DECISION.md

[grammar] ~27-~27: The verb ‘append’ does not usually follow articles like ‘an’. Check that ‘append’ is spelled correctly; using ‘append’ as a noun may be non-standard.
Context: ...xt / Problem Many integrations require an append‑only stream rather than only snapshot s...

(A_INFINITIVE)


[uncategorized] ~104-~104: A comma may be missing after the conjunctive/linking adverb ‘Thus’.
Context: ...g commit, not a working‑directory file. Thus the object hash of the blob equals the ...

(SENT_START_CONJUNCTIVE_LINKING_ADVERB_COMMA)


[uncategorized] ~222-~222: Loose punctuation mark.
Context: ...y profile: - ref_root = refs/_shiplog; Journals: journal/<topic>; Anchors: `...

(UNLIKELY_OPENING_PUNCTUATION)

docs/decisions/ADR-0004/DECISION.md

[uncategorized] ~31-~31: Loose punctuation mark.
Context: ...blic and private data. * Sh_Public: The category of shapes containing only ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~32-~32: Loose punctuation mark.
Context: ... and opaque pointers. * Sh_Private: The category of shapes containing only ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~113-~113: Loose punctuation mark.
Context: ...ard-compatible } ``` * digest: The content-address of the private plai...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~114-~114: Loose punctuation mark.
Context: ...rivate worlds. * ciphertext_digest: The content-address of the stored ciphe...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~115-~115: Loose punctuation mark.
Context: ...intext digest publicly. * location: A URI indicating where to resolve the b...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~117-~117: Loose punctuation mark.
Context: ...the GATOS trust graph. * https://..., s3://..., ipfs://...: Standard dist...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~118-~118: Loose punctuation mark.
Context: ...d distributed storage. * file:///...: For local development and testing. * **...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~119-~119: Loose punctuation mark.
Context: ...elopment and testing. * capability: A reference identifying the authorizati...

(UNLIKELY_OPENING_PUNCTUATION)


[style] ~128-~128: Consider removing “of” to be more concise
Context: ... Opaque Pointer, the AEAD AAD MUST bind all of the following, in order, as UTF‑8 bytes: `...

(ALL_OF_THE)


[style] ~136-~136: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...ain separation (e.g., length‑prefixing) prior to supplying them as AAD to the AEAD algor...

(EN_WORDINESS_PREMIUM_PRIOR_TO)


[style] ~144-~144: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...acy Policy** (.gatos/policy.yaml). 3. It traverses the UnifiedState tree, appl...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[uncategorized] ~145-~145: Loose punctuation mark.
Context: ...torpointerizerules. *redact`: The field is removed from the public st...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~146-~146: Loose punctuation mark.
Context: ...rom the public state. * pointerize: The field's value is stored as a privat...

(UNLIKELY_OPENING_PUNCTUATION)


[style] ~151-~151: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...MUST be canonicalized with RFC 8785 JCS prior to hashing. * When non‑JSON maps are mater...

(EN_WORDINESS_PREMIUM_PRIOR_TO)

docs/FEATURES.md

[typographical] ~202-~202: Two consecutive dots
Context: ......** | Security/Compliance | | I want.. | to audit the separation of public a...

(DOUBLE_PUNCTUATION)

docs/SPEC.md

[uncategorized] ~298-~298: Loose punctuation mark.
Context: ... of PrivateBlobs. - PublicState: Contains only public data and Opaque Po...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~299-~299: Loose punctuation mark.
Context: ...lobally verifiable. - PrivateBlobs: The raw data that was redacted or point...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~326-~326: Loose punctuation mark.
Context: ...forward-compatible } ``` - digest: The REQUIRED `blake3` hash of the p...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~327-~327: Loose punctuation mark.
Context: ...expose this value. - ciphertext_digest: The blake3 hash of the stored ciphert...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~328-~328: Loose punctuation mark.
Context: ... present in the public pointer. - size: The size of the private blob in bytes (...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~329-~329: Loose punctuation mark.
Context: ...lob in bytes (RECOMMENDED). - location: A REQUIRED stable URI indicating wh...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~330-~330: Loose punctuation mark.
Context: ... embed pre‑signed tokens. - capability: A REQUIRED reference to the authn/z...

(UNLIKELY_OPENING_PUNCTUATION)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: test
  • GitHub Check: audit
  • GitHub Check: Mermaid Diagrams Generate + Verify
🔇 Additional comments (35)
examples/v1/shiplog/checkpoint_commit_only_invalid.json (1)

1-3: Discard this review—schema and ADR-0005 already document checkpoint requirements.

The schema at schemas/v1/shiplog/consumer_checkpoint.schema.json explicitly declares both fields mandatory: "required": ["ulid", "commit_oid"]. ADR-0005, Section 7 ("Consumer Checkpoints"), normatively specifies the checkpoint structure and references this exact schema. The filename suffix *_invalid signals the file's status consistently with checkpoint_ulid_only_invalid.json (which also lacks inline comments). The suggested JSON comment syntax is invalid—JSON does not support comments and your diff would break the file. No other examples in the repository use inline comments; all are pure JSON paired with schema validation and ADR documentation.

The example files are designed to be schema-validated, not self-documenting. The discoverability pathway is: filename pattern → schema → ADR. This is the established pattern in this codebase.

Likely an incorrect or invalid review comment.

docs/decisions/README.md (1)

23-24: LGTM — Clean ADR entries.

Both entries follow the established format precisely. Status values, date format, and link structure are consistent with existing records.

docs/USE-CASES.md (1)

86-95: LGTM — Well-articulated privacy use-case.

The new use-case correctly describes the hybrid privacy model's value proposition: verifiable audit trails without exposing sensitive data. Terminology aligns with ADR-0004.

docs/decisions/ADR-0003/DECISION.md (1)

89-92: LGTM — Deterministic sorting and mandatory persistence.

Both changes strengthen the consensus proof specification:

  1. Lexicographic sorting by lowercase Signer ensures canonical, reproducible ordering across implementations.
  2. Elevating storage to MUST ensures proof envelopes are always available for verification.
scripts/killcheck/schema_headers.sh (1)

1-10: LGTM — Correct header enforcement logic.

The pipeline correctly identifies legacy Schema: headers while excluding the permitted Envelope-Schema: and Trailer-Schema: variants. Output limiting via sed prevents console spam.

scripts/killcheck/ulid_reference.sh (1)

1-9: LGTM — Precise reference validation.

The script correctly enforces the presence of the specific ULID spec reference in ADR-0005. The exact string match ensures consistent external citations.

docs/decisions/ADR-0004/DECISION.md (9)

1-11: Verify ADR metadata and scope clarity.

The ADR header lists ADR-0004 (line 4) and PR objectives state ADR-0005 Shiplog. Clarify the relationship: Is ADR-0004 (Privacy Model) the prerequisite for ADR-0005 (Shiplog), or should they have a different ordering?


83-94: Normative encryption algorithm mandate is clear and strict.

XChaCha20-Poly1305 (line 85) is mandated as the only normative algorithm, superseding prior AES-256-GCM references. This is a critical security decision. Verify:

  1. The Rust implementation (crates/gatos-privacy) uses XChaCha20-Poly1305 exclusively by default.
  2. No fallback to AES-256-GCM exists in production code paths (only the deprecated non-normative interop profile in lines 292–293).

113-122: RFC 8785 JCS canonicalization is normative for all canonical JSON—verify implementation.

Line 122 mandates: "content_id of the pointer itself is blake3(JCS(pointer_json))… This rule is normative for all canonical JSON in GATOS."

This is a critical normative requirement. Verify:

  1. crates/gatos-privacy/src/lib.rs implements RFC 8785 JCS canonicalization for OpaquePointer JSON.
  2. All pointer serialization paths use JCS before hashing, not default serde_json.
  3. Tests confirm that different JSON serializations (whitespace, key order) produce identical content_id after JCS.
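Item 3 can be exercised with a small sketch. Python is used purely for illustration: `json.dumps(sort_keys=True, separators=(",", ":"))` stands in for full RFC 8785 JCS (adequate for ASCII-only objects), and SHA-256 stands in for BLAKE3, which is not in the Python standard library. The function names are assumptions, not repository code.

```python
import hashlib
import json

def canonicalize(obj):
    # Approximates RFC 8785 JCS for ASCII-only content: lexicographically
    # sorted keys, no insignificant whitespace. Full JCS also pins down
    # number formatting and string escaping, which this sketch skips.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")

def content_id(pointer):
    # ADR-0004 mandates blake3(JCS(...)); SHA-256 stands in here only
    # because blake3 is not in the Python standard library.
    return "sha256:" + hashlib.sha256(canonicalize(pointer)).hexdigest()

# Two serializations of the same logical pointer...
a = json.loads('{"kind": "blob", "algo": "blake3"}')
b = json.loads('{"algo":"blake3","kind":"blob"}')

# ...hash identically after canonicalization.
assert content_id(a) == content_id(b)
```

A test along these lines (with blake3 substituted back in) would catch any serialization path that hashes default `serde_json` output instead of JCS output.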

124-124: Confirm opaque pointer schema file exists.

Line 124 references schemas/v1/privacy/opaque_pointer.schema.json. This must exist and align with the OpaquePointer specification in lines 95–122 (required fields, constraints, extensions).


128-136: AAD binding specification is precise but implementation-critical.

Lines 128–136 define the exact AAD composition: pointer.digest, requester.actor_id, policy.version. This is normative for AEAD correctness. Verify:

  1. All blob decryption paths in the implementation construct AAD in this exact order.
  2. Domain separation is explicit and documented (suggested: length-prefixing or clear delimiters).
  3. Tests confirm AAD mismatches produce Unauthorized or decryption failures.
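The domain-separation point in item 2 can be made concrete. The sketch below (hypothetical helper; field order taken from the AAD composition quoted above) uses 4-byte big-endian length prefixes so that distinct field tuples can never concatenate to the same byte string:

```python
import struct

def build_aad(digest, actor_id, policy_version):
    """Concatenate the three AAD fields in the ADR-mandated order, each
    with a 4-byte big-endian length prefix for domain separation."""
    aad = b""
    for field in (digest, actor_id, policy_version):
        aad += struct.pack(">I", len(field)) + field
    return aad

# Without prefixes, ("ab", "c") and ("a", "bc") would concatenate to the
# same bytes; with prefixes they differ.
assert build_aad(b"ab", b"c", b"") != build_aad(b"a", b"bc", b"")
```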

138-167: Projection Function determinism is critical—verify canonical ordering.

Lines 149–152 mandate:

  • All JSON artifacts MUST be canonicalized with RFC 8785 JCS prior to hashing.
  • Non-JSON maps MUST have keys ordered lexicographically by lowercase UTF-8 bytes.

This is essential for deterministic, reproducible state commits. Verify:

  1. The State Engine (gatos-echo) or policy engine implements this ordering for all shapes and maps.
  2. Tests confirm that identical logical state, serialized in different orders, produces identical PublicState commits.
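The non-JSON ordering rule is easy to get subtly wrong, because a plain string sort orders by raw code units. A minimal sketch of the mandated ordering (illustrative only; `order_keys` is not from the codebase):

```python
def order_keys(keys):
    # Sort lexicographically by the lowercase UTF-8 encoding of each key,
    # as the ADR requires for non-JSON maps.
    return sorted(keys, key=lambda k: k.lower().encode("utf-8"))

# A plain sort would place "Beta" and "Zeta" before "alpha" (uppercase
# letters sort below lowercase in byte order); lowercase-byte ordering
# gives the canonical result instead.
assert order_keys(["Zeta", "alpha", "Beta"]) == ["alpha", "Beta", "Zeta"]
```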

169-224: Pointer Resolution Protocol defines HTTP semantics clearly; verify bearer token and AAD validation.

Lines 169–224 specify:

  • POST /gatos/private/blobs/resolve endpoint.
  • JWT bearer token with required claims: sub, aud, method, path, exp, nbf.
  • Clock skew tolerance: ±300 seconds (line 178).
  • Response headers: X-BLAKE3-Digest, Digest.

Verify:

  1. The resolver implementation validates all JWT claims and enforces the exact claim set (not a superset).
  2. Clock skew is checked within ±300 seconds, not more lenient.
  3. Response headers are always populated for both success and specific error cases.
  4. Tests cover missing/invalid token (401), policy denial (403), and successful resolution paths.
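The exact-claim-set and skew checks in items 1 and 2 can be sketched as follows. The claim names and the ±300 s window come from the spec text quoted above; `check_claims` itself is a hypothetical outline, and signature verification is assumed to have happened earlier in the pipeline:

```python
import time

REQUIRED_CLAIMS = {"sub", "aud", "method", "path", "exp", "nbf"}
MAX_SKEW_S = 300  # +/- 300 seconds per the spec

def check_claims(claims, now=None):
    """Reject tokens missing required claims or outside the skew window."""
    now = time.time() if now is None else now
    missing = REQUIRED_CLAIMS - claims.keys()
    if missing:
        raise ValueError("missing claims: %s" % sorted(missing))
    if claims["nbf"] > now + MAX_SKEW_S:
        raise ValueError("token not yet valid")
    if claims["exp"] < now - MAX_SKEW_S:
        raise ValueError("token expired")

claims = {"sub": "actor-1", "aud": "resolver", "method": "POST",
          "path": "/gatos/private/blobs/resolve", "exp": 1060, "nbf": 1000}
check_claims(claims, now=1000)  # within the window: no exception
```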

289-299: Experimental keyed-BLAKE3 variant must remain separate from normative public pointers.

Lines 295–299 allow algo = "blake3-keyed" for private attestation envelopes only. Public pointers MUST use algo = "blake3" (line 299). This separation is critical for third-party verifiability.

Verify:

  1. Schema or code enforces that only private (non-public) pointers can use keyed variants.
  2. Validation rejects blake3-keyed in public Opaque Pointers.

301-310: Error taxonomy must align with implementation enum.

Lines 301–310 define normative error codes (Unauthorized 401, Forbidden 403, NotFound 404, DigestMismatch 422, CapabilityUnavailable 503, PolicyDenied 403). Verify:

  1. crates/gatos-privacy/src/lib.rs PointerError enum includes all these variants (or compatible mappings).
  2. HTTP status codes are consistently mapped to these error types.
  3. Tests cover each error condition with correct status code.
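One way to keep item 2 honest is a single lookup table that fails loudly on unmapped codes. The codes and statuses below are copied from the taxonomy listed above; the mapping function is a hypothetical sketch, not the crate's API:

```python
ERROR_STATUS = {
    "Unauthorized": 401,
    "Forbidden": 403,
    "NotFound": 404,
    "DigestMismatch": 422,
    "CapabilityUnavailable": 503,
    "PolicyDenied": 403,
}

def status_for(code):
    """Resolve an error code to its HTTP status, failing loudly on an
    unknown code rather than silently defaulting to 500."""
    try:
        return ERROR_STATUS[code]
    except KeyError:
        raise ValueError("unknown error code: %s" % code)

# PolicyDenied and Forbidden intentionally share a status but remain
# distinct error codes for clients that inspect the body.
assert status_for("PolicyDenied") == status_for("Forbidden") == 403
```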
examples/v1/privacy/pointer_low_entropy_invalid.json (1)

1-8: Negative test fixture intent is unclear—verify schema constraint alignment.

This file is included in the negative validation (Makefile line 81) and is meant to be invalid. Per ADR-0004:

  • Line 4 (plaintext digest): Violates low-entropy policy (publish_plaintext_digest: false). ✓ Correct for negative test.
  • Missing ciphertext_digest: ADR-0004 lines 114–115 state that low-entropy policy "SHOULD include ciphertext_digest" and "policy MUST NOT expose the plaintext digest."

Clarify: Is the negative test failing because:

  1. Plaintext digest is present (when forbidden), OR
  2. Ciphertext digest is missing (when required), OR
  3. Both?

If the schema enforces require_ciphertext_digest: true for low-entropy class, then this example should include a (valid) ciphertext_digest field and only fail on the plaintext digest. Otherwise, remove the field to isolate the failure reason.
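The two candidate failure reasons can be separated mechanically. A hypothetical checker (names assumed, rules paraphrased from the ADR lines quoted above) that reports each violation independently makes it obvious when a fixture fails for more than one reason:

```python
def low_entropy_violations(pointer):
    """Report each low-entropy rule violation separately."""
    errors = []
    if pointer.get("extensions", {}).get("class") != "low-entropy":
        return errors
    if "digest" in pointer:
        errors.append("plaintext digest present on low-entropy pointer")
    if "ciphertext_digest" not in pointer:
        errors.append("ciphertext_digest missing on low-entropy pointer")
    return errors

# The fixture as described trips both rules, so a bare pass/fail result
# cannot tell which constraint the schema actually enforced.
fixture = {"extensions": {"class": "low-entropy"},
           "digest": "blake3:" + "0" * 64}
assert len(low_entropy_violations(fixture)) == 2
```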

examples/v1/shiplog/anchor_min.json (1)

1-8: Minimal anchor example structure is clean.

ULID format, git head reference, ISO 8601 timestamp all appear valid. Verify schema alignment:

  • ULID validation: pattern enforcement in schema.
  • head field: Git SHA-1 or Blake3 hash? Confirm schema constraint.
  • created_at: ISO 8601 format check in schema.

Confirm schemas/v1/shiplog/anchor.schema.json exists and validates this example.

examples/v1/shiplog/checkpoint_min.json (1)

1-5: Minimal checkpoint example has both required fields.

ULID and commit_oid are present. This is correct for a positive test. Verify:

  1. Schema schemas/v1/shiplog/consumer_checkpoint.schema.json enforces both as required (not optional).
  2. Negative examples referenced in Makefile (lines 78–79) exist:
    • examples/v1/shiplog/checkpoint_ulid_only_invalid.json
    • examples/v1/shiplog/checkpoint_commit_only_invalid.json
examples/v1/shiplog/event_min.json (1)

1-12: Shiplog event structure is reasonable; verify schema constraints on type and namespace.

The event includes ulid, type, payload, refs, and ns. Per ADR-0005 (Shiplog), verify:

  1. Schema enforces type to a specific enum (e.g., proposal.created is allowed, not arbitrary).
  2. Schema enforces ns to a valid namespace list (e.g., "governance" is in the allowed set).
  3. payload is validated against a sub-schema or allowed to be freeform JSON.
  4. refs.state digest format matches ADR-0004 Blake3 specification.

Confirm schemas/v1/shiplog/event_envelope.schema.json exists and enforces these constraints.

Cargo.toml (1)

10-10: Workspace member addition is clean and correctly positioned.

The crates/gatos-privacy entry is appropriately placed in the members list. Verify:

  1. crates/gatos-privacy/Cargo.toml exists with a valid manifest.
  2. The crate name, version, and edition are consistent with workspace settings (line 18: version "0.1.0", edition "2021").
  3. Public API exports OpaquePointer and PointerError types referenced in ADR-0004.
examples/v1/shiplog/event_mismatch_ns.json (1)

1-7: Negative test fixture exists but is not invoked in Makefile schema-negative tests.

This file is a valid negative test case (namespace "wrongspace" is invalid). However, the Makefile schema-negative targets (lines 73–81 and 89–98) do not reference it for validation. Either:

  1. Add the test: Include a line like ! npx -y ajv-cli@5 validate ... -d examples/v1/shiplog/event_mismatch_ns.json in schema-negative.
  2. Remove the fixture: Delete this file if namespace validation is not being tested negatively.

Without execution, this fixture is dead code and provides no test coverage.

Is namespace validation meant to be tested negatively? If yes, add this fixture to schema-negative or remove it.

Makefile (1)

39-52: All schema and example files exist. Verification passes for lines 39-52 and 58-71. make schema-compile and make schema-validate will not fail due to missing paths.

examples/v1/shiplog/trailer_min.json (2)

7-7: LGTM: Timestamp consistency verified.

The duration calculation is correct: end_ts - start_ts = 70 seconds matches the dur_s field. Excellent internal consistency.


1-8: Ignore this review comment—the schema compliance is correct.

The schema actually requires only three fields: version, status, and when. The example contains all three and is fully compliant. Fields like where, why, and how are optional properties, not required ones. The _min suffix accurately reflects this minimal-but-valid example.

Likely an incorrect or invalid review comment.

examples/v1/privacy/pointer_low_entropy_min.json (1)

1-8: LGTM: Low-entropy pointer correctly structured.

This example properly demonstrates the low-entropy constraint:

  • Omits plaintext digest
  • Includes ciphertext_digest
  • Sets extensions.class to "low-entropy" ✓

The structure aligns with the documented low-entropy validation rules.

schemas/v1/shiplog/event_envelope.schema.json (2)

42-42: Excellent security guidance in description.

The description properly warns against embedding private data in payloads and mandates opaque pointer usage. This is essential for the privacy model and shows thoughtful schema design.


32-32: Schema reference is valid and properly configured.

The external reference to ../common/ids.schema.json#/$defs/blake3Digest resolves correctly. The blake3Digest definition enforces the required format with a strict regex pattern (^blake3:[0-9a-f]{64}$) that validates lowercase hex-encoded 32-byte BLAKE3 digests with the prefix. Schema validation passes without errors.

crates/gatos-privacy/Cargo.toml (1)

7-10: LGTM: Good dependency hygiene with workspace versions.

Using workspace = true for core dependencies ensures consistent versions across the monorepo and simplifies maintenance.

examples/v1/policy/privacy_min.json (1)

13-13: The action field is properly constrained in the schema; this review comment is incorrect.

The schema verification confirms that the action field is already constrained as an enum with exactly two allowed values: "redact" and "pointerize". The example usage in examples/v1/policy/privacy_min.json is valid and compliant. The original concern about unconstrained actions and the speculation about additional actions like "hash", "encrypt", and "drop" are unfounded—these do not exist in the schema definition.

Likely an incorrect or invalid review comment.

examples/v1/privacy/opaque_pointer_min.json (2)

4-5: Schema explicitly allows both digests for non-low-entropy pointers—review concern is invalid.

The schema does define this through anyOf and conditional logic:

  • Low-entropy (extensions.class="low-entropy"): ciphertext_digest required, digest forbidden (via if/then/not)
  • Non-low-entropy (default): anyOf requires at least one of digest or ciphertext_digest—both are allowed

The example file opaque_pointer_min.json lacks extensions.class="low-entropy", so it defaults to non-low-entropy. Both digests are schema-valid per the anyOf constraint. No schema clarification needed—the conditional logic already documents the distinction.
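The conditional logic above can be hand-rolled for illustration (semantics paraphrased from this comment, not generated from the actual schema file):

```python
def digests_valid(pointer):
    """Mirror of the described conditional digest rules:
      - low-entropy: ciphertext_digest required, digest forbidden
      - otherwise:   at least one of digest / ciphertext_digest
    """
    low_entropy = pointer.get("extensions", {}).get("class") == "low-entropy"
    has_plain = "digest" in pointer
    has_cipher = "ciphertext_digest" in pointer
    if low_entropy:
        return has_cipher and not has_plain
    return has_plain or has_cipher

assert digests_valid({"digest": "d", "ciphertext_digest": "c"})  # both allowed
assert not digests_valid({"extensions": {"class": "low-entropy"},
                          "digest": "d"})
```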


4-4: Placeholder digest values confirmed but validation gap requires clarification.

The JSON file contains the claimed placeholder values:

  • digest: all zeros (blake3:0000...0000)
  • ciphertext_digest: all ones (blake3:1111...1111)

The OpaquePointer::validate() method in lib.rs only enforces presence and low-entropy constraints—it does NOT reject weak/placeholder values. The test in pointer_schema.rs successfully deserializes and validates the example without cryptographic soundness checks.

Verification needed: Is this validation gap acceptable for example/test scenarios, or does production code need to reject these well-known weak patterns? If production paths can encounter such invalid digests, add cryptographic validity checks to validate().
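If production paths do need to reject such fixtures, a heuristic along these lines would catch the all-zeros/all-ones patterns (illustrative only; `looks_placeholder` is not from the codebase, and no heuristic can prove a digest is genuine):

```python
def looks_placeholder(digest):
    """Refuse obvious placeholder digests: any hex body consisting of a
    single repeated character (all zeros, all ones, etc.)."""
    _, _, hex_part = digest.partition(":")
    return len(set(hex_part)) <= 1

assert looks_placeholder("blake3:" + "0" * 64)
assert looks_placeholder("blake3:" + "1" * 64)
assert not looks_placeholder("blake3:" + "0f" * 32)
```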

docs/SPEC.md (2)

74-74: Cosmetic roadmap clarification noted.

The "(planned)" label correctly signals that this item is slated for future development. Ensure the change is coordinated with the project roadmap and architecture planning.


131-153: Private directory structure well‑organized and actor‑scoped.

The addition of refs/gatos/private/<actor-id>/ follows established conventions. Confirm that section 7.3 (line 362) description of "private overlay indices/metadata" aligns with what is actually materialized under this path—specifically, whether it holds only metadata or also ciphertext blobs, and how external store integration (e.g., S3) is discovered.

examples/v1/shiplog/checkpoint_ulid_only_invalid.json (2)

1-3: Verify alignment with ADR-0005 checkpoint specification.

Per the AI summary, this ties to "ULID monotonic/ordering references" in ADR-0005. Without access to that ADR document and the formal checkpoint schema:

  • Cannot confirm whether this example correctly demonstrates the "invalid" case as specified in ADR-0005
  • Cannot verify that omitting other fields (commit, anchor, timestamp, etc.) is the correct way to represent an invalid checkpoint
  • Cannot validate whether any additional metadata or comments should accompany example files

Please confirm:

  1. Does ADR-0005 define the canonical schema for checkpoints?
  2. What field(s) are required vs. optional?
  3. Is omission of all fields except "ulid" the intended invalid case?

1-3: Original review comment is incorrect—file is properly structured.

The checkpoint schema explicitly requires both "ulid" and "commit_oid" fields. The checkpoint_ulid_only_invalid.json file correctly violates this constraint by containing only "ulid", making it a valid negative example. The naming convention is consistent with checkpoint_commit_only_invalid.json. No issues found.

Likely an incorrect or invalid review comment.

Comment on lines +1 to +4
[package]
name = "gatos-privacy"
version = "0.1.0"
edition = "2021"

🛠️ Refactor suggestion | 🟠 Major

Missing package metadata.

The manifest lacks essential fields:

  • description (required for publication to crates.io)
  • license or license-file (legally required)
  • repository (helps users find source)
  • authors or readme

Add metadata:

 [package]
 name = "gatos-privacy"
 version = "0.1.0"
 edition = "2021"
+description = "GATOS privacy model: opaque pointers and low-entropy validation"
+license = "MIT OR Apache-2.0"  # or your chosen license
+repository = "https://github.com/flyingrobots/gatos"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
[package]
name = "gatos-privacy"
version = "0.1.0"
edition = "2021"
[package]
name = "gatos-privacy"
version = "0.1.0"
edition = "2021"
description = "GATOS privacy model: opaque pointers and low-entropy validation"
license = "MIT OR Apache-2.0" # or your chosen license
repository = "https://github.com/flyingrobots/gatos"
🤖 Prompt for AI Agents
In crates/gatos-privacy/Cargo.toml around lines 1 to 4, the package manifest is
missing required metadata fields; add a description = "...", license = "..." (or
license-file = "LICENSE"), repository = "https://...", and either authors =
["Name <email>"] and/or readme = "README.md" (and ensure the referenced files
exist). Update the top-level [package] section to include those keys with
appropriate values and ensure the license string matches SPDX format so the
crate can be published.

serde_json = { workspace = true }
blake3 = { workspace = true }
hex = { workspace = true }
anyhow = { workspace = true }

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Redundant error handling crates.

Both anyhow (line 11) and thiserror (line 12) are present. Typical usage:

  • thiserror for library error types (derive Error)
  • anyhow for application error handling (context, chaining)

Library crates usually only need thiserror. anyhow in a library often indicates the error API isn't well-designed (returning anyhow::Error loses type information for consumers).

Verify usage:

If anyhow is only used internally, consider removing it and using thiserror + Result<T, E> everywhere.


🏁 Script executed:

#!/bin/bash
# Check how anyhow is actually used in the crate

echo "=== anyhow usage in gatos-privacy ==="
rg -n --type=rust 'use anyhow|anyhow::' crates/gatos-privacy/src/

echo -e "\n=== thiserror usage ==="
rg -n --type=rust 'use thiserror|thiserror::' crates/gatos-privacy/src/

echo -e "\n=== Public error types ==="
rg -n --type=rust 'pub.*Error|pub enum.*Error' crates/gatos-privacy/src/

Length of output: 326


Remove unused anyhow dependency.

Script output confirms anyhow is never imported or used anywhere in the crate (crates/gatos-privacy/src/). The crate correctly uses thiserror to define PointerError (line 75) and returns properly typed Result<(), PointerError> in the public API (line 46). Dead code dependency should be purged from Cargo.toml line 11.

🤖 Prompt for AI Agents
In crates/gatos-privacy/Cargo.toml around line 11, the dependency entry "anyhow
= { workspace = true }" is unused in this crate; remove that line from
Cargo.toml to purge the dead dependency and then run cargo check/cargo build to
ensure no code depended on anyhow; if workspace-level changes require updating
lockfiles run cargo update and commit the updated Cargo.toml (and Cargo.lock if
modified).

blake3 = { workspace = true }
hex = { workspace = true }
anyhow = { workspace = true }
thiserror = "1"

⚠️ Potential issue | 🟠 Major

Loose thiserror version constraint enables breaking changes.

Using thiserror = "1" allows any 1.x release (anything below 2.0.0), including changes that break in practice despite semver.

Pin to a specific minor version:

-thiserror = "1"
+thiserror = "1.0"

Or even better, use workspace version like the other dependencies:

-thiserror = "1"
+thiserror = { workspace = true }

Then define it in the workspace Cargo.toml for consistency.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
thiserror = "1"
thiserror = "1.0"
🤖 Prompt for AI Agents
In crates/gatos-privacy/Cargo.toml around line 12, the dependency thiserror is
specified as "1", which is too loose and can pull incompatible changes; update
the entry to pin a specific minor (for example "1.0.x" or "1.1" style) or,
better, replace it with a workspace-managed version placeholder (e.g. use
workspace = true or the same pattern used by other deps) and then add the
concrete version to the workspace Cargo.toml so all crates share a fixed
thiserror minor version; ensure the local Cargo.toml matches the workspace
convention and run cargo update to verify the lockfile.

Comment on lines +1 to +23
use gatos_privacy::OpaquePointer;

fn read_example(rel: &str) -> String {
let dir = env!("CARGO_MANIFEST_DIR");
std::fs::read_to_string(format!("{}/../../examples/v1/{}", dir, rel)).unwrap()
}

#[test]
fn ciphertext_only_pointer_should_deserialize() {
// This example omits plaintext digest by design (low-entropy class)
let json = read_example("privacy/pointer_low_entropy_min.json");
let ptr: Result<OpaquePointer, _> = serde_json::from_str(&json);
assert!(ptr.is_ok(), "ciphertext-only opaque pointer must deserialize");
}

#[test]
fn both_digests_allowed_when_not_low_entropy() {
let json = read_example("privacy/opaque_pointer_min.json");
let ptr: OpaquePointer = serde_json::from_str(&json).unwrap();
let has_digest = ptr.digest.as_ref().map(|s| !s.is_empty()).unwrap_or(false);
assert!(has_digest);
assert!(ptr.ciphertext_digest.is_some());
}

🧹 Nitpick | 🔵 Trivial

Missing negative test cases.

The test suite only validates successful deserialization. Add tests for:

  • Invalid low-entropy pointer with plaintext digest present (should fail)
  • Pointer with neither digest nor ciphertext_digest (should fail)
  • Invalid ULID format (if applicable)
  • Invalid URI schemes
  • Malformed blake3 digests

Do you want me to generate comprehensive negative test cases, or open an issue to track this?

🤖 Prompt for AI Agents
In crates/gatos-privacy/tests/pointer_schema.rs around lines 1 to 23, add
negative unit tests to complement the existing positive cases: create a test
that loads the low-entropy example JSON and injects a non-empty plaintext digest
(expect serde_json::from_str to return Err), a test that removes both digest and
ciphertext_digest fields from a valid pointer JSON (expect Err), a test that
replaces the ULID field with an invalid string (expect Err or validation failure
during deserialization), a test that sets the URI scheme to an unsupported value
(expect Err), and a test that corrupts the blake3 digest string (wrong
length/characters) (expect Err); implement each by reading a base example (use
read_example), programmatically modifying the JSON string (or parsing to a
serde_json::Value and altering fields) and asserting deserialization fails.

Comment on lines +3 to +6
fn read_example(rel: &str) -> String {
let dir = env!("CARGO_MANIFEST_DIR");
std::fs::read_to_string(format!("{}/../../examples/v1/{}", dir, rel)).unwrap()
}

⚠️ Potential issue | 🟠 Major

Bare unwrap() will produce cryptic panic messages.

The read_example helper uses .unwrap() on file I/O without context. When this fails in CI, the error message will be unhelpful (just "called unwrap() on an Err value").

Apply this diff to provide actionable error context:

 fn read_example(rel: &str) -> String {
     let dir = env!("CARGO_MANIFEST_DIR");
-    std::fs::read_to_string(format!("{}/../../examples/v1/{}", dir, rel)).unwrap()
+    let path = format!("{}/../../examples/v1/{}", dir, rel);
+    std::fs::read_to_string(&path)
+        .unwrap_or_else(|e| panic!("Failed to read {}: {}", path, e))
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn read_example(rel: &str) -> String {
let dir = env!("CARGO_MANIFEST_DIR");
std::fs::read_to_string(format!("{}/../../examples/v1/{}", dir, rel)).unwrap()
}
fn read_example(rel: &str) -> String {
let dir = env!("CARGO_MANIFEST_DIR");
let path = format!("{}/../../examples/v1/{}", dir, rel);
std::fs::read_to_string(&path)
.unwrap_or_else(|e| panic!("Failed to read {}: {}", path, e))
}
🤖 Prompt for AI Agents
In crates/gatos-privacy/tests/pointer_schema.rs around lines 3 to 6, the helper
read_example uses .unwrap() on read_to_string which yields unhelpful panic
messages; replace the unwrap with a fallible call that includes the file path
and the I/O error in the message (for example use .expect or .map_err with a
formatted string containing the resolved path and the error) so failures in CI
show the exact file attempted and the underlying error; ensure the path is
computed first into a variable and included in the error message.

Comment on lines +23 to +25
"payload": {
"type": "object"
},

⚠️ Potential issue | 🔴 Critical

Unbounded payload object enables DoS attacks.

The payload is an unconstrained object with no limits on:

  • Number of properties (maxProperties)
  • Nesting depth
  • String/array sizes within the payload

An attacker could submit a deeply nested or extremely large payload, exhausting parser memory.

Add defensive limits:

     "payload": {
-      "type": "object"
+      "type": "object",
+      "maxProperties": 100,
+      "description": "Event payload (must not contain PII/secrets; use opaque pointers for sensitive data)"
     },

Also consider using a JSON parser with depth limits (most parsers have a default ~100 levels, but verify your parser configuration).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"payload": {
"type": "object"
},
"payload": {
"type": "object",
"maxProperties": 100,
"description": "Event payload (must not contain PII/secrets; use opaque pointers for sensitive data)"
},
🤖 Prompt for AI Agents
In schemas/v1/shiplog/event_envelope.schema.json around lines 23–25, the
"payload" is currently an unconstrained object which can be abused to submit
extremely large or deeply nested payloads; constrain it by replacing the bare
object with a defensive schema: set maxProperties (e.g. 50), add
additionalProperties: true/false or a limited property schema, limit array sizes
with maxItems, limit string lengths with maxLength, and replace open recursion
with a bounded recursive/defined type (or enumerated allowed shapes) to enforce
a maximum nesting depth; also add explicit limits for any known substructures
(timestamps, ids, etc.) and include a note to verify the JSON parser is
configured with a maximum depth (or use a parser that enforces one).

Comment on lines +26 to +35
"refs": {
"type": "object",
"propertyNames": {
"pattern": "^[a-z][a-z0-9._-]{0,63}$"
},
"additionalProperties": {
"$ref": "../common/ids.schema.json#/$defs/blake3Digest"
},
"description": "Optional cross-references to related state roots or objects (blake3:<64-hex>)."
},

⚠️ Potential issue | 🟠 Major

Refs object lacks size limits.

Similar to the payload issue, refs has no maxProperties constraint. An attacker could submit thousands of refs entries.

Apply this diff:

     "refs": {
       "type": "object",
+      "maxProperties": 50,
       "propertyNames": {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"refs": {
"type": "object",
"propertyNames": {
"pattern": "^[a-z][a-z0-9._-]{0,63}$"
},
"additionalProperties": {
"$ref": "../common/ids.schema.json#/$defs/blake3Digest"
},
"description": "Optional cross-references to related state roots or objects (blake3:<64-hex>)."
},
"refs": {
"type": "object",
"maxProperties": 50,
"propertyNames": {
"pattern": "^[a-z][a-z0-9._-]{0,63}$"
},
"additionalProperties": {
"$ref": "../common/ids.schema.json#/$defs/blake3Digest"
},
"description": "Optional cross-references to related state roots or objects (blake3:<64-hex>)."
},
🤖 Prompt for AI Agents
In schemas/v1/shiplog/event_envelope.schema.json around lines 26 to 35, the
"refs" object lacks a size constraint; add a maxProperties limit (for example
"maxProperties": 256) to the "refs" schema alongside existing keys so clients
cannot submit unbounded numbers of refs entries, keeping the existing
propertyNames, additionalProperties $ref, and description intact.

Comment on lines +1 to +11
#!/usr/bin/env bash
set -euo pipefail

# Disallow common lowercase/underscore variants of error codes anywhere in docs
bad=$(rg -n "append_rejected|not_fast_forward|temporalorder|siginvalid|policydenied|notfound" docs || true)
if [[ -n "$bad" ]]; then
echo "Found non-canonical error code casing or names:" >&2
echo "$bad" >&2
exit 1
fi
echo "ok: error code casing canonical"

🛠️ Refactor suggestion | 🟠 Major

Document the canonical error code forms.

The script blocks specific non-canonical variants but doesn't indicate what the canonical forms should be. Include a comment listing the expected canonical forms (e.g., AppendRejected, NotFastForward, etc.) so maintainers know what to use.

Apply this diff:

 #!/usr/bin/env bash
 set -euo pipefail
 
-# Disallow common lowercase/underscore variants of error codes anywhere in docs
+# Disallow common lowercase/underscore variants of error codes anywhere in docs.
+# Canonical forms: AppendRejected, NotFastForward, TemporalOrder, SigInvalid, PolicyDenied, NotFound
 bad=$(rg -n "append_rejected|not_fast_forward|temporalorder|siginvalid|policydenied|notfound" docs || true)
🤖 Prompt for AI Agents
In scripts/killcheck/error_casing.sh around lines 1 to 11, add a short commented
list of the canonical error code forms directly above the rg check so
maintainers know the expected names (for example: AppendRejected,
NotFastForward, TemporalOrder, SigInvalid, PolicyDenied, NotFound); keep it as
plain shell comments (prefixed with #) and ensure the comment matches the
variants being blocked by the regex for clarity.

Comment on lines +14 to +27
let ns;
try {
const text = fs.readFileSync(file, 'utf8');
const obj = JSON.parse(text);
ns = obj.ns;
} catch (e) {
console.error('failed to read/parse envelope:', e.message);
process.exit(2);
}

if (ns !== topic) {
console.error(`topic/ns mismatch: topic=${topic} ns=${ns}`);
process.exit(1);
}

⚠️ Potential issue | 🟡 Minor

Validate that the ns field exists.

If the envelope JSON lacks an ns field, the script will produce a confusing error message (topic=X ns=undefined). Add explicit validation after parsing.

Apply this diff:

 let ns;
 try {
   const text = fs.readFileSync(file, 'utf8');
   const obj = JSON.parse(text);
   ns = obj.ns;
+  if (ns === undefined) {
+    throw new Error('envelope missing required "ns" field');
+  }
 } catch (e) {
   console.error('failed to read/parse envelope:', e.message);
   process.exit(2);
 }
🤖 Prompt for AI Agents
In scripts/shiplog/check_topic_ns.js around lines 14 to 27, after parsing the
envelope JSON you must explicitly validate that obj.ns is present and not
undefined/null; if it's missing, log a clear error (e.g. "missing ns field in
envelope") including the filename or topic context and exit with a non-zero
status (use the same exit code used for parse errors, e.g. process.exit(2));
only proceed to compare ns with topic if the ns field exists.

Comment on lines +8 to +9
elif command -v docker >/dev/null 2>&1; then
AJV_RUNNER=(docker run --rm -v "$PWD:/work" -w /work node:20 npx -y ajv-cli@5 ajv)

⚠️ Potential issue | 🔴 Critical

Docker fallback command is malformed

The docker fallback now runs npx -y ajv-cli@5 ajv …, which hands an extra ajv token to the CLI. That produces “Unknown command ajv” and bricks every compile/validate when Node isn’t installed locally—the exact scenario this branch is supposed to cover. Strip the redundant argument so the containerized runner behaves like the npx path.

-  AJV_RUNNER=(docker run --rm -v "$PWD:/work" -w /work node:20 npx -y ajv-cli@5 ajv)
+  AJV_RUNNER=(docker run --rm -v "$PWD:/work" -w /work node:20 npx -y ajv-cli@5)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
elif command -v docker >/dev/null 2>&1; then
AJV_RUNNER=(docker run --rm -v "$PWD:/work" -w /work node:20 npx -y ajv-cli@5 ajv)
elif command -v docker >/dev/null 2>&1; then
AJV_RUNNER=(docker run --rm -v "$PWD:/work" -w /work node:20 npx -y ajv-cli@5)
🤖 Prompt for AI Agents
In scripts/validate_schemas.sh around lines 8-9, the docker fallback command
includes an extra trailing "ajv" token causing "Unknown command ajv"; remove
that redundant argument from the docker run invocation so the containerized
command mirrors the npx path (i.e., run the container with "npx -y ajv-cli@5"
instead of "npx -y ajv-cli@5 ajv"), ensuring the AJV_RUNNER array is formed
without the extra token.

…ME with invariants and examples; tests for verified wrapper
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a0b49cc and 1379a29.

📒 Files selected for processing (4)
  • crates/gatos-privacy/README.md (1 hunks)
  • crates/gatos-privacy/src/lib.rs (1 hunks)
  • crates/gatos-privacy/tests/verified.rs (1 hunks)
  • docs/decisions/ADR-0004/DECISION.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/decisions/ADR-0004/DECISION.md

[style] ~128-~128: Consider removing “of” to be more concise
Context: ... Opaque Pointer, the AEAD AAD MUST bind all of the following, in order, as UTF‑8 bytes: `...

(ALL_OF_THE)


[style] ~136-~136: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...ain separation (e.g., length‑prefixing) prior to supplying them as AAD to the AEAD algor...

(EN_WORDINESS_PREMIUM_PRIOR_TO)


[style] ~146-~146: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...acy Policy** (.gatos/policy.yaml). 3. It traverses the UnifiedState tree, appl...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~153-~153: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...MUST be canonicalized with RFC 8785 JCS prior to hashing. * When non‑JSON maps are mater...

(EN_WORDINESS_PREMIUM_PRIOR_TO)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Mermaid Diagrams Generate + Verify
  • GitHub Check: audit
🔇 Additional comments (3)
docs/decisions/ADR-0004/DECISION.md (2)

10-10: Verify that referenced Opaque Pointer schema file exists.

The ADR declares dependency on schemas/v1/privacy/opaque_pointer.schema.json (line 10, line 124) but this schema is not visible in the PR files. Ensure the schema is either already committed or included in this PR; otherwise, implementations cannot validate Opaque Pointer structure.

Can you confirm that schemas/v1/privacy/opaque_pointer.schema.json exists or is being added in this PR? If not, I can help draft a JSON Schema that enforces the normative requirements from lines 101–111.

Also applies to: 124-124


179-179: [INCORRECT_REVIEW_COMMENT] — RFC 7519 misrepresentation and architectural misunderstanding.

The original review fundamentally mischaracterizes RFC 7519 and GATOS design. RFC 7519 Section 4.1.2 explicitly allows sub as a StringOrURI value—any value containing colons must be a URI (RFC 3986), but URIs are fully compliant. The format ed25519:<pubkey> is a valid RFC 7519–compliant StringOrURI, not a deviation.

GATOS is deliberately pubkey-centric: actors, workers, and all principals are identified via ed25519:<key> throughout the specification (docs/SPEC.md, docs/TECH-SPEC.md, ADR-0004). Using pubkeys as the sub claim is architecturally consistent and intentional—not an interop friction point. Standard JWT libraries (jsonwebtoken, PyJWT, etc.) accept any StringOrURI without issue.

Disregard the review comment. The code is correct.

Likely an incorrect or invalid review comment.

crates/gatos-privacy/src/lib.rs (1)

20-23: Permits ciphertext-only pointers again.

Making digest optional restores compatibility with schema-conformant envelopes that ship only a ciphertext digest—exactly the regression we flagged earlier. Good course correction.

Comment on lines +86 to +87
* Nonces: 24-byte (192-bit) nonces MUST be unique per key. Implementations SHOULD use deterministic, domain‑separated nonces derived via HKDF from the pointer digest and context, or a crash‑safe, monotonic per‑key counter stored in KMS. Random nonces are permitted only with a documented collision budget and active monitoring.
* Catastrophic reuse: Nonce reuse under the same key is catastrophic and MUST be proven impossible by construction (deterministic derivation) or by counter invariants.

⚠️ Potential issue | 🟠 Major

Specify nonce counter lifecycle and wraparound handling.

The normative requirement to use "crash‑safe, monotonic per‑key counter stored in KMS" lacks critical details: counter width, wraparound behavior, recovery semantics after crashes or KMS outages, and multi-instance coordination. A 24-byte nonce supports 2^192 values, but if the counter is 64-bit, you will hit wraparound. Specify whether wraparound triggers key rotation, encryption failure, or other semantics; otherwise, deployments face potential nonce reuse catastrophes.

🤖 Prompt for AI Agents
In docs/decisions/ADR-0004/DECISION.md around lines 86 to 87, the ADR must
define the nonce counter lifecycle and wraparound semantics: specify a concrete
counter width (e.g., use the full 192-bit space when deriving nonces or require
at least a 128-bit per-key counter with a documented safe rotation threshold),
mandate actions on imminent or actual wraparound (e.g., proactively rotate and
deprecate the key before the counter reaches a safety threshold, and treat any
attempted encryption after wraparound as a hard failure), and prescribe
crash/recovery and multi-instance coordination mechanisms (use atomic
conditional updates or KMS-provided monotonic counters/leases/fencing tokens,
transactional commits or quorum operations to make increments crash-safe, and
require leader-election or compare-and-swap semantics for concurrent writers).
Ensure the text requires documenting the chosen counter width, the threshold
that triggers rotation, exact recovery steps after KMS outage/crash, and that
nonce reuse is unacceptable so systems must fail closed or rotate keys before
any risk of reuse.

* AEAD algorithm: Implementations MUST use XChaCha20-Poly1305 for encrypting private blobs referenced by Opaque Pointers. This is the only normative algorithm for the privacy overlay. It supersedes any prior mentions of AES‑256‑GCM as a default.
* Nonces: 24-byte (192-bit) nonces MUST be unique per key. Implementations SHOULD use deterministic, domain‑separated nonces derived via HKDF from the pointer digest and context, or a crash‑safe, monotonic per‑key counter stored in KMS. Random nonces are permitted only with a documented collision budget and active monitoring.
* Catastrophic reuse: Nonce reuse under the same key is catastrophic and MUST be proven impossible by construction (deterministic derivation) or by counter invariants.
* AAD binding: AEAD AAD MUST bind the pointer digest (not a separate content_id), the requester actor id, and the effective policy version so that verifiers can validate context and detect misuse.

⚠️ Potential issue | 🟠 Major

Clarify AAD binding to disambiguate pointer digest vs. content_id.

Line 88 states AAD must bind "pointer digest (not a separate content_id)," but line 122 defines content_id = blake3(JCS(pointer_json)). The phrase "(not a separate content_id)" is ambiguous: does it mean AAD should bind the plaintext digest (line 113) and omit the canonical pointer hash? Or vice versa? The distinction is security-critical; implementations may differ on which hash to use, leading to AAD mismatches and decryption failures. Explicitly define: should AAD include the plaintext digest, the canonical JSON pointer hash, both, or neither?

Also applies to: 122-122, 126-134
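Whichever digest the ADR ultimately picks, the length-prefixed domain separation it recommends is straightforward to pin down normatively. A sketch, with illustrative parameter names (4-byte big-endian length prefixes are an assumption, not the spec):

```javascript
// Build AEAD AAD by length-prefixing each UTF-8 component
// (4-byte big-endian length, then the bytes), so no component
// can bleed into its neighbor and produce a colliding AAD.
function buildAad(pointerDigest, actorId, policyVersion) {
  const parts = [pointerDigest, actorId, policyVersion].map((s) => {
    const bytes = Buffer.from(s, 'utf8');
    const len = Buffer.alloc(4);
    len.writeUInt32BE(bytes.length, 0);
    return Buffer.concat([len, bytes]);
  });
  return Buffer.concat(parts);
}
```

Length-prefixing means ("ab", "c") and ("a", "bc") yield different AAD bytes, which plain concatenation would not guarantee.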


#### AAD Components (Example)

When encrypting a private blob referenced by an Opaque Pointer, the AEAD AAD MUST bind all of the following, in order, as UTF‑8 bytes:

🧹 Nitpick | 🔵 Trivial

Polish prose: remove wordiness and sentence repetition per LanguageTool hints.

Four style issues reduce clarity:

  • Line 128: "bind all of the following" → "bind all the following" (drop the redundant "of")
  • Line 136: "prior to supplying" → "before supplying" (more direct)
  • Line 146: Three successive sentences begin with "It" — consider rewording for variety, e.g., "The resulting PublicState is committed…" and "Private blobs are persisted…"
  • Line 153: "prior to hashing" → "before hashing"
-When encrypting a private blob referenced by an Opaque Pointer, the AEAD AAD MUST bind all of the following, in order, as UTF‑8 bytes:
+When encrypting a private blob referenced by an Opaque Pointer, the AEAD AAD MUST bind all the following, in order, as UTF‑8 bytes:
-Implementations MAY structure these as concatenated bytes with clear domain separation (e.g., length‑prefixing) prior to supplying them as AAD to the AEAD algorithm.
+Implementations MAY structure these as concatenated bytes with clear domain separation (e.g., length‑prefixing) before supplying them as AAD to the AEAD algorithm.
 1. It computes a **UnifiedState** by folding the complete event history.
 2. It consults the **Privacy Policy** (`.gatos/policy.yaml`).
-3. It traverses the `UnifiedState` tree, applying `redact` or `pointerize` rules.
+3. The State Engine traverses the `UnifiedState` tree, applying `redact` or `pointerize` rules.
     * `redact`: The field is removed from the public state.
     * `pointerize`: The field's value is stored as a private blob, and an Opaque Pointer is substituted in the public state.
-4. The resulting `PublicState` is committed to the public refs, and the `Private Blobs` are persisted to their specified `location`.
+4. Finally, the resulting `PublicState` is committed to the public refs, and the `Private Blobs` are persisted to their specified `location`.
-* All JSON artifacts produced during projection (including Opaque Pointers) MUST be canonicalized with RFC 8785 JCS prior to hashing.
+* All JSON artifacts produced during projection (including Opaque Pointers) MUST be canonicalized with RFC 8785 JCS before hashing.

Also applies to: 136-136, 146-146, 153-153

🧰 Tools
🪛 LanguageTool

[style] ~128-~128: Consider removing “of” to be more concise
Context: ... Opaque Pointer, the AEAD AAD MUST bind all of the following, in order, as UTF‑8 bytes: `...

(ALL_OF_THE)

🤖 Prompt for AI Agents
docs/decisions/ADR-0004/DECISION.md around lines 128, 136, 146 and 153: polish
wording per LanguageTool hints by removing redundant words and varying sentence
starts — change "bind all of the following" to "bind all the following" on line 128,
change "prior to supplying" to "before supplying" on line 136, rephrase the
three sentences starting with "It" around line 146 to avoid repetition (e.g.,
"The resulting `PublicState` is committed…" and "Private blobs are persisted…"),
and change "prior to hashing" to "before hashing" on line 153; update the
sentences in-place preserving meaning and UTF‑8/AAD ordering instructions.

Comment on lines +255 to +260
```text
Privacy-Redactions: 3
Privacy-Pointers: 12
Privacy-Pointer-Rotations: 1
```text


⚠️ Potential issue | 🔴 Critical

Fix malformed code block closure at line 259.

Line 259 has ```text but should be a single ``` to close the code block. The duplicate marker breaks markdown rendering.

 Privacy-Redactions: 3
 Privacy-Pointers: 12
 Privacy-Pointer-Rotations: 1
-```text
+```
🤖 Prompt for AI Agents
In docs/decisions/ADR-0004/DECISION.md around lines 255 to 260 the code block is
closed with "```text" instead of the required closing delimiter "```"; edit the
closing marker on line 259 to be a single triple-backtick ("```") so the code
block is properly terminated and markdown renders correctly.


* Capability references in pointers MUST NOT contain secrets or pre‑signed tokens. Use stable identifiers and resolve sensitive data via policy.
> Non‑normative interop profile: Some legacy deployments may use AES‑256‑GCM. If and only if such interop is required, deployments MAY support AES‑256‑GCM with AAD composed of (actor id, pointer digest, policy version) and 96‑bit nonces that are never reused per key. This profile is deprecated and MUST NOT be the default.
* Right‑to‑be‑forgotten: deleting private blobs breaks pointer resolution but does not remove the public pointer. Implement erasure as a tombstone event plus an audit record.

⚠️ Potential issue | 🟠 Major

Specify blob deletion tombstone schema and client error semantics.

The right-to-be-forgotten guidance at line 295 says to "implement erasure as a tombstone event plus an audit record," but the tombstone schema, storage location, retention policy, and client-facing error semantics are undefined. Must clients receive 404 NotFound, a special BlobDeleted error, or should they infer deletion from a timeout? Where are tombstones persisted and for how long? Clarify these details to ensure consistent deletion behavior across implementations.

I can help draft a tombstone schema and error taxonomy for blob deletion if you'd like to iterate on this.
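As a starting point for that iteration, one possible shape is a plain tombstone event plus a distinct resolver error, so clients receive a deterministic BlobDeleted instead of inferring deletion from a 404 or a timeout. Every field and error name below is hypothetical, not part of the ADR:

```javascript
// Hypothetical tombstone event recorded at erasure time. The public pointer
// stays in place; this event records why resolution now fails, and by whom.
function makeTombstone(pointerDigest, actorId, reason) {
  return {
    kind: 'privacy.blob_tombstone.v1', // illustrative event kind
    pointer_digest: pointerDigest,
    erased_by: actorId,
    reason,                            // e.g. 'rtbf_request'
    erased_at: new Date().toISOString(),
  };
}

// Resolver sketch: a tombstoned pointer yields a distinct error class,
// not NotFound, so clients can tell deliberate erasure from a dangling ref.
function resolveBlob(store, tombstones, pointerDigest) {
  if (tombstones.has(pointerDigest)) {
    const err = new Error('blob erased');
    err.code = 'BlobDeleted';
    throw err;
  }
  if (!store.has(pointerDigest)) {
    const err = new Error('blob not found');
    err.code = 'NotFound';
    throw err;
  }
  return store.get(pointerDigest);
}
```

Keeping tombstones in the same append-only history as other events would also answer the retention question: they persist as long as the audit log does.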
