Skip to content

Comments

feat: auto-store sensitive output fields in vault before persistence#446

Merged
stack72 merged 5 commits intomainfrom
feat/sensitive-output-vault-storage-v2
Feb 24, 2026
Merged

feat: auto-store sensitive output fields in vault before persistence#446
stack72 merged 5 commits intomainfrom
feat/sensitive-output-vault-storage-v2

Conversation

@stack72
Copy link
Contributor

@stack72 stack72 commented Feb 24, 2026

Closes #433
Closes #447

Summary

  • When upstream APIs return sensitive data (e.g., EC2 CreateKeyPair returns KeyMaterial), all fields were previously written as plaintext to .swamp/.data/
  • This change automatically detects fields marked with { sensitive: true } in Zod schema metadata and stores their values in the configured vault, replacing the persisted value with a CEL-compatible vault reference expression
  • Zero impact on existing users — purely additive, no existing behavior changes

What users can now do

Mark individual output fields as sensitive in resource output specs:

resources: {
  result: {
    schema: z.object({
      keyId: z.string(),
      keyMaterial: z.string().meta({ sensitive: true }),
    }),
    lifetime: "infinite",
    garbageCollection: 10,
  },
}

Mark an entire resource spec's output as sensitive:

resources: {
  result: {
    schema: z.object({ ... }),
    sensitiveOutput: true,
    vaultName: "my-vault",  // Optional
    ...
  },
}

Persisted files will contain vault references instead of plaintext:

${{ vault.get('vault-name', 'modelType/modelId/methodName/fieldPath') }}

Plan vs implementation

Area Plan Implementation
Where sensitiveOutput lives ResourceOutputSpec (didn't exist at plan time) ResourceOutputSpec (now exists — per-spec, cleaner than per-method)
Where processing happens Modify createResourceWriter() Inside createResourceWriter() before JSON.stringify(data) — transparent to methods
VaultService factory fromConfig() (sync, reads .swamp.yaml) fromRepository() (already exists on main, async, reads .swamp/vault/)
Mutation safety Not addressed in plan structuredClone() snapshot before mutation prevents cross-contamination
Nested field handling Not addressed Handled via setNestedValue() without creating spurious literal dot-keys

The divergences were driven by the codebase evolving significantly since the plan was written (DataWriter/DataHandle architecture, ResourceOutputSpec, unified data repository).

Architecture benefits

  • Declarative security: Model authors declare sensitivity in the schema via .meta(), not in imperative code. The runtime enforces it inside the DataWriter pipeline.
  • Transparent to methods: Methods call context.writeResource() as normal. Sensitive field processing happens inside the writer — no changes needed in method implementations.
  • Fail-hard on misconfiguration: If sensitive fields exist but no vault is configured, the operation fails with a clear error rather than silently writing plaintext.
  • Vault resolution hierarchy: Field vaultName > spec vaultName > first available vault.
  • CEL-compatible references: Single-quoted string arguments prevent the CEL parser from interpreting slashes in auto-generated keys as division.
  • Snapshot-before-mutation: Deep clone prevents one field's vault ref from leaking into another field's stored value.

Files changed

New:

  • src/domain/models/sensitive_field_extractor.ts — Zod schema walker for { sensitive: true } metadata
  • src/domain/models/sensitive_field_extractor_test.ts — 16 unit tests
  • integration/sensitive_field_vault_test.ts — 3 integration tests

Modified:

  • src/domain/models/model.tssensitiveOutput/vaultName on ResourceOutputSpec, vaultService on MethodContext
  • src/domain/models/data_writer.tsprocessSensitiveResourceData() + injection into createResourceWriter()
  • src/domain/models/data_writer_test.ts — 13 sensitive field unit tests
  • src/domain/models/method_execution_service.ts — pass vaultService/methodName to createResourceWriter()
  • src/cli/commands/model_method_run.ts — create VaultService, pass to context
  • src/domain/workflows/execution_service.ts — same vault wiring
  • design/vaults.md — updated "Sensitive Field Marking" section to reflect implementation

Test plan

  • 16 unit tests for sensitive field extraction (schema walking, nested objects, metadata orderings)
  • 13 unit tests for sensitive resource data processing (vault references, storage, CEL format, nested paths, snapshots, sensitiveOutput)
  • 3 integration tests (end-to-end sensitive field → vault → reference flow)
  • deno check passes
  • deno lint passes
  • deno fmt passes
  • deno run test — 1909 tests, 0 failures

🤖 Generated with Claude Code

@stack72 stack72 marked this pull request as draft February 24, 2026 17:33
github-actions[bot]
github-actions bot previously approved these changes Feb 24, 2026
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: feat: auto-store sensitive output fields in vault before persistence

Summary

This PR implements automatic sensitive field detection and vault storage for resource outputs. The implementation is solid, well-tested, and follows the project's DDD principles.

No Blocking Issues ✅

The code passes all review criteria:

Code Quality & TypeScript Strict Mode:

  • All new code uses proper TypeScript types with no any usage
  • Named exports used consistently throughout
  • AGPLv3 headers present in all new files

Domain-Driven Design Compliance:

  • processSensitiveResourceData() is appropriately a domain service function - it orchestrates operations across the vault service and data transformation
  • extractSensitiveFields() is a pure function that extracts metadata from Zod schemas - follows value object extraction patterns
  • SensitiveFieldInfo is a proper value object (immutable, identified by value)
  • Extensions to ResourceOutputSpec (sensitiveOutput, vaultName) are appropriate - spec types are value objects that declare intent
  • The changes maintain separation between domain logic and infrastructure concerns

Test Coverage:

  • 16 unit tests for sensitive_field_extractor.ts covering schema walking, nested objects, optional/nullable wrappers, metadata orderings
  • 13 unit tests for processSensitiveResourceData() covering vault references, CEL format, nested paths, snapshots, error cases
  • 3 integration tests for end-to-end flow
  • Total: 29 new tests with 0 failures (1909 total tests passing)

Security:

  • ✅ Uses structuredClone() to snapshot values before mutation (prevents cross-contamination)
  • ✅ Fails hard when no vault is configured (no silent plaintext storage)
  • ✅ Single-quoted strings in CEL expressions prevent parser issues with slashes in auto-generated keys
  • ✅ Vault resolution hierarchy is clear: field metadata > spec-level > first available vault

Architecture Benefits:

  • Declarative security model (mark in schema, runtime enforces)
  • Transparent to method implementations (no changes needed in methods)
  • Proper error handling with actionable messages

Suggestions (non-blocking)

  1. Unused interface property (src/domain/models/sensitive_field_extractor.ts:39):
    The vault?: boolean property in SensitiveMetadata interface is declared but never used. The updated design/vaults.md documentation correctly omits this property. Consider removing it to align the code with the documentation:
    interface SensitiveMetadata {
      sensitive?: boolean;
      // vault?: boolean;  // Remove - not used in implementation
      vaultName?: string;
      vaultKey?: string;
    }

Files Reviewed

  • src/domain/models/sensitive_field_extractor.ts (new)
  • src/domain/models/sensitive_field_extractor_test.ts (new)
  • src/domain/models/data_writer.ts (modified)
  • src/domain/models/data_writer_test.ts (modified)
  • src/domain/models/model.ts (modified)
  • src/domain/models/method_execution_service.ts (modified)
  • src/cli/commands/model_method_run.ts (modified)
  • src/domain/workflows/execution_service.ts (modified)
  • integration/sensitive_field_vault_test.ts (new)
  • design/vaults.md (modified)

LGTM 👍

stack72 and others added 5 commits February 24, 2026 18:26
…433)

When upstream APIs return sensitive data (e.g., EC2 CreateKeyPair returns
KeyMaterial), all fields were previously written as plaintext to .swamp/.data/.
This change automatically detects fields marked with `{ sensitive: true }` in
Zod schema metadata and stores their values in the configured vault, replacing
the persisted value with a CEL-compatible vault reference expression.

## What changed

Extension model authors can now mark output fields as sensitive in their
resource output spec schema:

```typescript
resources: {
  result: {
    schema: z.object({
      keyId: z.string(),
      keyMaterial: z.string().meta({ sensitive: true }),
    }),
    lifetime: "infinite",
    garbageCollection: 10,
  },
}
```

Or mark an entire resource spec's output as sensitive:

```typescript
resources: {
  result: {
    schema: z.object({ ... }),
    sensitiveOutput: true,
    ...
  },
}
```

Sensitive values are stored in the vault and replaced with:
`${{ vault.get('vault-name', 'modelType/modelId/methodName/fieldPath') }}`

## Plan vs implementation

The original plan proposed adding sensitiveOutput to a non-existent
ResourceOutputSpec and modifying a createResourceWriter() function that
didn't exist yet. The codebase evolved significantly during implementation:

- **ResourceOutputSpec now exists** with schema, lifetime, garbageCollection,
  and tags. sensitiveOutput and vaultName were added here (per-spec, not
  per-method), which is cleaner since each spec has its own schema.

- **Processing is injected inside createResourceWriter()** before
  JSON.stringify(data). This is the natural interception point since
  writeResource() takes raw data, validates against the schema, and
  serializes. Sensitive field processing happens between validation and
  serialization — transparent to the method author.

- **VaultService.fromRepository()** already exists on main (async, loads from
  .swamp/vault/ directory). No need for the planned fromConfig() method.

- **Snapshot-before-mutation** prevents cross-contamination when multiple
  sensitive fields exist. Values are deep-cloned via structuredClone() before
  any mutation.

- **processSensitiveResourceData()** is exported for direct testing but the
  primary entry point is through createResourceWriter() which calls it
  automatically when vaultService and methodName are provided.

## Architecture benefits

- **Declarative security**: Model authors declare sensitivity in the schema
  via .meta(), not in imperative code. The runtime enforces it automatically
  inside the DataWriter pipeline.

- **Fail-hard on misconfiguration**: If sensitive fields exist but no vault
  is configured, the operation fails with a clear error rather than silently
  writing plaintext.

- **Vault resolution hierarchy**: Field vaultName > spec vaultName > first
  available vault.

- **CEL-compatible references**: Single-quoted string arguments prevent the
  CEL parser from interpreting slashes in auto-generated keys as division.

- **Transparent to methods**: Methods call context.writeResource() as normal.
  Sensitive field processing happens inside the writer pipeline — no changes
  needed in method implementations.

## User impact

Zero impact on existing users. Purely additive — no existing behavior changes.
Models without sensitive metadata behave identically. The only new requirement:
if a model declares sensitive fields but no vault is configured, the method run
will fail with guidance to create a vault.

## Files

New:
- src/domain/models/sensitive_field_extractor.ts — Zod schema walker
- src/domain/models/sensitive_field_extractor_test.ts — 16 unit tests
- integration/sensitive_field_vault_test.ts — 3 integration tests

Modified:
- src/domain/models/model.ts — sensitiveOutput/vaultName on ResourceOutputSpec,
  vaultService on MethodContext
- src/domain/models/data_writer.ts — processSensitiveResourceData() + injection
  into createResourceWriter()
- src/domain/models/data_writer_test.ts — 13 sensitive field unit tests
- src/domain/models/method_execution_service.ts — pass vaultService/methodName
  to createResourceWriter()
- src/cli/commands/model_method_run.ts — create VaultService, pass to context
- src/domain/workflows/execution_service.ts — same vault wiring
- design/vaults.md — updated Sensitive Field Marking section

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ld feature

- Remove unused `vault?: boolean` from SensitiveMetadata interface
- Add sensitive field documentation to swamp-extension-model skill
- Add automatic sensitive field storage section to swamp-vault skill

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
)

Model types like `@user/aws/ec2-keypair` contain `@` and `/` characters
that are rejected by vault key validation (local_encryption maps keys
to filenames). The auto-generated vault key now sanitizes these characters:
`@` is removed, `/` and `\` are replaced with `-`.

Before: `@user/aws/ec2-keypair/{id}/create/KeyMaterial` (rejected)
After:  `user-aws-ec2-keypair-{id}-create-KeyMaterial` (valid)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds logger.info during processSensitiveResourceData so users see which
fields were stored, in which vault, and under what key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@stack72 stack72 force-pushed the feat/sensitive-output-vault-storage-v2 branch from 5df169e to 7f9c4b6 Compare February 24, 2026 18:26
@stack72 stack72 marked this pull request as ready for review February 24, 2026 18:27
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

This PR implements automatic vault storage for sensitive output fields - a well-designed security feature that prevents plaintext secrets from being persisted to disk.

Blocking Issues

None found.

Code Quality Assessment

TypeScript strict mode compliance: ✅

  • All new code uses proper types with no any types
  • The ZodDef interface in sensitive_field_extractor.ts:46-51 properly defines internal Zod structure types

Named exports: ✅

  • All functions and interfaces use named exports (e.g., export function extractSensitiveFields, export interface SensitiveFieldInfo)

License headers: ✅

  • All new .ts files include the AGPLv3 copyright header

Test coverage: ✅

  • 16 unit tests for sensitive field extraction (sensitive_field_extractor_test.ts)
  • 13 unit tests for sensitive data processing (data_writer_test.ts)
  • 3 integration tests for end-to-end flow (integration/sensitive_field_vault_test.ts)
  • Good edge case coverage: nested objects, null/undefined values, custom vaultName/vaultKey, sensitiveOutput flag

DDD Compliance

  • SensitiveFieldInfo is a proper Value Object (immutable, equality by value)
  • extractSensitiveFields, getNestedValue, setNestedValue are pure functions for schema walking
  • processSensitiveResourceData is a Domain Service (stateless, coordinates VaultService and data transformation)
  • No anemic domain models or leaking persistence concerns

Security Review

  • Path traversal prevention: sanitizeVaultKey() properly removes @, /, \, .., and null bytes from vault keys
  • Data integrity: Uses structuredClone() to snapshot data before mutation, preventing cross-contamination between fields
  • Fail-hard security: Throws clear error when sensitive fields exist but no vault is configured - no silent plaintext fallback
  • CEL-compatible references: Uses single-quoted strings in vault expressions to prevent parser issues

Architecture Notes

Good separation of concerns:

  • Schema introspection isolated in sensitive_field_extractor.ts
  • Processing logic in processSensitiveResourceData() within data_writer.ts
  • Transparent to method implementations - no changes needed in existing methods

Suggestions (non-blocking)

  1. Consider adding a test case for deeply nested objects combined with sensitiveOutput: true to verify both schema-marked and spec-level sensitive fields interact correctly at depth > 2.

  2. The vault key format modelType/modelId/methodName/fieldPath is well-documented in the design doc, which is helpful for debugging.

LGTM! Well-structured implementation with comprehensive testing and proper security considerations.

@stack72 stack72 merged commit 8ac4e79 into main Feb 24, 2026
3 of 6 checks passed
@stack72 stack72 deleted the feat/sensitive-output-vault-storage-v2 branch February 24, 2026 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant