Support tag-filtered data access in CEL expressions #386

@stack72

Problem

CEL data access expressions (the data.latest() function and model.X.resource.Y.Z paths) always return the most recent version of data, with no way to filter by tags. In a multi-environment setup where the same model runs for dev, staging, and prod, the data versions are interleaved:

.swamp/data/command/shell/{model-id}/result/
├── 1/       (dev run,     tags: environment=dev)
├── 2/       (prod run,    tags: environment=prod)
├── 3/       (staging run, tags: environment=staging)
├── 4/       (dev run,     tags: environment=dev)
└── latest -> 4

Any CEL expression referencing this model's data gets version 4 (the most recent dev run), regardless of what environment the consuming workflow or model is targeting:

${{ model["deploy-app"].resource.result.result.attributes.exitCode }}

There is no way to say "give me the latest result where environment=prod".
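The behaviour described above can be modelled with a small TypeScript sketch. The `DataVersion` type and the `latest` helper are hypothetical names used only for illustration, not swamp's storage API; the four entries mirror the directory listing shown earlier.

```typescript
// Hypothetical in-memory model of the version directory; not swamp's actual API.
interface DataVersion {
  version: number;
  tags: Record<string, string>;
}

const versions: DataVersion[] = [
  { version: 1, tags: { environment: "dev" } },
  { version: 2, tags: { environment: "prod" } },
  { version: 3, tags: { environment: "staging" } },
  { version: 4, tags: { environment: "dev" } },
];

// Unfiltered lookup: pick the highest version number and ignore tags entirely.
function latest(all: DataVersion[]): DataVersion | undefined {
  return all.reduce<DataVersion | undefined>(
    (best, v) => (best === undefined || v.version > best.version ? v : best),
    undefined,
  );
}

const current = latest(versions);
console.log(current?.version, current?.tags.environment); // 4 dev
```

A consumer targeting prod still receives version 4 here, because the lookup never consults the tags.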

Why This Matters

Cross-model data references are a fundamental feature of swamp — models can read data produced by other models via CEL expressions. But in multi-environment setups, this is unsafe because there's no guarantee that the "latest" data belongs to the correct environment.

Consider this scenario:

  1. infra-scanner model runs for prod, writes infrastructure state
  2. infra-scanner model runs for dev, writes infrastructure state
  3. deploy-app model references ${{ model["infra-scanner"].resource.result.latest.attributes.vpcId }}
  4. deploy-app gets the dev VPC ID even though it's deploying to prod

This is a silent data correctness issue — no error is raised, the wrong data is simply used.

Proposed Solution

Extend CEL data access functions to accept an optional tag filter:

${{ data.latest("deploy-app", "result", {"environment": "prod"}).attributes.exitCode }}

Or a dedicated function:

${{ data.latestByTag("deploy-app", "result", "environment", "prod").attributes.exitCode }}

This would allow CEL expressions to safely reference data from a specific environment, even when versions from multiple environments are interleaved under the same model.
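One possible reading of the proposed semantics, sketched in TypeScript. Everything here is an assumption made for illustration: the `DataVersion` type, the helper names, the sample `exitCode` values, the choice to require all requested tags to match (AND semantics), and returning `undefined` when no version matches. The actual design may reasonably differ, for example by raising an error on no match.

```typescript
// Hypothetical types and helpers; a sketch of the proposal, not swamp's implementation.
interface DataVersion {
  version: number;
  tags: Record<string, string>;
  attributes: Record<string, unknown>;
}

// True when every requested tag is present with the same value (assumed AND semantics).
function matchesTags(v: DataVersion, filter: Record<string, string>): boolean {
  return Object.entries(filter).every(([key, value]) => v.tags[key] === value);
}

// Highest-numbered version that matches the filter; an empty filter matches everything.
function latestWithTags(
  all: DataVersion[],
  filter: Record<string, string> = {},
): DataVersion | undefined {
  let best: DataVersion | undefined;
  for (const v of all) {
    if (matchesTags(v, filter) && (best === undefined || v.version > best.version)) {
      best = v;
    }
  }
  return best;
}

// The dedicated single-tag form can be a thin wrapper over the map-based form.
function latestByTag(all: DataVersion[], key: string, value: string): DataVersion | undefined {
  return latestWithTags(all, { [key]: value });
}

// Sample data with made-up exit codes, for illustration only.
const results: DataVersion[] = [
  { version: 1, tags: { environment: "dev" }, attributes: { exitCode: 0 } },
  { version: 2, tags: { environment: "prod" }, attributes: { exitCode: 0 } },
  { version: 3, tags: { environment: "staging" }, attributes: { exitCode: 1 } },
  { version: 4, tags: { environment: "dev" }, attributes: { exitCode: 0 } },
];

console.log(latestWithTags(results, { environment: "prod" })?.version); // 2
console.log(latestWithTags(results)?.version); // 4
console.log(latestByTag(results, "environment", "qa")); // undefined
```

The map-based form extends naturally to multiple tags, while the single-tag form is shorter for the common case. Both can share one matching routine, so supporting one does not rule out the other.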

Use Case

A multi-environment deployment pipeline where:

  • An infra-scanner model runs periodically for each environment, writing infrastructure state
  • A deployer model references the scanner's output to get environment-specific configuration (VPC IDs, subnet lists, security groups)
  • The deployer must read the correct environment's infrastructure state, not whichever environment happened to run most recently

Without tag-filtered data access, users must either carefully coordinate run ordering or maintain a separate model instance per environment, which defeats the purpose of reusable, parameterized models.

Metadata

Labels

in-discussion: A feature or issue that is in active discussion
