Description
Context
While designing the vary dimensions feature (#384), we identified a tension between the two CEL data access patterns that makes vary harder to implement cleanly.
The Two Access Patterns Today
| Pattern | Sees current-run data? | Sees prior-run data? | Syntax |
|---|---|---|---|
| model["name"].resource.spec.data | Yes, updated in-memory after each step | Yes | Dictionary traversal |
| data.latest("name", "data") | No, snapshot from workflow start | Yes | Function call |
Both exist intentionally:
- model.* is mutable/live: it enables step-to-step chaining within a workflow run
- data.latest() is an immutable snapshot: it provides consistency and isolation for cross-run access
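To make the difference concrete, here is a minimal sketch of the two semantics. The class, its method names, and the "modelName/dataName" key format are all hypothetical and only model the behavior described above; they are not taken from the codebase.

```typescript
// Hypothetical shapes; the real types in the codebase may differ.
type Attributes = Record<string, unknown>;
type DataStore = Map<string, Attributes>; // keyed by "modelName/dataName" (assumed)

const keyFor = (modelName: string, dataName: string): string =>
  `${modelName}/${dataName}`;

class WorkflowContextSketch {
  // Live view, mutated after every step (stands in for model.*).
  private readonly live: DataStore;
  // Copy taken at workflow start and never updated (stands in for data.latest()).
  private readonly snapshot: DataStore;

  constructor(priorRunData: DataStore) {
    this.live = new Map(priorRunData);
    this.snapshot = new Map(priorRunData);
  }

  // Called when a step finishes; only the live view changes.
  completeStep(modelName: string, dataName: string, attributes: Attributes): void {
    this.live.set(keyFor(modelName, dataName), attributes);
  }

  // model.*-style lookup: sees current-run and prior-run data.
  modelLookup(modelName: string, dataName: string): Attributes | undefined {
    return this.live.get(keyFor(modelName, dataName));
  }

  // data.latest()-style lookup: sees only what existed at workflow start.
  dataLatest(modelName: string, dataName: string): Attributes | undefined {
    return this.snapshot.get(keyFor(modelName, dataName));
  }
}
```

In this sketch, after a step writes a new result for "deploy-app", modelLookup returns the new attributes while dataLatest keeps returning the prior run's values.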
The Problem with Vary
The vary feature (#384) needs a way to dynamically resolve the composite dataName from vary dimensions. A function call can do this naturally:
${{ data.latest("deploy-app", "result", [inputs.environment]).attributes.exitCode }}
But model.* is dictionary traversal — there's no function to extend. Within a workflow, you'd be stuck with manual string concatenation:
${{ model["deploy-app"].resource["result"]["result-" + inputs.environment].attributes.exitCode }}
This is exactly the manual name construction that vary is designed to eliminate.
The fundamental tension: the access pattern that benefits from vary (data.latest()) doesn't see current-run data, and the one that sees current-run data (model.*) can't benefit from vary.
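A function-call accessor can hide the name construction behind a single helper. The sketch below assumes the composite name is the base name followed by each vary value joined with "-", which matches the "result-" + inputs.environment example above; the separator, the ordering for multiple dimensions, the store layout, and both function names are assumptions for illustration.

```typescript
// Hypothetical record shape, mirroring the .attributes access in the examples above.
type DataRecord = { attributes: Record<string, unknown> };

// Assumed rule: base name plus each vary value, joined with "-".
function compositeDataName(baseName: string, varyValues: readonly string[] = []): string {
  return [baseName, ...varyValues].join("-");
}

// Hypothetical stand-in for data.latest(model, name, varyValues).
function latestSketch(
  store: ReadonlyMap<string, DataRecord>,
  modelName: string,
  baseName: string,
  varyValues: readonly string[] = [],
): DataRecord | undefined {
  return store.get(`${modelName}/${compositeDataName(baseName, varyValues)}`);
}
```

Dictionary traversal has no equivalent hook: the caller has to build the "result-production" key by hand at every call site.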
Proposal: Make data.latest() see current-run data
If data.latest() were updated after each workflow step (same as the model context is today), it could serve as the single accessor that supports both vary and current-run data:
# Within a workflow — sees current-run data, with vary support
${{ data.latest("deploy-app", "result", [inputs.environment]).attributes.exitCode }}
# Cross-workflow — sees historical data, with vary support
${{ data.latest("deploy-app", "result", [inputs.environment]).attributes.exitCode }}
# Same syntax works in both cases
The model.* dot-notation would still exist for simple cases where vary isn't needed:
# Simple access without vary — still works
${{ model["deploy-app"].resource.result.result.attributes.exitCode }}
What this means for implementation
- After each workflow step completes, update the data cache (same as the model context update that already happens in execution_service.ts)
- data.latest() shifts from a pure snapshot to a live view within workflow execution
- Outside of workflows (e.g., swamp model evaluate), behavior stays the same: reads from disk
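The implementation points above could look roughly like the following. This is a sketch under stated assumptions, not the code in execution_service.ts: the cache class, the StepOutput shape, the key format, and the synchronous step runner are all hypothetical.

```typescript
// Hypothetical shape of what a step produces.
type StepOutput = {
  modelName: string;
  dataName: string;
  attributes: Record<string, unknown>;
};

class LiveDataCacheSketch {
  private readonly entries = new Map<string, Record<string, unknown>>();

  // Seeded with whatever was read from disk at workflow start.
  constructor(loadedFromDisk: readonly StepOutput[] = []) {
    for (const output of loadedFromDisk) {
      this.record(output);
    }
  }

  // Intended to run at the same point where the model context is refreshed.
  record(output: StepOutput): void {
    this.entries.set(`${output.modelName}/${output.dataName}`, output.attributes);
  }

  // Returns the most recent entry, whether from this run or a prior one.
  latest(modelName: string, dataName: string): Record<string, unknown> | undefined {
    return this.entries.get(`${modelName}/${dataName}`);
  }
}

type StepSketch = (cache: LiveDataCacheSketch) => StepOutput;

// Runs steps in order and publishes each result before the next step starts.
function runWorkflowSketch(steps: readonly StepSketch[], cache: LiveDataCacheSketch): void {
  for (const step of steps) {
    const output = step(cache);
    cache.record(output);
  }
}
```

With this shape, a later step that calls cache.latest() sees the output of an earlier step in the same run, which is the behavior change the proposal describes.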
Trade-offs to discuss
For:
- Single accessor that supports both vary and current-run data
- No manual string concatenation for vary dimensions
- data.latest() becomes more useful within workflows
- Consistent behavior: data.latest() always returns the most recent data, whether from this run or a prior run
Against:
- Changes the snapshot isolation guarantee of data.latest() within workflows
- Could cause surprises if users expect data.latest() to be stable throughout a workflow run
- Adds complexity to the data cache (needs to support mid-run updates)
- Two concurrent workflow runs could see each other's intermediate state if they share the same data cache (though this is already true for model.*)
Questions for Discussion
- Is the snapshot isolation of data.latest() something users depend on, or is it an implementation detail?
- Should the live behavior be opt-in (e.g., a separate function like data.current()) rather than changing data.latest()?
- Are there other approaches we haven't considered?
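As one input to the opt-in question, here is a rough sketch of how a separate data.current() could sit next to an unchanged data.latest(). It uses a per-run overlay on top of a shared snapshot, which would also keep concurrent runs from seeing each other's intermediate state. Everything here (class name, method names, key format) is hypothetical and only illustrates the idea.

```typescript
type RunAttributes = Record<string, unknown>;

class RunScopedDataSketch {
  // Shared, read-only view loaded at workflow start.
  private readonly snapshot: ReadonlyMap<string, RunAttributes>;
  // Writes made by this run only; never shared with other runs.
  private readonly overlay = new Map<string, RunAttributes>();

  constructor(snapshot: ReadonlyMap<string, RunAttributes>) {
    this.snapshot = snapshot;
  }

  // Called after each step in this run completes.
  recordStep(key: string, attributes: RunAttributes): void {
    this.overlay.set(key, attributes);
  }

  // data.latest(): unchanged, stable for the whole run.
  latest(key: string): RunAttributes | undefined {
    return this.snapshot.get(key);
  }

  // data.current(): opt-in live view, this run's writes first, then the snapshot.
  current(key: string): RunAttributes | undefined {
    return this.overlay.get(key) ?? this.snapshot.get(key);
  }
}
```

The cost of this variant is a second accessor to document and explain, which works against the goal of a single accessor listed under "For".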