Classify runtime errors across workflow boundaries by daryllimyt · Pull Request #2709 · TracecatHQ/tracecat

daryllimyt · 2026-05-16T22:00:05Z

Checklist

Read CONTRIBUTING.md.
PR title is short and non-generic (see previously merged PRs for examples).
PR only implements a single feature or fixes a single bug.
Tests passing (uv run pytest tests)?
Lint / pre-commits passing (pre-commit run --all-files)?

Description

This PR adds first-class runtime error classification for workflow execution and carries that classification across Temporal boundaries.

It introduces runtime error envelopes for user, platform, and infra failures, adds Temporal adapters for activity and workflow-originated failures, and wires those envelopes into DSL action execution, scheduler failures, trigger-input normalization, registry lock resolution, agent setup, tier/workspace activities, and related activity boundaries.

The PR also adds DSLWorkflowV2 behind TRACECAT__FEATURE_FLAGS=DSL_WORKFLOW_V2 so new executions can target the v2 workflow type while existing histories stay on the current workflow. Worker registration, workflow start paths, schedule update/start handling, generated frontend types, and event/history parsing are updated for both workflow types.

The mental model implemented here is:

- User errors: caused by user-authored config, permissions, input, or user code.
- Platform errors: caused by Tracecat orchestration/runtime invariants.
- Infra errors: caused by backing services, storage, networking, or OS/resource failures.
- Activity boundaries classify newly raised failures with ActivityRuntimeError.
- Workflow-originated failures classify with WorkflowRuntimeError.
- Workflow wrappers translating an activity failure preserve the activity's runtime envelope instead of reclassifying it.
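The mental model above can be sketched in a few lines of plain Python. This is a simplified illustration, not the code in `tracecat.runtime.errors` or `tracecat.temporal.errors`: the names `ErrorKind`, `RuntimeErrorEnvelope`, `EnvelopedError`, `classify`, `at_activity_boundary`, and `at_workflow_wrapper` are hypothetical, the envelope fields are reduced to kind, origin, and message, and the mapping from exception types to kinds is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorKind(str, Enum):
    USER = "user"
    PLATFORM = "platform"
    INFRA = "infra"


@dataclass(frozen=True)
class RuntimeErrorEnvelope:
    kind: ErrorKind
    origin: str  # "activity" or "workflow"
    message: str


class EnvelopedError(Exception):
    """Hypothetical stand-in for ActivityRuntimeError / WorkflowRuntimeError."""

    def __init__(self, envelope: RuntimeErrorEnvelope) -> None:
        super().__init__(envelope.message)
        self.envelope = envelope


def classify(exc: BaseException) -> ErrorKind:
    """Assumed mapping: bad input is user, I/O failures are infra, the rest platform."""
    # PermissionError is an OSError subclass, so user errors are checked first.
    if isinstance(exc, (ValueError, PermissionError, KeyError)):
        return ErrorKind.USER
    if isinstance(exc, OSError):  # includes ConnectionError and TimeoutError
        return ErrorKind.INFRA
    return ErrorKind.PLATFORM


def at_activity_boundary(exc: BaseException) -> EnvelopedError:
    """Classify a newly raised failure with origin 'activity'."""
    if isinstance(exc, EnvelopedError):
        return exc
    return EnvelopedError(RuntimeErrorEnvelope(classify(exc), "activity", str(exc)))


def at_workflow_wrapper(exc: BaseException) -> EnvelopedError:
    """Preserve an existing envelope; classify only workflow-originated failures."""
    if isinstance(exc, EnvelopedError):
        return exc  # keep the activity's classification, do not reclassify
    return EnvelopedError(RuntimeErrorEnvelope(classify(exc), "workflow", str(exc)))
```

The key design point is the early return in `at_workflow_wrapper`: an error that already carries an envelope passes through unchanged, so the kind and origin assigned at the activity boundary survive the workflow layer.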

Related Issues

N/A

Screenshots / Recordings

N/A

Steps to QA

Focused verification run locally:

uv run pytest tests/unit/test_materialize_context.py tests/unit/test_agent_preset_activities.py tests/unit/test_agent_activities.py tests/unit/test_workflow_definitions_activities.py tests/unit/test_tier_activities.py tests/unit/test_workspace_org_resolution.py tests/unit/test_registry_sync_workflow.py tests/unit/test_executor_activities.py
uv run pytest tests/unit/test_dsl_workflow_error_unwrap.py tests/unit/test_materialize_context.py
uv run basedpyright tracecat/temporal/errors.py tracecat/dsl/workflow.py tests/unit/test_dsl_workflow_error_unwrap.py
uv run ruff check tracecat/temporal/errors.py tracecat/dsl/workflow.py tests/unit/test_dsl_workflow_error_unwrap.py
uv run ruff format --check tracecat/temporal/errors.py tracecat/dsl/workflow.py tests/unit/test_dsl_workflow_error_unwrap.py

Commit-time hooks also passed for the committed changes.

Summary by cubic

Classifies runtime errors end-to-end and preserves their kind across activities, the scheduler, and workflows to improve error clarity and retries. Adds feature-flagged DSLWorkflowV2, consolidates error details into a single wrapper, and standardizes activity/workflow error boundaries.

New Features
- Added runtime error envelopes (kind/origin/phase) in tracecat.runtime.errors and Temporal helpers in tracecat.temporal.errors (ActivityRuntimeError, WorkflowRuntimeError, TemporalErrorDetails, extract helpers).
- Consolidated ApplicationError details into TemporalErrorDetails.v1 with payloads and a per-ref runtime_errors map; malformed details are ignored.
- Standardized error boundaries via tracecat.temporal.activity_errors and tracecat.dsl.activity_errors to classify user/platform/infra errors and attach envelopes across storage, tiers/workspaces, workflow management/schedules, agents/sessions/presets, interactions.
- Propagated envelopes across activity/workflow boundaries and into scheduler task exceptions; wrappers preserve activity classification; executor uses tracecat.executor.errors.ActionRuntimeError and retries infra errors by default.
- Introduced DSLWorkflowV2 behind dsl-workflow-v2; workers register both; new starts/schedules route via dsl_workflow_run_method_for_new_execution; history/event parsing accepts both via is_dsl_workflow_type_name; frontend flag enums updated.
Bug Fixes
- Preserved materialization retry semantics; honored workflow-scoped runtime errors for retries/failure; preserved scheduler error payloads and runtime envelopes in task exceptions.
- Classified common failures as non-retryable user/platform errors (registry sync validation, workspace org missing, invalid concurrency caps, agent/preset/model/session lookups, interaction/session not found).
- Replaced legacy temporal.exceptions.UserError with typed runtime errors; webhooks now use a generic workflow handle and return StoredObject.
- CI: capped pytest xdist workers to 15 to avoid Redis DB collisions; simplified a one-off scheduler error message.
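
The "malformed details are ignored" behavior mentioned in the summary can be illustrated with a defensive parser over plain dictionaries. This is only a sketch: the wire shape used here (a `version` key equal to `"v1"` and a `runtime_errors` mapping from ref to an object with a `kind` field) is an assumption, and `extract_runtime_errors` is a hypothetical name, not one of the PR's extract helpers.

```python
from typing import Any

VALID_KINDS = {"user", "platform", "infra"}


def extract_runtime_errors(details: Any) -> dict[str, dict[str, str]]:
    """Return the per-ref runtime error map, or {} when details are malformed."""
    # Assumed wrapper shape: {"version": "v1", "runtime_errors": {ref: {"kind": ...}}}
    if not isinstance(details, dict) or details.get("version") != "v1":
        return {}
    runtime_errors = details.get("runtime_errors")
    if not isinstance(runtime_errors, dict):
        return {}
    result: dict[str, dict[str, str]] = {}
    for ref, envelope in runtime_errors.items():
        if not isinstance(ref, str) or not isinstance(envelope, dict):
            continue  # skip malformed entries instead of raising
        kind = envelope.get("kind")
        if isinstance(kind, str) and kind in VALID_KINDS:
            result[ref] = {"kind": kind}
    return result
```

Returning an empty map rather than raising keeps error handling itself from failing on unexpected payloads, at the cost of silently dropping entries that do not validate.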

Written for commit 0b33755. Summary will update on new commits.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4117a02525

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

cubic-dev-ai

4 issues found across 40 files

Confidence score: 3/5

There is a concrete regression risk in tracecat/dsl/workflow.py: runtime-only ApplicationError details no longer match the error-handler parser’s expected ActionErrorInfo map shape, which can break handler dispatch and obscure the original workflow failure.
tracecat/runtime/errors.py may miss implicitly chained infra exceptions unless __context__ is traversed, creating medium risk of misclassification and harder diagnosis during failures.
packages/tracecat-ee/tracecat_ee/agent/workflows/durable.py and tracecat/dsl/action.py introduce behavior changes that can amplify failure impact (retrying deterministic validation via activity:fail_slow, and bypassing ActivityRuntimeError wrapping when storage init fails), so this is mergeable but with notable runtime-risk areas.
Pay close attention to tracecat/dsl/workflow.py, tracecat/runtime/errors.py, packages/tracecat-ee/tracecat_ee/agent/workflows/durable.py, tracecat/dsl/action.py - error-path compatibility and classification need validation to avoid masked failures and incorrect retries.
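
The storage-initialization risk noted above comes down to where the `try` block starts. The sketch below shows the general pattern of acquiring the backend inside the classification boundary so that initialization failures are wrapped too. It does not reproduce `tracecat/dsl/action.py`: `ActivityRuntimeErrorSketch`, `get_object_storage_stub`, and `store_result` are hypothetical stand-ins, and mapping `OSError` to an infra error is an assumption.

```python
class ActivityRuntimeErrorSketch(Exception):
    """Hypothetical stand-in for a classified activity error."""

    def __init__(self, kind: str, message: str) -> None:
        super().__init__(message)
        self.kind = kind


def get_object_storage_stub(fail: bool) -> dict:
    """Pretend backend factory that can fail during initialization."""
    if fail:
        raise ConnectionError("storage backend unavailable")
    return {}


def store_result(key: str, value: object, *, fail_init: bool = False) -> dict:
    try:
        # Initialization happens inside the boundary, so its failures
        # are classified instead of escaping as raw exceptions.
        storage = get_object_storage_stub(fail_init)
        storage[key] = value
        return storage
    except OSError as exc:  # ConnectionError is an OSError subclass
        raise ActivityRuntimeErrorSketch("infra", str(exc)) from exc
```

If the factory call sat above the `try`, the `ConnectionError` would propagate unwrapped and the caller would lose the infra classification.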

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="tracecat/dsl/workflow.py">

<violation number="1" location="tracecat/dsl/workflow.py:579">
P1: New runtime-only ApplicationError details are incompatible with the existing error-handler parsing logic, which expects ActionErrorInfo maps. This can break error-handler workflow dispatch and mask the original workflow failure.</violation>
</file>

<file name="tracecat/runtime/errors.py">

<violation number="1" location="tracecat/runtime/errors.py:175">
P2: Include `__context__` in exception-chain traversal; otherwise implicitly chained infra exceptions can be missed and misclassified.</violation>
</file>

<file name="packages/tracecat-ee/tracecat_ee/agent/workflows/durable.py">

<violation number="1" location="packages/tracecat-ee/tracecat_ee/agent/workflows/durable.py:484">
P2: Using `activity:fail_slow` here causes deterministic subagent validation failures to retry multiple times instead of failing immediately.</violation>
</file>

<file name="tracecat/dsl/action.py">

<violation number="1" location="tracecat/dsl/action.py:744">
P2: `get_object_storage()` is outside the error-classification try block, so backend initialization failures bypass `ActivityRuntimeError` wrapping and lose runtime error classification.</violation>
</file>
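
To make the `__context__` point in the `tracecat/runtime/errors.py` violation concrete, here is a minimal sketch of an exception-chain walk that follows both explicit (`__cause__`) and implicit (`__context__`) links. The helper names `iter_exception_chain` and `has_infra_cause` are hypothetical, and treating `ConnectionError` and `TimeoutError` as infra signals is an assumption for illustration.

```python
from collections.abc import Iterator


def iter_exception_chain(exc: BaseException) -> Iterator[BaseException]:
    """Yield exc and every exception reachable via __cause__ or __context__."""
    seen: set[int] = set()
    stack: list[BaseException | None] = [exc]
    while stack:
        current = stack.pop()
        if current is None or id(current) in seen:
            continue  # guard against missing links and cycles
        seen.add(id(current))
        yield current
        stack.append(current.__context__)  # implicit chaining (raise inside except)
        stack.append(current.__cause__)  # explicit chaining (raise ... from ...)


def has_infra_cause(exc: BaseException) -> bool:
    """True if any exception in the chain looks like an infra failure."""
    return any(
        isinstance(e, (ConnectionError, TimeoutError))
        for e in iter_exception_chain(exc)
    )
```

An exception raised inside an `except` block without `from` has `__cause__` set to `None` and only `__context__` populated, which is why a walk over `__cause__` alone can miss the underlying infra failure.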


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e41beddf65


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf698c9cb2


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7d26370ccd


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2bea4ff8e8


zeropath-ai · 2026-05-16T23:24:09Z

✅ No security or compliance issues detected. Reviewed everything up to 0b33755.

Security Overview

🔎 Scanned files: 50 changed file(s)
🔗 Scan Link: https://zeropath.com/app/repositories/00dffd6c-8834-4dc9-b6d8-b44cd1622986?scanId=80b99201-6270-40a2-bc59-494f4dfa0760&codeScanTypes=PrScan&tab=issues

Detected Code Changes

The diff is too large to display a summary of code changes.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5eb46529b


cubic-dev-ai · 2026-05-21T03:10:06Z

You're iterating quickly on this pull request. To help protect your rate limits, cubic has paused automatic reviews on new pushes for now—when you're ready for another review, comment @cubic-dev-ai review.

daryllimyt added 7 commits May 15, 2026 12:37

feat: add runtime error envelopes and workflow v2

fac62e1

fix: preserve materialization error retry semantics

de6e6f1

fix: make agent config failures retry-aware

1fb6771

fix: preserve scheduler activity failures

85c6609

refactor: classify activity runtime errors

e18fe52

fix: preserve activity runtime classification in workflow wrapper

5a77bb1

fix: classify workflow-originated trigger errors

4117a02


chatgpt-codex-connector Bot reviewed May 16, 2026


Comment thread tracecat/temporal/errors.py Outdated

cubic-dev-ai Bot reviewed May 16, 2026


Comment thread tracecat/dsl/workflow.py

Comment thread tracecat/runtime/errors.py Outdated

Comment thread packages/tracecat-ee/tracecat_ee/agent/workflows/durable.py

Comment thread tracecat/dsl/action.py Outdated

daryllimyt added 2 commits May 16, 2026 18:13

fix: preserve scheduler error payloads

e41bedd

fix: classify runtime metadata boundaries

2038df5

chatgpt-codex-connector Bot reviewed May 16, 2026


Comment thread tracecat/dsl/scheduler.py

daryllimyt added 2 commits May 16, 2026 18:23

fix: honor workflow-affecting runtime errors

cf698c9

ci: cap python test xdist workers

d38b15d

chatgpt-codex-connector Bot reviewed May 16, 2026


Comment thread tracecat/dsl/workflow.py Outdated

daryllimyt added 2 commits May 16, 2026 18:34

test: expect typed workflow definition errors

666e275

fix: avoid runtime error detail key collisions

7d26370

chatgpt-codex-connector Bot reviewed May 16, 2026


Comment thread tracecat/executor/errors.py Outdated

fix: retry executor infra errors by default

2bea4ff

chatgpt-codex-connector Bot reviewed May 16, 2026


Comment thread tracecat/dsl/scheduler.py Outdated

fix: validate runtime error detail wrappers

60f661a

daryllimyt added 2 commits May 19, 2026 12:59

fix: update return type for workflow handle to StoredObject

22ca3c8

Use typed errors for trigger input normalization

b5eb465

chatgpt-codex-connector Bot reviewed May 19, 2026


Comment thread tracecat/temporal/errors.py Outdated

daryllimyt added 3 commits May 20, 2026 14:28

Simplify runtime error detail handling

787de8b

Clean up runtime error boundaries

08658a6

Consolidate runtime activity error boundaries

912a46d

Inline single-use scheduler error message

0b33755
