Add step-level retry policies #294

Merged
jamescmartinez merged 6 commits into main from step-retries
Feb 13, 2026

Conversation

@jamescmartinez commented Feb 13, 2026

Adds per-step retry policies. Step failure backoff/budget decisions now use the step's own retry policy rather than the workflow's.

Example code:

await step.run(
  {
    name: "charge-card",
    retryPolicy: {
      initialInterval: "1s",
      backoffCoefficient: 2,
      maximumInterval: "30s",
      maximumAttempts: 5,
    },
  },
  async () => {
    await payments.charge();
  },
);
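To make the policy fields above concrete, here is a minimal sketch of how such a policy could translate into retry delays. It assumes conventional exponential backoff (delay = initialInterval × backoffCoefficient^(failedAttempts − 1), capped at maximumInterval) and assumes maximumAttempts counts the first attempt. Both assumptions, along with the names `RetryPolicySketch` and `nextRetryDelayMs` and the use of milliseconds instead of duration strings, are illustrative and not taken from the OpenWorkflow source.

```typescript
// Hypothetical shape mirroring the retryPolicy fields in the example above.
// Durations are plain milliseconds here to keep the sketch self-contained.
interface RetryPolicySketch {
  initialIntervalMs: number;
  backoffCoefficient: number;
  maximumIntervalMs: number;
  maximumAttempts: number;
}

// Returns the delay before the next retry given how many attempts have
// already failed, or null once the attempt budget is used up.
// Assumption: standard capped exponential backoff.
function nextRetryDelayMs(
  policy: RetryPolicySketch,
  failedAttempts: number,
): number | null {
  if (failedAttempts >= policy.maximumAttempts) {
    return null; // budget exhausted, no further retry
  }
  const exponent = Math.max(failedAttempts - 1, 0);
  const uncapped =
    policy.initialIntervalMs * Math.pow(policy.backoffCoefficient, exponent);
  return Math.min(uncapped, policy.maximumIntervalMs);
}

// Same numbers as the "charge-card" example: 1s, coefficient 2, 30s cap, 5 attempts.
const chargeCardPolicy: RetryPolicySketch = {
  initialIntervalMs: 1_000,
  backoffCoefficient: 2,
  maximumIntervalMs: 30_000,
  maximumAttempts: 5,
};

const delays = [1, 2, 3, 4, 5].map((failed) =>
  nextRetryDelayMs(chargeCardPolicy, failed),
);
console.log(delays); // [1000, 2000, 4000, 8000, null]
```

Under these assumptions the 30s cap is never reached with only five attempts; it would start to matter from the sixth failure onward if maximumAttempts were higher.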

codecov bot commented Feb 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Copilot AI left a comment

Pull request overview

Adds step-level retry policies to OpenWorkflow so individual step.run(...) calls can control backoff/attempt limits independently of the workflow-level retry policy, with backend support and documentation updates.

Changes:

  • Introduces step-level retryPolicy override on step.run(...) and tracks retry budgets per stepName during execution.
  • Adds Backend.rescheduleWorkflowRunAfterFailedStepAttempt(...) and implements it for SQLite and Postgres backends.
  • Expands test coverage (worker + execution + backend testsuite) and updates docs/architecture to describe the new behavior.
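The per-step budget tracking described in the first bullet can be pictured with a small sketch. It assumes the run's history is available as a list of step attempts with a name and a status, and counts failures per stepName so that one step's failures never consume another step's budget. The `StepAttemptSketch` type and `countFailedAttemptsByStepName` function are hypothetical names for illustration; the PR's actual helper is `createStepExecutionStateFromAttempts`, whose signature is not shown here.

```typescript
// Hypothetical, simplified view of a recorded step attempt.
interface StepAttemptSketch {
  stepName: string;
  status: "running" | "completed" | "failed";
}

// Counts failed attempts separately for each step name, so every step
// is measured against its own retry budget.
function countFailedAttemptsByStepName(
  attempts: StepAttemptSketch[],
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const attempt of attempts) {
    if (attempt.status !== "failed") continue;
    counts.set(attempt.stepName, (counts.get(attempt.stepName) ?? 0) + 1);
  }
  return counts;
}

const history: StepAttemptSketch[] = [
  { stepName: "charge-card", status: "failed" },
  { stepName: "send-email", status: "failed" },
  { stepName: "charge-card", status: "failed" },
  { stepName: "charge-card", status: "completed" },
];

const failedCounts = countFailedAttemptsByStepName(history);
console.log(failedCounts.get("charge-card")); // 2
console.log(failedCounts.get("send-email")); // 1
```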

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| packages/openworkflow/execution.ts | Implements step retry policy resolution + per-step failed-attempt counting; reschedules runs via new backend method on step failures. |
| packages/openworkflow/backend.ts | Extends Backend interface with rescheduleWorkflowRunAfterFailedStepAttempt params/method. |
| packages/openworkflow/sqlite/backend.ts | Adds SQLite implementation of rescheduleWorkflowRunAfterFailedStepAttempt. |
| packages/openworkflow/postgres/backend.ts | Adds Postgres implementation of rescheduleWorkflowRunAfterFailedStepAttempt. |
| packages/openworkflow/backend.testsuite.ts | Adds tests validating backend reschedule behavior and worker-ownership enforcement. |
| packages/openworkflow/worker.test.ts | Adds worker-level tests for step-default retries, overrides, per-step budget isolation, and lease/sleep interactions. |
| packages/openworkflow/execution.test.ts | Adds unit tests for createStepExecutionStateFromAttempts. |
| packages/docs/docs/steps.mdx | Documents per-step retryPolicy usage in step.run. |
| packages/docs/docs/retries.mdx | Clarifies retry policy shape and separates step vs workflow retry policy descriptions. |
| packages/docs/docs/advanced-patterns.mdx | Updates wording referencing step-level retry configuration. |
| ARCHITECTURE.md | Updates architecture docs to reflect per-step retry budgeting and separation from workflow-level retry policy. |

Comment on lines +405 to +413
if (error instanceof StepError) {
  const serializedError = serializeError(error.originalError);
  const retryDecision = computeFailedWorkflowRunUpdate(
    error.retryPolicy,
    error.stepFailedAttempts,
    workflowRun.deadlineAt,
    serializedError,
    new Date(),
  );
Copilot AI Feb 13, 2026

When multiple steps fail in the same execution slice (e.g., Promise.all), only the first surfaced StepError is used to compute availableAt. This can violate other failed steps' retry policies (retrying too early) because there is a single workflow-level reschedule time but potentially multiple per-step backoff requirements. Consider aggregating failed step retry decisions for the current run and rescheduling to the latest required availableAt (or otherwise defining/encoding a deterministic rule), potentially by persisting each step’s resolved retry policy/config so it can be recomputed from history.
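One possible reading of the aggregation suggested above, as a sketch only: collect a retry decision for every step that failed in the current slice, fail the run if any step has exhausted its budget, and otherwise reschedule at the latest availableAt so no step retries before its own backoff has elapsed. The types and the function `aggregateStepRetryDecisions` are hypothetical, and the "any exhausted step fails the run" rule is an assumption, not behavior confirmed by this PR.

```typescript
// Hypothetical per-step outcome: when the step may retry, or null if its
// retry budget is exhausted.
interface StepRetryDecisionSketch {
  stepName: string;
  availableAt: Date | null;
}

type AggregateDecisionSketch =
  | { status: "failed"; stepName: string }
  | { status: "pending"; availableAt: Date };

// Combines decisions from all steps that failed in one execution slice.
// Assumed rule: an exhausted step fails the run; otherwise wait for the
// latest required retry time across the failed steps.
function aggregateStepRetryDecisions(
  decisions: StepRetryDecisionSketch[],
): AggregateDecisionSketch {
  let latest: Date | null = null;
  for (const decision of decisions) {
    if (decision.availableAt === null) {
      return { status: "failed", stepName: decision.stepName };
    }
    if (latest === null || decision.availableAt.getTime() > latest.getTime()) {
      latest = decision.availableAt;
    }
  }
  if (latest === null) {
    throw new Error("expected at least one failed step decision");
  }
  return { status: "pending", availableAt: latest };
}

const aggregated = aggregateStepRetryDecisions([
  { stepName: "charge-card", availableAt: new Date("2026-02-13T10:00:05Z") },
  { stepName: "send-email", availableAt: new Date("2026-02-13T10:00:30Z") },
]);
console.log(aggregated.status); // "pending"
```

Taking the maximum is the conservative choice: it can delay a step with a short backoff, but it never retries a step earlier than its policy allows.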

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

@jamescmartinez merged commit 21e28ce into main on Feb 13, 2026
8 checks passed