Add documentation for the jobs failure policy #4676

Open
acroca wants to merge 4 commits into base: v1.16

Conversation

@acroca acroca commented Jun 13, 2025

Thank you for helping make the Dapr documentation better!

Please follow this checklist before submitting:

  • Commits are signed with Developer Certificate of Origin (DCO - learn more)
  • Read the contribution guide
  • Commands include options for Linux, MacOS, and Windows within codetabs
  • New file and folder names are globally unique
  • Page references use shortcodes instead of markdown or URL links
  • Images use HTML style and have alternative text
  • Places where multiple code/command options are given have codetabs

In addition, please fill out the following to help reviewers understand this pull request:

Description

Document the use of FailurePolicy for jobs. This new field has been introduced in this PR

Issue reference

@acroca acroca requested review from a team as code owners June 13, 2025 09:42
@acroca acroca force-pushed the jobs-failure-policy branch from 8ffe3db to 17442bc Compare June 13, 2025 09:56
`failure_policy` specifies how the job should handle failures.

It can be set to `constant` or `drop`.
- The `constant` policy will retry the job up to `max_retries` times, with a delay of `interval` between retries.
Contributor

Please add that if max_retries is not set, it will retry forever.

@@ -37,6 +37,7 @@ Parameter | Description
`dueTime` | An optional time at which the job should be active, or the "one shot" time, if other scheduling type fields are not provided. Accepts a "point in time" string in the format of RFC3339, Go duration string (calculated from creation time), or non-repeating ISO8601.
`repeats` | An optional number of times in which the job should be triggered. If not set, the job runs indefinitely or until expiration.
`ttl` | An optional time to live or expiration of the job. Accepts a "point in time" string in the format of RFC3339, Go duration string (calculated from job creation time), or non-repeating ISO8601.
`failure_policy` | An optional failure policy for the job. Details of the format are below.
Contributor

Add to this line what the default value is when unset.

Please can we add a new table with the full `failure_policy` API definitions, including what the defaults are when fields are unset?

Author

I didn't add a table; instead I used separate bullet points for the two fields of the constant policy. Do you think this is good enough? I find a table might be confusing, given that there are two types of policies and one of them doesn't have any configuration.

acroca commented Jun 18, 2025

@JoshVanL shouldn't this PR target the v1.16 branch instead? If I'm not mistaken, the change will only be available in 1.16

Member

@msfussell msfussell left a comment

A few grammar changes.

@@ -37,6 +37,7 @@ Parameter | Description
`dueTime` | An optional time at which the job should be active, or the "one shot" time, if other scheduling type fields are not provided. Accepts a "point in time" string in the format of RFC3339, Go duration string (calculated from creation time), or non-repeating ISO8601.
`repeats` | An optional number of times in which the job should be triggered. If not set, the job runs indefinitely or until expiration.
`ttl` | An optional time to live or expiration of the job. Accepts a "point in time" string in the format of RFC3339, Go duration string (calculated from job creation time), or non-repeating ISO8601.
`failure_policy` | An optional failure policy for the job. Details of the format are below. If not set, the job will be retried up to 3 times with a delay of 1 second between retries.
Member

Suggested change
`failure_policy` | An optional failure policy for the job. Details of the format are below. If not set, the job will be retried up to 3 times with a delay of 1 second between retries.
`failure_policy` | An optional failure policy for the job. Details of the format are below. If not set, the job is retried up to 3 times with a delay of 1 second between retries.
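
For illustration, a job request body using the field names discussed in this PR might be assembled as below. This is a sketch, not taken from the PR: the nesting of the policy name under `failure_policy`, the helper name `build_job_body`, and the example values are assumptions. Only the field names `dueTime`, `failure_policy`, `constant`, `drop`, `max_retries`, and `interval` come from the text above.

```python
import json


def build_job_body(due_time, policy=None, max_retries=None, interval=None):
    """Build a hypothetical job request body as a dict.

    Assumption: the policy name is a key under `failure_policy` whose value
    holds that policy's options. The PR text does not show the wire format.
    """
    body = {"dueTime": due_time}
    if policy is None:
        # Omitting `failure_policy` leaves the documented default in effect.
        return body
    if policy == "drop":
        # `drop` takes no configuration in this sketch.
        body["failure_policy"] = {"drop": {}}
    elif policy == "constant":
        options = {}
        if max_retries is not None:
            options["max_retries"] = max_retries
        if interval is not None:
            options["interval"] = interval
        body["failure_policy"] = {"constant": options}
    else:
        raise ValueError(f"unknown failure policy: {policy!r}")
    return body


if __name__ == "__main__":
    example = build_job_body("30s", "constant", max_retries=3, interval="10s")
    print(json.dumps(example, indent=2))
```

Leaving `max_retries` or `interval` out of the `constant` options keeps them unset, so the per-field behavior described in the PR (retry indefinitely, retry immediately) would apply.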


It can be set to `constant` or `drop`.
- The `constant` policy will retry the job based on the configuration
- `max_retries` configures how many times the job should be retried. Not setting this will make it retry indefinitely.
Member

Suggested change
- `max_retries` configures how many times the job should be retried. Not setting this will make it retry indefinitely.
- `max_retries` configures how many times the job should be retried. Not setting this makes it retry indefinitely.

`failure_policy` specifies how the job should handle failures.

It can be set to `constant` or `drop`.
- The `constant` policy will retry the job based on the configuration
Member

Suggested change
- The `constant` policy will retry the job based on the configuration
- The `constant` policy retries the job based on the configuration

@msfussell msfussell added this to the 1.16 milestone Jun 26, 2025
@JoshVanL (Contributor) commented:

> @JoshVanL shouldn't this PR target the v1.16 branch instead? If I'm not mistaken, the change will only be available in 1.16

@acroca yep, that's right 🙂

@acroca acroca changed the base branch from v1.15 to v1.16 June 30, 2025 12:49
acroca added 3 commits June 30, 2025 14:58
Signed-off-by: Albert Callarisa <albert@diagrid.io>
…n the constant policy

Signed-off-by: Albert Callarisa <albert@diagrid.io>
Signed-off-by: Albert Callarisa <albert@diagrid.io>
@acroca acroca force-pushed the jobs-failure-policy branch from 17751d3 to 461c37f Compare June 30, 2025 13:00
Member

@msfussell msfussell left a comment

LGTM

Contributor

@alicejgibbons alicejgibbons left a comment

LGTM, with a couple of clarifications.

`failure_policy` specifies how the job should handle failures.

It can be set to `constant` or `drop`.
- The `constant` policy retries the job based on the configuration.
Contributor

Suggested change
- The `constant` policy retries the job based on the configuration.
- The `constant` policy retries the job constantly with the following configuration options.

It can be set to `constant` or `drop`.
- The `constant` policy retries the job based on the configuration.
- `max_retries` configures how many times the job should be retried. Not setting this makes it retry indefinitely.
- `interval` configures the delay between retries. Not setting this makes it retry immediately.
Contributor

Suggested change
- `interval` configures the delay between retries. Not setting this makes it retry immediately.
- `interval` configures the delay between retries. Defaults to retrying immediately. Valid values are of the form `200ms`, `15s`, `2m`, etc.

Contributor

Not sure if all these are valid but this is what we say for other retry policies: https://docs.dapr.io/operations/resiliency/policies/retries/retries-overview/#spec-metadata


It can be set to `constant` or `drop`.
- The `constant` policy retries the job based on the configuration.
- `max_retries` configures how many times the job should be retried. Not setting this makes it retry indefinitely.
Contributor

Suggested change
- `max_retries` configures how many times the job should be retried. Not setting this makes it retry indefinitely.
- `max_retries` configures how many times the job should be retried. Defaults to retrying indefinitely.

Contributor

Not sure whether you can set -1 or 0 here. If so, then I would add something like:

"-1 denotes an unlimited number of retries, while 0 means the request will not be retried."
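
The retry semantics discussed in this thread can be modeled with a small simulation. This is only a sketch of the documented behavior, not Dapr's implementation: the function `run_with_policy` and its parameters are hypothetical, the interval is given in seconds rather than as a duration string, and treating `max_retries=0` as "no retries" is this sketch's choice. Whether `-1` or `0` are accepted is left open above, so negative values are rejected here.

```python
import time


def run_with_policy(job, policy=None, max_retries=None, interval_seconds=0.0,
                    sleep=time.sleep):
    """Run `job` (a callable returning True on success) under a failure policy.

    Returns the number of attempts made. Hypothetical model of the behavior
    described in this PR, not the scheduler's actual code.
    """
    if policy is None:
        # Documented default when `failure_policy` is unset:
        # up to 3 retries with a 1 second delay between them.
        policy, max_retries, interval_seconds = "constant", 3, 1.0
    if policy not in ("constant", "drop"):
        raise ValueError(f"unknown failure policy: {policy!r}")
    if max_retries is not None and max_retries < 0:
        raise ValueError("negative max_retries is not modeled in this sketch")

    attempts = 0
    while True:
        attempts += 1
        if job():
            return attempts
        if policy == "drop":
            # `drop` gives up after the first failure.
            return attempts
        retries_used = attempts - 1
        # An unset max_retries (None) means retry indefinitely.
        if max_retries is not None and retries_used >= max_retries:
            return attempts
        # An unset interval (0 here) means retry immediately.
        if interval_seconds > 0:
            sleep(interval_seconds)
```

Passing a no-op `sleep` (for example `lambda s: None`) makes the model easy to exercise without waiting. Note that an always-failing job with `max_retries` unset loops forever, which mirrors the "retry indefinitely" wording.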

4 participants