Skip to content

Refactor error #770

Closed
Closed
@cdavernas

Description

@cdavernas

What would you like to be added?

Refactor errors, like discussed many times over the years 😄

Why is this needed?

To use a well-defined schema that provides means to properly and consistently describe a workflow and/or runtime related error, which can then be caught and optionally acted upon in a declarative way.

What is your proposal?

Use the ProblemDetails (RFC7807) to describe errors in a well-known, specification-based way.

Even though it is initially thought for HTTP APIs, I do not see the harm of using it 'as is', given the fact it provides IMHO all we need to describe workflow/runtime errors, and even provides means for us to define and describe generic SW error types.

Here is an example of an hypothetical error thrown by the first action of the second state of a workflow, as described by RFC7807:

type: 'https://serverlessworkflow.io/problems/types/function-not-defined'
status: 404
title: 'NotFound'
instance: '/states/1/actions/0'
detail: The referenced function 'get-id' is not defined

The type (required) is an uri used to reference errors. It would allow us defining well-known ServerlessWorkflow errors, while preserving the possibility to both users and integrators to define their own. In other words, should it be real or not, absolute or relative, this URI allows referencing in a structured fashion a specific type of error.

The status (required) allows us to further describe the error by providing an integer we can select on in handlers. In RFC, it is used to reference the response's HTTP status code, but we perfectly could choose not to restrict to those (even though they do the job of describing an application error very well IMO), as the spec defines it as type number with no further limitations.

The title (required) is a short, human-readable summary of the problem type. In our case, it's just a nice to have.

The instance (optional) is a an uri that identifies the specific occurrence of the problem. In our case, we could use it to store a reference to the workflow "component" that has produced the error (i.e. JsonPointer is used, and seems a perfect fit for that role IMHO).
For example, in case of an error produced by the fourth actions of the second state of an hypothetical workflow: /states/1/actions/3

The detail (optional) is human-readable explanation specific to an occurrence of the described problem. In our case, it's yet another nice to have.

If we choose to proceed with proposal, we should also remove the top-level errors property, as it would have been made obsolete.
As a matter of fact, the action schema would have to be updated to include a reference to an actual error (instead of referencing the definition of an error).

What I propose is to to something like:

actions:
  - functionRef: my-function
    retryRef: retry-strategy
    nonRetryableErrors:
      - status: 404 #we retry explicitly if status is 404
      - status: 503 #we retry explicitly if status is 503...
        type: https://fake.runtime/errors/runner-not-ready #... and has specified type
      - status: ${ . == 401 or . == 403 } #we retry if status is 401 or 403...
        instance: '/states/1/*' #it would be cool to support wildcard for those cases #... and has been produced by anything belonging to the second state
     - condition: ${ .status == 500 } #we retry if condition matches the caught error

Metadata

Metadata

Assignees

Labels

area: specChanges in the Specificationchange: featureNew feature or request. Impacts in a minor version change

Type

No type

Projects

Status

Done

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions