Skip to content

Propagate durable activity failures to orchestrator function #571

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Dec 22, 2020

Conversation

AnatoliB
Copy link
Contributor

@AnatoliB AnatoliB commented Dec 19, 2020

Resolves #572

There are two problems to solve here:

  1. Make Invoke-ActivityFunction report an error on activity failure. This part is simple: TaskFailed events are recorded in the orchestration history, so we just make Invoke-ActivityFunction write an error (ActivityFailureException) to the Error stream. By default, this is treated as a non-terminating error in PowerShell, but the orchestrator function author can use either $ErrorActionPreference = 'Stop' or -ErrorAction Stop to convert it to an exception, if desired.

  2. Unfortunately, completing (1) is not enough: throwing this exception from the orchestrator function leads to errors complaining about non-deterministic orchestrator behavior because of certain implementation quirks affecting out-of-proc language workers. This PR follows the approach taken by the JavaScript Durable implementation (Fix nondeterminism on Activity/Suborchestration failures azure-functions-durable-js#145): the orchestration replay state is embedded into the exception message after the $OutOfProcData$ marker.

@AnatoliB AnatoliB changed the title Propagate durable action failures Propagate durable activity failures to orchestrator function Dec 19, 2020
@AnatoliB AnatoliB marked this pull request as ready for review December 21, 2020 06:02
Copy link
Contributor

@Francisco-Gamino Francisco-Gamino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor comments. Other than that, LGTM.

// - https://github.com/Azure/azure-functions-durable-js/pull/145
// - https://github.com/Azure/azure-functions-durable-extension/pull/1171
var orchestrationMessage = new OrchestrationMessage(isDone: false, new List<List<OrchestrationAction>> { actions }, output: null, exception.Message);
var message = $"{exception.Message}{OutOfProcDataLabel}{JsonConvert.SerializeObject(orchestrationMessage)}";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is value orchestrationMessage always a valid JSON?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note we are converting to JSON, not from JSON. orchestrationMessage is an object, and the result of this conversion is expected to be a valid JSON (assuming Newtonsoft.Json is correct).

Copy link
Contributor

@Francisco-Gamino Francisco-Gamino Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct Anatoli, I missed read it. My comment does not apply. Resolving.

{
case HttpStatusCode.Accepted:
{
if (DateTime.UtcNow > startTime + orchestrationCompletionTimeout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Extra tab before the if statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed

var thrownException = Assert.Throws<OrchestrationFailureException>(() => InvokeOrchestration(completed: true, output));
Assert.Same(originalException, thrownException.InnerException);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a test case that validates that the FullyQualifiedErrorId contains Functions.Durable.ActivityFailure? I can see this becoming a popular search online later on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. We cannot validate this right here because FullyQualifiedErrorId does not belong to the exception. It's the other way around: the exception is wrapped into an ErrorRecord, and this ErrorRecord contains FullyQualifiedErrorId... But I agree we should emphasize this because this errorId will become a part of the public contract that we shouldn't break in the future. I think this check should go to DurableActivityErrorHandler unit tests, I'll do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Thanks Anatoli.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Durable Function activity failures are difficult to handle
2 participants