Description
There are currently two ways we can handle errors in orchestrations.
- Throw the error
This causes the Functions runtime to register that the orchestration execution failed, which is good for visibility, both in App Insights and for customers.
The downside is that if we throw the error, the replay events are not sent back to the C# extension, so the extension can't replay actions. This leads to a non-determinism error from the Durable Task perspective.
- Send the error as data
This is the approach Durable Python is currently taking. It is a relatively clean solution, but it means that the function execution that encountered the exception is marked as completed rather than failed. This is very confusing for developers, because App Insights will show that the orchestration function succeeded.
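As a rough illustration of the "send the error as data" approach, here is a minimal sketch. The helper `run_user_code` and the field names (`isDone`, `actions`, `output`, `error`) are assumptions made for this example, not the actual Durable Python implementation. The point it shows is that the exception never escapes, so the host only ever sees a normal return value.

```python
import json


def run_user_code(user_code, actions):
    """Run orchestrator code and report any failure as data instead of raising.

    Hypothetical helper and field names, for illustration only.
    """
    try:
        output = user_code()
        state = {"isDone": True, "actions": actions, "output": output, "error": None}
    except Exception as exc:
        # The exception is swallowed and serialized, so the function
        # invocation itself returns normally and looks successful to the host.
        state = {"isDone": False, "actions": actions, "output": None, "error": str(exc)}
    return json.dumps(state)


def failing_orchestrator():
    raise ValueError("activity failed")


# The call returns a JSON string rather than raising.
result = run_user_code(failing_orchestrator, actions=[["call_activity"]])
```

The replay actions survive in the returned payload, but nothing in the invocation result itself signals a failure, which is why the execution shows up as succeeded.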
The way we worked around this in Durable JavaScript is via a relatively hacky approach, but it is the only way that meets all of our criteria.
- Embed replay data inside of the error we throw
We still throw an error, but we wrap it in a custom error that also embeds the replay data as JSON in the error message. This accomplishes both goals: the orchestration execution is marked as failed from the Functions runtime perspective, and the Durable Task side still receives the correct error message and replay data.
You can see the way JS implemented it here: Azure/azure-functions-durable-js#145.
This approach is not perfect. It is fairly fragile, and it makes the exception message that customers see in App Insights far messier. But it is the only approach we can currently take without changing how Functions handles out-of-process errors.
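A minimal sketch of what the "embed replay data in the error" approach could look like in Python. The class name `OrchestrationFailure`, the `MARKER` delimiter, the `extract_state` helper, and the state field names are all hypothetical choices for this example; they are not taken from the JS implementation or from the C# extension.

```python
import json


class OrchestrationFailure(Exception):
    """Hypothetical wrapper error that carries replay state in its message."""

    # Hypothetical delimiter separating the human-readable message from the JSON payload.
    MARKER = "::REPLAY_STATE::"

    def __init__(self, original, state):
        self.original = original
        self.state = state
        # Message layout: <original message><MARKER><state as JSON>
        super().__init__(f"{original}{self.MARKER}{json.dumps(state)}")


def extract_state(message):
    """Recover the embedded state from an error message, or None if absent.

    This stands in for what the receiving side would have to do.
    """
    _, sep, payload = message.partition(OrchestrationFailure.MARKER)
    return json.loads(payload) if sep else None


state = {"isDone": False, "actions": [["call_activity"]], "error": "activity failed"}
try:
    # Raising means the invocation is reported as failed...
    raise OrchestrationFailure(ValueError("activity failed"), state)
except OrchestrationFailure as exc:
    # ...while the replay data can still be parsed back out of the message.
    message = str(exc)
    recovered = extract_state(message)
```

The sketch also shows where the fragility comes from: the parsing depends on the delimiter never appearing in the original error text, and the full JSON payload ends up in the message that users see.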