Record more started time/errors for retried activities #1873
While using Cadence, we found an activity that failed to schedule on every machine; the final attempt failed with a timeout. Recording the failures from the earlier attempts would help debug the issue.
Hi all, is there any progress on this issue? We're looking at adopting Cadence, in part as a potential replacement for our existing background workers. Currently, if an activity fails and is retried, there is no way to see the error message that caused the failure, which makes it difficult to debug. This feature comes out of the box with traditional job workers like Sidekiq or Celery.
Exposes the lastFailureReason and lastFailureDetails, which exist in the mutable state, through the ActivityTaskStartedEvent in the history. The fields were added to ActivityTaskStartedEvent as part of uber/cadence-idl#15.

Resolves: #1873
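To illustrate the idea in the description above, here is a minimal sketch of copying the last failure info from mutable state into the started event. The types `activityInfo` and `startedEventAttributes` and the function `newStartedEventAttributes` are hypothetical stand-ins written for this example; they are not the actual Cadence or cadence-idl definitions, and only the field names lastFailureReason and lastFailureDetails come from the description.

```go
package main

// activityInfo is a hypothetical stand-in for the per-activity entry
// kept in mutable state. It is not the real Cadence type.
type activityInfo struct {
	Attempt            int32
	LastFailureReason  string
	LastFailureDetails []byte
}

// startedEventAttributes is a hypothetical stand-in for the attributes
// carried by an ActivityTaskStartedEvent. It is not the real IDL type.
type startedEventAttributes struct {
	Attempt            int32
	LastFailureReason  *string
	LastFailureDetails []byte
}

// newStartedEventAttributes copies the last failure info, if any, from
// the mutable-state entry into the attributes of the started event.
func newStartedEventAttributes(ai activityInfo) startedEventAttributes {
	attrs := startedEventAttributes{Attempt: ai.Attempt}
	if ai.LastFailureReason != "" {
		reason := ai.LastFailureReason
		attrs.LastFailureReason = &reason
	}
	if len(ai.LastFailureDetails) > 0 {
		// Copy the bytes so the event does not alias mutable state.
		attrs.LastFailureDetails = append([]byte(nil), ai.LastFailureDetails...)
	}
	return attrs
}
```

In this sketch a first attempt, which has no previous failure, produces an event with both failure fields left unset.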
* Record error info for retried activities (#1873)

  Exposes the lastFailureReason and lastFailureDetails, which exist in the mutable state, through the ActivityTaskStartedEvent in the history. The fields were added to ActivityTaskStartedEvent as part of uber/cadence-idl#15

* Add tests to verify error info in retried activities
Right now we only record the last started time when writing the started event. This is confusing for customers and even for us.
We don't have to record all of the attempts, but it would be nice to have a limit, for example recording at most 5 of them.