-
Notifications
You must be signed in to change notification settings - Fork 79
Add history events for NexusCancelRequest[Completed|Failed] #564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| NexusOperationCancelRequestCompletedEventAttributes nexus_operation_cancel_request_completed_event_attributes = 61; | ||
| NexusOperationCancelRequestFailedEventAttributes nexus_operation_cancel_request_failed_event_attributes = 62; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will these be marked as worker_may_ignore? I am curious about the backwards-compatible properties from an SDK POV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it will need to be
| NexusOperationCancelRequestCompletedEventAttributes nexus_operation_cancel_request_completed_event_attributes = 61; | ||
| NexusOperationCancelRequestFailedEventAttributes nexus_operation_cancel_request_failed_event_attributes = 62; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify what happens on current server version on Nexus cancel requested? Will any changes we make server side still work with older SDKs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we record a NEXUS_OPERATION_CANCEL_REQUESTED event when we receive the command. Then we actually send the request to the handler. We do not record the operation as cancelled until the completion callback is invoked with a cancelled error. None of this behavior will change. We will just start recording a history event after the handler responds to the cancel request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, so this means that we are effectively just sending more events that we usually would for the same situation, and if we properly mark as worker_may_ignore, we're good.
My only concern now is from the SDK side, how is an SDK that is supposed to wait for this event supposed to know it is never coming on an older server? So at least for Core-based SDKs, the default cancel type for activities is "try cancel" (i.e. fire and forget) and the default cancel type for child workflows is "wait cancel completed". I think Nexus operations are going to have to be "try cancel" by default to make "wait cancel completed" an opt-in behavior since it is not supported on all GA Nexus servers. In Go and Java, I think "try cancel" is effectively the default for activities, and "wait cancel requested" or "try cancel" is for child workflows, but unsure exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently SDKs are doing "wait cancel completed" which would continue to be the default. These new events are for supporting the "try cancel" case. When a newer SDK is talking to an older server, it will never receive these new events and will effectively fall back to "wait cancel completed". I think this behavior is okay, do you see an issue with it? I believe this is the same as how child workflows are handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see...
I think this behavior is okay, do you see an issue with it?
No issue if we make the SDK-side default of this new enum to be "wait cancel completed" which we've done for child workflows in Core-based SDKs, but never have done in Go and Java based on my quick reading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I misspoke. These events are for "wait cancel requested" and "try cancel" won't be waiting on anything from the server. So it is okay to make either the default but I still lean toward "wait cancel completed" since this is the current behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Whatever is the default today I think we need to make the default when we make it configurable in SDKs for compat reasons.
rodrigozhou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit comments.
| // The `WORKFLOW_TASK_COMPLETED` event that the corresponding RequestCancelNexusOperation command was reported | ||
| // with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // The `WORKFLOW_TASK_COMPLETED` event that the corresponding RequestCancelNexusOperation command was reported | |
| // with. | |
| // The `WORKFLOW_TASK_COMPLETED` event that the corresponding RequestCancelNexusOperation command was reported. |
| // The `WORKFLOW_TASK_COMPLETED` event that the corresponding RequestCancelNexusOperation command was reported | ||
| // with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // The `WORKFLOW_TASK_COMPLETED` event that the corresponding RequestCancelNexusOperation command was reported | |
| // with. | |
| // The `WORKFLOW_TASK_COMPLETED` event that the corresponding RequestCancelNexusOperation command was reported. |
## What changed? <!-- Describe what has changed in this PR --> Added support for the new `NexusOperationCancelRequestCompleted` and `NexusOperationCancelRequestFailed` events which were added in temporalio/api#564 These events are marked with `WorkerMayIgnore=true` for compatibility with older SDKs. ## Why? <!-- Tell your future self why have you made these changes --> These events are used to support the new "wait cancellation requested" Nexus operation cancellation type. ## How did you test it? <!-- How have you verified this change? Tested locally? Added a unit test? Checked in staging env? --> Updated functional tests.
What changed?
Added two new history events:
NexusOperationCancelRequestCompletedandNexusOperationCancelRequestFailedWhy?
To support Nexus operation cancellations that should wait only until the cancellation has been delivered to the handler.
Breaking changes
No
Server PR