-
Notifications
You must be signed in to change notification settings - Fork 19
Update graceful worker shutdown #627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
features/activity/shutdown/README.md
Outdated
a cancel. It must be possible for the user to determine if this cancel was issued | ||
by server, or is the result of worker shutdown | ||
* When a worker shutdown is initiated, the activity context isn't canceled until shutdown | ||
has succeeded or the shutdown timeout has elapsed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should call this "graceful shutdown timeout" so it's not confused with some actual timeout for shutting down?
features/activity/shutdown/README.md
Outdated
* When a worker shutdown is initiated, the activity context isn't canceled until shutdown | ||
has succeeded or the shutdown timeout has elapsed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to clarify that this shutdown timeout defaults as 0/no-timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, good idea to clarify this
features/activity/shutdown/README.md
Outdated
* It must be possible for the user to determine whether the activity context cancel was the result of worker shutdown, | ||
or issued by server | ||
* TODO: Java - need to add a way for Activities to know when the worker is being shutdown. | ||
* TODO: Typescript - figure out what happens in TS | ||
* Activities may handle the cancel however they like (including continuing running) | ||
* Heartbeating is possible while the shutdown process is ongoing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heartbeating may notify the activity the worker is shutting down, but there should be an option to continue heartbeating will the graceful shutdown process is ongoing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for Core-based SDKs, heartbeating does not notify about the grace period, that is delivered in a separate manner, and heartbeating continues to work regardless of activity status (during or after grace period) IIUC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that is why I put the "may" in there
features/activity/shutdown/README.md
Outdated
* It must be possible for the user to determine whether the activity context cancel was the result of worker shutdown, | ||
or issued by server | ||
* TODO: Java - need to add a way for Activities to know when the worker is being shutdown. | ||
* TODO: Typescript - figure out what happens in TS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TS: The activity will receive a cancellation with message 'WORKER_SHUTDOWN'
. See this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you!
features/activity/shutdown/README.md
Outdated
* When a worker shutdown is initiated, the activity context isn't canceled until the | ||
graceful shutdown timeout has elapsed | ||
* Graceful shutdown language behavior | ||
* Core based SDKs - when graceful shutdown timeout isn't specified, that is treated as no-timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right that Core defaults to no-timeout.
However, TS actually defaults the graceful shutdown timeout to 0. From a quick look, it seems like Python, .Net and Ruby are all doing the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No timeout and 0 are effectively the same I believe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not quite, if you remember from a few weeks (or months ago), Spencer mentioned when graceful_shutdown_period
is not set, there is no timeout, so cancels are never issued to activities.
But sounds like from James looking at the core SDKs, each SDK will instead send 0 when the language specific timeout isn't set, meaning no SDK actually has default == no-timeout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, we only made that one optional for backward compatibility with TS back then, because the TS Worker was emitting activity cancellation by itself and we didn't want to shake the TS worker's logic at that moment. So we instead made it possible for the TS SDK to disable Core's activity cancellation on shutdown.
TS is now relying on Core for that behavior, just like other SDKs, so I think it would be safe to make that option mandatory in Core's worker config. Definitely not high priority and very low impact, of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 So from a user POV, "no timeout and 0 are effectively the same"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if "no timeout" here means not specifying "graceful shutdown", then yes
features/activity/shutdown/README.md
Outdated
* It must be possible for an activity to determine that a worker is being shut down even before graceful timeout elapses | ||
* It must be possible for the user to determine whether the activity context cancel was the result of worker shutdown, | ||
or issued by server | ||
* TODO: Java - need to add a way for Activities to know when the worker is being shutdown. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Quinn-With-Two-Ns can you remind me, is this still a TODO? I can't remember when/where you last mentioned this.
If it's still true, I can create an GH issue and link the issue to this TODO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think temporalio/sdk-java#1005 (comment) covers this too
What was changed
Updated existing graceful worker shutdown spec
Why?
Existing spec was outdated
Checklist
Closes [Feature Request] Write spec for graceful worker shutdown #261
How was this tested:
Tests will be added in a separate Issue/PR, see parent issue for more details
Any docs updates needed?