-
Notifications
You must be signed in to change notification settings - Fork 80
Worker heartbeat: Remove heartbeat from non-nexus polling #633
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Worker heartbeat: Remove heartbeat from non-nexus polling #633
Conversation
Sushisource
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't even need the "reserved" fields since this was never actually done in server.
| reserved "worker_heartbeat"; | ||
| reserved 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... this one's more interesting and we may need to keep. Since we do (will?) have this worker status field, we're going to stop polling on the nexus queue while shutting down... so we need to be able to send the final (or, I guess, semi-final "shutting down") status on this rpc I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we need to be able to send the final status on this rpc
Is having the worker's state on shutdown that important? I'm not against keeping this, especially since this doesn't really cost anything, but I wanna better understand the benefit. Having heartbeat info included on shutdown seems like it's slightly better than just having a worker's info disappear on the next nexus poll, but only if server uses that shutdown heartbeat info?
we're going to stop polling on the nexus queue while shutting down
Yes, but just to be clear, only when we shutdown the last worker on the namespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it's important in that we can differentiate "went missing" vs "shut down properly". The former we can alert on and the latter is safe.
However, to your point, this really only matters for the "last" worker to turn off, as the others can send it with the overall poll. That said, it's still important, since I think in 90%+ of cases, there's only going to be one worker total.
|
Without the Is there a way to get around that? |
Yeah, you need to temporarily disable the breaking change check, then reinstate it in a subsequent PR. Ex: https://github.com/temporalio/api/pull/608/files#diff-1a5ba9cba93e971f532139f694d7da802776bfe578e3f753b9c3f25968dbf42dL16 |
|
Got it, thanks for the link |
Sushisource
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
You'll probably want to save a note somewhere that we'll want to test this shutdown case.
_**READ BEFORE MERGING:** All PRs require approval by both Server AND SDK teams before merging! This is why the number of required approvals is "2" and not "1"--two reviewers from the same team is NOT sufficient. If your PR is not approved by someone in BOTH teams, it may be summarily reverted._ <!-- Describe what has changed in this PR --> **What changed?** Remove the exception added in #633, add to README explaining this breaking change process <!-- Tell your future self why have you made these changes --> **Why?** keep things working <!-- Are there any breaking changes on binary or code level? --> **Breaking changes** <!-- If this breaks the Server, please provide the Server PR to merge right after this PR was merged. --> **Server PR**
…mporalio#633)" This reverts commit 37b0602.
READ BEFORE MERGING: All PRs require approval by both Server AND SDK teams before merging! This is why the number of required approvals is "2" and not "1"--two reviewers from the same team is NOT sufficient. If your PR is not approved by someone in BOTH teams, it may be summarily reverted.
What changed?
Removed the
worker_heartbeatfield from non-nexus callsWhy?
We only want
worker_heartbeatinfo sent on nexus polls, worker shutdown, and dedicated worker heartbeat requests.Breaking changes
Breaking change for a WIP feature, should have no effect. Also tried grepping for this field in server and didn't find it used anywhere.
Server PR