-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nodeadm: block until daemon status changes are reflected #1965
Conversation
/ci |
@ndbaker1 the workflow that you requested has completed. 🎉
|
/ci |
@ndbaker1 the workflow that you requested has completed. 🎉
|
@@ -55,7 +55,7 @@ func (m *systemdDaemonManager) GetDaemonStatus(name string) (DaemonStatus, error | |||
if err != nil { | |||
return DaemonStatusUnknown, err | |||
} | |||
switch status.Value.String() { | |||
switch status.Value.Value().(string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not understanding this change. If our existing code was mishandling this, wouldn't we always get DaemonStatusUnknown
back?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we didn't have any callers of GetDaemonStatus
before this, but when i tried the original code i did just get unknown
for everything. so it just never worked 🫠
per comments in the overview, the previous impl. just prints the value's pointer address as a string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha gotcha, didn't see that this func was unused previously 👍
nodeadm/internal/daemon/systemd.go
Outdated
@@ -102,3 +110,19 @@ func (m *systemdDaemonManager) Close() { | |||
func getServiceUnitName(name string) string { | |||
return fmt.Sprintf("%s.service", name) | |||
} | |||
|
|||
func (m *systemdDaemonManager) waitForStatus(ctx context.Context, unitName string, targetStatus DaemonStatus) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much cleaner, thx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @ndbaker1 can you make sure a fresh CI run goes green before you merge?
/ci |
@ndbaker1 the workflow that you requested has completed. 🎉
|
should be the last one 🤞 /ci |
fyi @ndbaker1 you should be able to just use slash |
@ndbaker1 the workflow that you requested has completed. 🎉
|
Issue #, if available:
more general attempt than #1942 at solving the related issue
Description of changes:
revamp the retrier helper a bit and block on daemon
EnsureRunning
impls to guarantee the service is up before continuing with the rest of the bootstrap. This makes more sense for anEnsureRunning
api, which only returns once the process is currently running, rather than just initiated.GetDaemonStatus
was implemented for systemd. The current implementation was using the string value of a pointer rather than the value of the systemd property.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Testing Done
built a dev ami and verified completion of the bootstrap. also logged the
GetDaemonStatus
calls on a dev ami, and recorded 10 second transition fromactiviting
toactive
for bothcontainerd
andkubelet
.This increases the total time-to-node-join metric, but would reduce potential retries in other parts of the bootstrap where connectivity to containerd or etc is required.
nodeadm test logs
See this guide for recommended testing for PRs. Some tests may not apply. Completing tests and providing additional validation steps are not required, but it is recommended and may reduce review time and time to merge.