-
Notifications
You must be signed in to change notification settings - Fork 54
thar-be-settings: improve service restart behavior #802
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Signed-off-by: Kush Upadhyay <kushupad@amazon.com>
f0e2d8a to
2283cba
Compare
|
^ Force push fixes a minor fmt warning from builds |
| fn restart_non_systemd(&self) -> Result<()> { | ||
| for restart_command in &self.model.restart_commands { | ||
| // Skip systemd commands - handled by Services::restart_systemd() | ||
| if restart_command.contains("systemctl") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want contains or starts_with?
| /// Execute non-systemd restart commands for this service. | ||
| fn restart_non_systemd(&self) -> Result<()> { | ||
| for restart_command in &self.model.restart_commands { | ||
| // Skip systemd commands - handled by Services::restart_systemd() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's kind of a shame we have to sort this twice
| .0 | ||
| .values() | ||
| .flat_map(|s| s.model.restart_commands.iter()) | ||
| .filter(|cmd| cmd.contains("systemctl")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as below, why not starts_with if we're assuming the first word is the command?
| for (name, service) in &services.0 { | ||
| if let Err(e) = service.restart_non_systemd() { | ||
| error!("Failed to restart {}: {}", name, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems cleaner to partition Services into two sets, one for systemd and one for non-systemd. These could be represented by different types and each could have a restart method.
| debug!("Restarting systemd units: {:?}", &units); | ||
| let mut cmd = Command::new("/bin/systemctl"); | ||
| cmd.arg("try-reload-or-restart"); | ||
| cmd.args(&units); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't automatically treat all systemctl ... commands as wanting systemctl try-reload-or-restart ... instead.
We should assume the restart command must be issued exactly as specified. The only optimization permitted is to group the restart commands by subcommand - all systemctl restart ... together, all systemctl try-restart ... together, and so on.
It might also be prudent to lean into the non-determinism for restarts - for example, preparing the entire batch of restart commands, then randomizing the order. There may be some brittleness from potential ordering surprises that needs to be shaken out.
Issue number:
Related to: bottlerocket-os/bottlerocket#4556
Description of changes:
Improves
thar-be-settingsservice restart behavior with:Batch systemd restarts: All systemd units are now restarted with a single
systemctl try-reload-or-restartcall, making the restart atomic from systemd's perspective.No early bail: All service restarts are attempted regardless of individual failures so the system is not left in a partially-configured state. Failures are logged as they occur, and a summary error is returned at the end.
There are various ways to implement this, but I focused on keeping
restart_servicesas simple as possible. This meant making symmetrical "flows" for systemd and non-systemd services so the restart process is simply:Since systemd services are batched,
restart_systemdmethod needs to be part of theServices impl. I removed theServiceRestarttrait since it was not used for abstraction (only one implementor) and moved the individual service restart logic torestart_non_systemdunderService impl.Testing done:
Launched an
aws-devami withthar-be-settingsdebug logs enabled and a forced failure incorndogto demonstrate the no-bail behavior. Instance failed boot as expected since settings failed to apply, but journal logs show:Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.