Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run "rke2-killall.sh" before shutdown #270

Closed

Conversation

johnliu55tw
Copy link
Contributor

@johnliu55tw johnliu55tw commented Apr 8, 2022

Create a rke2-shutdown.service to run script rke2-killall.sh to stop all the running pods (containers) before system shutdown or reboot. One of the benefits is that system shutdown and reboot would be much faster.

The key component is the Before=umount.target so that the service will run rke2-killall.sh before systems started to unmount everything, including those Pods' volumes.

Test plan

  1. Deploy Harvester (one node would be sufficient). Wait for it to be Ready
  2. Switch to Shell by pressing F12
  3. Initiate the graceful shutdown process, e.g. by running the command poweroff
  4. The system should be shut down fairly quickly compared to the old version of Harvester. One way to tell the difference is that in an older Harvester, the terminal would show the following message while shutting down:
    image
    and the shutdown process would be blocked for around 1 minute. With this PR, those messages will not show at all.
  5. You can further test reboot command or operation such as Press power button to initiate the graceful shutdown/reboot process, and they all shouldn't be affected by the message shown above.

TODOs

  • Upgrade: Validated that when upgrading from an older version of Harvester, the rke2-shutdown.service will be installed and enabled.

Related

@johnliu55tw johnliu55tw marked this pull request as ready for review April 11, 2022 07:03
Copy link
Contributor

@tjjh89017 tjjh89017 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Before=reboot.target halt.target shutdown.target poweroff.target umount.target final.target

[Install]
RequiredBy=reboot.target halt.target shutdown.target poweroff.target umount.target final.target
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test and work well.
Is this unit required to be run after rke2-agent or rke2-server service is stopped?

Copy link
Contributor Author

@johnliu55tw johnliu55tw Apr 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like if it's a user that stopped the service, the user should decide whether rke2-killall.sh should be run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I found some discussions about this issue, and seems like "stopping K3s/RKE2 does not stop running containers" is specifically designed to allow non-disruptive upgrade of K3s/RKE2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During an upgrade, we will always wait for RKE2 upgraded first and shut down later.

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=-/usr/local/bin/rke2-killall.sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also rke2-killall.sh sends KILL (-9) signals to processes, but rke2-server/agent's systemd services's ExecStopPost option sends TERM signals. Just wondering if it's safer to use TERM.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using TERM might be a good idea to give the processes some time to clean up and terminate.
but we need to have another script to do it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, using SIGTERM would be much safer. Do you think maybe we should create a copy of rke2-killall.sh and replace the -9 with -15?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's oneliner, we can also embed it in the systemd unit file. There are many cleanups in the rke2-killall.sh script, maybe some are unnecessary.

Create a service to run script `rke2-killall.sh` to stop all the
running containers.

Signed-off-by: John Liu <john.liu@suse.com>
@tjjh89017 tjjh89017 marked this pull request as draft June 20, 2022 03:15
@bk201
Copy link
Member

bk201 commented Apr 23, 2024

Thanks, John. This was addressed in #480, so close it.

@bk201 bk201 closed this Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants