Fixed umount error during shutdown #480
This works for me: I have not seen the EXT4-fs error so far.
Broadcast message from root@harv-new on pts/0 (Thu 2023-05-18 03:58:04 UTC):
The system is going down for poweroff at Thu 2023-05-18 03:59:04 UTC!
May 18 03:58:04 harv-new sudo[29882]: root : TTY=pts/0 ; PWD=/root ; USER=root ; COMMAND=/usr/sbin/shutdown
May 18 03:58:04 harv-new sudo[29882]: pam_unix(sudo:session): session opened for user root by rancher(uid=0)
May 18 03:58:04 harv-new systemd-logind[1628]: Creating /run/nologin, blocking further logins...
May 18 03:58:04 harv-new sudo[29882]: pam_unix(sudo:session): session closed for user root
May 18 03:58:24 harv-new rancher-system-agent[26512]: time="2023-05-18T03:58:24Z" level=info msg="[Applyinator] No image provided, creating empty working directory /var/lib/rancher/agent/work/20230518-035824/7524ce40115e7a1a9eab054d6bdfbb5ec7b2d37e242a42a6358037f110a6a3a7_0"
Broadcast message from root@harv-new on pts/0 (Thu 2023-05-18 03:59:04 UTC):
The system is going down for poweroff NOW!
Session terminated, killing shell... ...killed.
Terminated
package/harvester-os/files/etc/systemd/system/rke2-shutdown.service
6627884 to 2d56fc7
LGTM, Thanks
LGTM, thanks!
Problem:
harvester/harvester#1876
Solution:
Create an rke2-shutdown.service that runs rke2-kill-containers.sh to stop all running pods (containers) before the system shuts down or reboots.
The script stops the container units in parallel through systemd (--kill-who=all). Systemd first sends SIGTERM to the processes; if they have not exited within the timeout, it sends SIGKILL.
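The escalation described above (SIGTERM first, SIGKILL after a timeout, applied to many targets in parallel) can be sketched in a few lines. This is only an illustration under stated assumptions, not the contents of rke2-kill-containers.sh: plain child processes stand in for container units, the names `spawn`, `stop_process`, and `stop_all` are hypothetical, and the 0.5-second timeout is arbitrary. It assumes a POSIX system.

```python
import signal
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def spawn(ignore_sigterm):
    """Start a stand-in child process; optionally make it ignore SIGTERM."""
    code = (
        "import signal, time\n"
        + ("signal.signal(signal.SIGTERM, signal.SIG_IGN)\n" if ignore_sigterm else "")
        + "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    proc = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    proc.stdout.readline()  # block until the child has finished its setup
    return proc


def stop_process(proc, timeout):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL."""
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=timeout)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught or ignored
        proc.wait()
        return "killed"


def stop_all(procs, timeout=0.5):
    """Stop all processes in parallel so one slow process does not delay the rest."""
    if not procs:
        return []
    with ThreadPoolExecutor(max_workers=len(procs)) as pool:
        return list(pool.map(lambda p: stop_process(p, timeout), procs))


if __name__ == "__main__":
    # One well-behaved child and one that ignores SIGTERM.
    children = [spawn(ignore_sigterm=False), spawn(ignore_sigterm=True)]
    print(stop_all(children))
```

Running the stops in parallel means the total wait is bounded by roughly one timeout rather than one timeout per stubborn process, which matters when shutdown has a fixed time budget.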
In testing, ext4 errors during container unmounting were significantly reduced, but some ext4 errors still occur that are not fundamentally related to the container processes. The following ext4 log was recorded during a reboot:
Related Issue:
harvester/harvester#1876
rancher/rke2#2411 (comment)
Test plan: