Add journalctl logs to failed systemctl commands #17659
/ok-to-test
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: medyagh, spowelljr
kvm2 driver with docker runtime
Times for minikube start: 50.9s 49.7s 50.9s 48.3s 54.2s
Times for minikube ingress: 26.7s 28.2s 28.2s 25.1s 27.2s

docker driver with docker runtime
Times for minikube ingress: 20.8s 20.8s 18.9s 22.8s 20.8s
Times for minikube start: 24.2s 22.3s 21.9s 24.6s 22.0s

docker driver with containerd runtime
Times for minikube ingress: 19.4s 19.3s 31.3s 31.3s 31.4s
Times for minikube start: 24.6s 23.6s 23.5s 21.4s 23.4s
@spowelljr I think this is a good call, at least until we get to the bottom of these flakes.
Problem
Users (and our testing infra) occasionally run into flaky systemctl-related failures when docker and cri-docker are being restarted. The error from systemctl doesn't include any useful information; it only says to run
journalctl -xeu <service>
to get the error logs. In our testing infra this is impossible, as the clusters are already deleted by the time the test logs are complete. Because of the flakiness it's hard to reproduce a failure, and users aren't able to generate or use the logs. We don't know the cause of these errors, so we can't debug further.
Example:
Solution
I've wrapped the major systemctl commands with a function that checks whether the command succeeded. If it did, nothing changes; if the systemctl command failed, we call
sudo journalctl --no-pager -u <service>
to get the logs and append them to the error, so no extra work is needed to get the systemctl error logs.