-
Notifications
You must be signed in to change notification settings - Fork 2.2k
tests: bfq: skip tests on misbehaving udev systems #4825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
46218c8 to
357318f
Compare
|
This patch seems to work on x86_64 for Tumbleweed but on aarch64 I'm still seeing this: https://openqa.opensuse.org/tests/5209568/file/runc-runc-root.tap On SLES 16.0 I still see it on both arches. https://openqa.suse.de/tests/18613516/file/runc-runc-root.tap |
|
@ricardobranco777 Did you apply both patches? Unfortunately, there could be a race with udev where it sets the scheduler back after we checked it. Not sure if there is a better solution than modifying the host config, to be honest... On my Tumbleweed machine, I haven't managed to hit that race yet though... |
Yes.
Ok. I'll look into that instead. Thanks! |
|
Does it fail consistently even with the patches applied? The patch should just cause the problematic test to get skipped if udev is silently changing the scheduler... If you have actual access to the OpenQA box, there is a bpftrace script from the issue that will tell us who is changing the scheduler and when. I can check the qcows myself later if I have some time. |
No.
I'm applying the patch to SLES 15-SP4+ & Tumbleweed here: |
|
@ricardobranco777 Can you try it again with the |
If an error occurs during a test which sets up loopback devices, the loopback device is not freed. Since most systems have very conservative limits on the number of loopback devices, re-running a failing test locally to debug it often ends up erroring out due to loopback device exhaustion. So let's just move the "losetup -d" to teardown, where it belongs. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
openSUSE has an unfortunate default udev setup which forcefully sets all loop devices to use the "none" scheduler, even if you manually set it. As this is a property of the host configuration (and udev is monitoring from the host) we cannot really change this behaviour from inside our test container. So we should just skip the test in this (hopefully unusual) case. Ideally tools running the test suite should disable this behaviour when running our test suite. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Sure. I just cloned openQA jobs and the links are available in this PR description. |
It works. Now I can't unignore |
|
/ping @kolyshkin @ricardobranco777 says the |
|
Successfully tested on:
|
kolyshkin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (nit: second commit description doesn't mention sleep)
rata
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for tackling this @kolyshkin @cyphar !
This LGTM, but left a comment that I think would be slightly better. If you don't agree, feel free to ignore it and merge :)
| # usually triggered by the "change" event from losetup, we can wait for a | ||
| # little bit before continuing the test. For more details, see | ||
| # <https://github.com/opencontainers/runc/issues/4781>. | ||
| sleep 2s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with this. I wonder if doing the sleep only if the distro is suse is better, though.
That way we don't affect any other platform (IIRC github actions is not running suse at all) and we do notice if any other platform has this behavior, and we can decide to skip it in that platform too.
|
Oh, auto-merge was enabled :-D |
|
Created #4838 |
openSUSE has an unfortunate default udev setup which forcefully sets all
loop devices to use the "none" scheduler, even if you manually set it.
As this is a property of the host configuration (and udev is monitoring
from the host) we cannot really change this behaviour from inside our
test container.
So we should just skip the test in this (hopefully unusual) case.
Ideally tools running the test suite should disable this behaviour when
running our test suite.
Fixes #4781
Signed-off-by: Aleksa Sarai cyphar@cyphar.com