systemd: wait for udev to settle #762
This sounds a bit concerning to me.
It does. Not sure if there's a better way to solve this. This certainly will not help on OCP where we'd have to do a similar thing, but for kubelet, because kubelet starts
It depends whether there are other boot critical services waiting on TuneD. For kubelet the following may work (-t 60 : give up after 60 seconds):
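For illustration only, one way such a wait could be wired into kubelet is a systemd drop-in that runs `udevadm settle` before the service starts. This is a sketch and not necessarily the command proposed in the comment above: the `write_dropin` helper, the file name `10-udev-settle.conf`, and the `/usr/bin/udevadm` path are assumptions. `-t 60` is the timeout mentioned above, and the leading `-` asks systemd to ignore a non-zero exit so that a timeout does not block kubelet.

```shell
#!/bin/sh
# Sketch: write a kubelet drop-in that waits for udev before start.
# The helper name, file name and udevadm path are assumptions.
write_dropin() {
    dir="$1"    # e.g. /etc/systemd/system/kubelet.service.d
    mkdir -p "$dir" || return 1
    cat > "$dir/10-udev-settle.conf" <<'EOF'
[Service]
# Wait for the udev event queue to empty, give up after 60 seconds.
# The leading "-" makes systemd ignore a non-zero exit (timeout).
ExecStartPre=-/usr/bin/udevadm settle -t 60
EOF
}

# Usage (as root), followed by a systemd reload:
#   write_dropin /etc/systemd/system/kubelet.service.d
#   systemctl daemon-reload
```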
True. Booting fast and then running TuneD when (perhaps latency-critical) apps are already running might not help either. For latency-critical apps, quite the opposite.
That might be one of the options. Thinking OpenShift now, perhaps only do this in our @MarSik, thoughts?
5e3014f to 6a29c9d
This should help with races caused by udev renaming network devices. Signed-off-by: Jaroslav Škarvada <jskarvad@redhat.com>
6a29c9d to 56bab42
@jmencak The one-shot service is a prerequisite for kubelet anyway, so it makes little difference. But of course the early TuneD execution should already see the proper names. I am a bit worried about what will happen on systems with remote storage though (= a lot of disks).
I'd say the key is finding the "sweet spot" for how long to wait before giving up and timing out in favour of proceeding. I.e. not blocking the kubelet in OCP and
It isn't rocket science behind the The cleanest approach is to ignore udev events until TuneD is fully initialized. That way we could miss network adapter rename events (TuneD gets a lot of them during startup, I think because the TuneD process is started at the wrong time, while another process is renaming network adapters) and add events, so some newly added devices wouldn't be tuned. Even a redesign wouldn't help much: even with a single worker tuning thread, by the time it starts processing an add event, the device in question could already have been added, renamed, or removed several times. This could happen even while the device is being tuned, because applying multiple tunings to a device isn't an atomic operation. This complicates the process a lot, because backend tools (like e.g. Nevertheless, we are adding patches that improve the situation, but being able to postpone the TuneD start until most of the existing network adapters are renamed would help a lot with possible future problems.
Looking at the man page of
Agreed.
On Fedora the default timeout is 120 s. That is the maximum number of seconds to wait if the queue still isn't empty; IMHO on a normal system the queue is emptied within several seconds at most. So if the queue is emptied in, say, about 2 seconds, that means a 2-second boot delay, and after the 2 seconds the If the queue isn't emptied in 120 s (the default setting on Fedora if the IMHO @zacikpa did some measurements of the boot delay on Fedora with and without the
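The timeout semantics described here can be sketched as a small shell wrapper: `udevadm settle` returns as soon as the event queue is empty and gives up once the timeout expires, so the added boot delay is whichever comes first. The function name `wait_for_udev` and the `SETTLE_CMD` override (there only so the sketch can be exercised on a machine without udev) are assumptions and not part of TuneD or this PR.

```shell
#!/bin/sh
# Sketch: wait for the udev queue, then report outcome and elapsed time.
# wait_for_udev and SETTLE_CMD are hypothetical names.
wait_for_udev() {
    timeout="${1:-120}"                  # seconds; 120 is the udevadm default
    cmd="${SETTLE_CMD:-udevadm settle}"  # left unquoted below so it splits into words
    start=$(date +%s)
    # Exit status 0: queue emptied in time; non-zero: timed out or command failed.
    if $cmd --timeout="$timeout"; then
        result=settled
    else
        result=gave-up
    fi
    end=$(date +%s)
    echo "$result after $((end - start))s (limit ${timeout}s)"
}

# Usage: wait_for_udev 60
```

If the estimate above holds, a typical system would report `settled` within a few seconds, and only a queue that never empties would cost the full limit.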
@yarda TBH, I only tried to measure it on machines with very few devices (say, a laptop) and there was never any delay higher than normal boot time variance.
The real test will be deployments with various network-attached storage devices. I suspect we'll hit cases where deployments mostly benefit from this, but I'm sure there will be outliers where it is preferable to have partial tuning in place with a few misses. I guess we don't have a better solution right now and time will tell. The good thing is that there at least seems to be a reasonable timeout.