Skip to content

Conversation

@liulinC
Copy link
Collaborator

@liulinC liulinC commented Sep 27, 2024

to connect to nbd devices, nbd_client_manager will

  1. protect the operation with /var/run/nonpersistent/nbd_client_manager file lock
  2. check whether nbd is being used by nbd-client -check
  3. load nbd kernel module by modprobe nbd
  4. call nbd-client to connect to nbd device

However, step 3 will trigger systemd-udevd run asyncly, which would open and lock the same nbd devices, run udev rules, etc. This introduce races with step 4, e.g. both process want to open and lock the nbd device.

Note: the file lock in step 1 does NOT resovle the issue here, as it only coordinate multiple nbd_client_manager processes.

To fix the issue,

  • we patch nbd-client to report the device busy from kernel to nbd_client_manager
  • nbd_client_manager should check nbd-client exit code, and retry on device busy

@MarkSymsCtx
Copy link
Contributor

However, step 3 will trigger systemd-udevd run asyncly, which would open and lock the same nbd devices, run udev rules, etc. This introduce races with step 4, e.g. both process want to open and lock the nbd device.

This would be better fixed by adding a rule to udev to ensure that system-udevd leaves the devices alone and thus doesn't result in them being found busy.

@liulinC
Copy link
Collaborator Author

liulinC commented Sep 27, 2024

However, step 3 will trigger systemd-udevd run asyncly, which would open and lock the same nbd devices, run udev rules, etc. This introduce races with step 4, e.g. both process want to open and lock the nbd device.

This would be better fixed by adding a rule to udev to ensure that system-udevd leaves the devices alone and thus doesn't result in them being found busy.

XS8 stream udev handle the nbd device as well. (and has this race issue as well).
I have already raise some PR to clean some unnecessary udev rules.
I doubt should we totally ignore nbd devices in udev rules. (as XS8 steam do take care of it)

BTW, udev rules is only run and executed with the device been opened and lock been hold by systemd-udevd(workers),
https://github.com/systemd/systemd/blob/main/src/udev/udev-worker.c#L184-L203
This means the udev rules can only narrow down the race time window, but can not resolve the issue.

So I prefer to resolve the races issues other than just totally ignore nbd devices in the udev rules.

to connect to nbd devices, nbd_client_manager will
1. protect the operation with
/var/run/nonpersistent/nbd_client_manager file lock
2. check whether nbd is being used by `nbd-client -check`
3. load nbd kernel module by `modprobe nbd`
4. call `nbd-client` to connect to nbd device

However, step 3 will trigger systemd-udevd run asyncly, which would
open and lock the same nbd devices, run udev rules, etc. This introduce
races with step 4, e.g. both process want to open and lock the nbd
device.

Note: the file lock in step 1 does NOT resovle the issue here,
as it only coordinate multiple nbd_client_manager processes.

To fix the issue,
- we patch nbd-client to report the device busy from kernel to nbd_client_manager
- nbd_client_manager should check nbd-client exit code, and retry on device busy
- nbd_client_manager call `udevadm settle` to wait for udevd parsing udev rules

Note: checking nbd-client exit code is still necessary in case of racing with others

Signed-off-by: Lin Liu <Lin.Liu01@cloud.com>
@liulinC
Copy link
Collaborator Author

liulinC commented Oct 9, 2024

job: 4126316~4126325 looks good.

@liulinC liulinC requested a review from gangj October 10, 2024 01:40
@lindig
Copy link
Contributor

lindig commented Oct 14, 2024

Can we get reviews from @MarkSymsCtx and @robhoes ?

@liulinC liulinC added this pull request to the merge queue Oct 14, 2024
Merged via the queue into xapi-project:master with commit 34a8796 Oct 14, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants