virtio-net: fix a rtnl_lock() deadlock during probing
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while
the virtio-net driver is still probing with rtnl_lock() held. The probe
path then calls netdev_notify_peers(), which takes rtnl_lock() again and
deadlocks on the recursive mutex acquisition.
Fix it by temporarily saving the announce status while probing. Then, in
virtnet_open(), if a deferred announce is pending, schedule
virtnet_config_changed_work() to handle it.
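The deferral described above can be sketched as a small user-space
model. This is only an illustration under stated assumptions, not the
patch itself: every name here (struct model_dev, the model_* functions,
the boolean standing in for rtnl_lock()) is hypothetical and does not
exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_dev {
	bool rtnl_held;        /* stands in for rtnl_lock() being held */
	bool probing;          /* true while the probe path is running */
	bool announce_pending; /* announce seen during probe, not yet handled */
	bool work_scheduled;   /* config-changed work queued for later */
	int  peers_notified;   /* number of successful peer notifications */
};

/*
 * Models a notifier that takes the lock itself. Returns false where a
 * real non-recursive mutex would block forever.
 */
static bool model_notify_peers(struct model_dev *d)
{
	if (d->rtnl_held)
		return false;
	d->rtnl_held = true;
	d->peers_notified++;
	d->rtnl_held = false;
	return true;
}

/* Models the config-changed handler: defer the announce while probing. */
static bool model_config_changed(struct model_dev *d, bool announce)
{
	if (!announce)
		return true;
	if (d->probing) {
		d->announce_pending = true; /* save the status, do not notify */
		return true;
	}
	return model_notify_peers(d);
}

/* Models probing with the lock held while the device announces. */
static bool model_probe(struct model_dev *d, bool announce_during_probe)
{
	bool ok;

	assert(!d->rtnl_held);
	d->rtnl_held = true;
	d->probing = true;
	ok = model_config_changed(d, announce_during_probe);
	d->probing = false;
	d->rtnl_held = false;
	return ok;
}

/* Models open: only schedules the work, it does not notify directly. */
static void model_open(struct model_dev *d)
{
	if (d->announce_pending)
		d->work_scheduled = true;
}

/* Models the scheduled work running later, outside the lock. */
static bool model_run_work(struct model_dev *d)
{
	if (!d->work_scheduled)
		return true;
	d->work_scheduled = false;
	d->announce_pending = false;
	return model_config_changed(d, true);
}
```

In this model, an announce that arrives during model_probe() only sets
announce_pending. The peers are notified once model_open() has scheduled
the work and model_run_work() runs without the lock held.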
Another possible solution is to check rtnl_is_locked() directly and call
__netdev_notify_peers() instead, but that would mean relying on
netdev_queue to schedule the ARP packets after ndo_open(), which we think
is not very intuitive.
We've observed a softlockup with Ubuntu 24.04; it can be reproduced by
having QEMU send announce_self rapidly while the guest is booting.
[ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds.
[ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu
[ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000
[ 494.168260] Call Trace:
[ 494.168329] <TASK>
[ 494.168389] __schedule+0x27c/0x6b0
[ 494.168495] schedule+0x33/0x110
[ 494.168585] schedule_preempt_disabled+0x15/0x30
[ 494.168709] __mutex_lock.constprop.0+0x42f/0x740
[ 494.168835] __mutex_lock_slowpath+0x13/0x20
[ 494.168949] mutex_lock+0x3c/0x50
[ 494.169039] rtnl_lock+0x15/0x20
[ 494.169128] netdev_notify_peers+0x12/0x30
[ 494.169240] virtnet_config_changed_work+0x152/0x1a0
[ 494.169377] virtnet_probe+0xa48/0xe00
[ 494.169484] ? vp_get+0x4d/0x100
[ 494.169574] virtio_dev_probe+0x1e9/0x310
[ 494.169682] really_probe+0x1c7/0x410
[ 494.169783] __driver_probe_device+0x8c/0x180
[ 494.169901] driver_probe_device+0x24/0xd0
[ 494.170011] __driver_attach+0x10b/0x210
[ 494.170117] ? __pfx___driver_attach+0x10/0x10
[ 494.170237] bus_for_each_dev+0x8d/0xf0
[ 494.170341] driver_attach+0x1e/0x30
[ 494.170440] bus_add_driver+0x14e/0x290
[ 494.170548] driver_register+0x5e/0x130
[ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10
[ 494.170788] register_virtio_driver+0x20/0x40
[ 494.170905] virtio_net_driver_init+0x97/0xb0
[ 494.171022] do_one_initcall+0x5e/0x340
[ 494.171128] do_initcalls+0x107/0x230
[ 494.171228] ? __pfx_kernel_init+0x10/0x10
[ 494.171340] kernel_init_freeable+0x134/0x210
[ 494.171462] kernel_init+0x1b/0x200
[ 494.171560] ret_from_fork+0x47/0x70
[ 494.171659] ? __pfx_kernel_init+0x10/0x10
[ 494.171769] ret_from_fork_asm+0x1b/0x30
[ 494.171875] </TASK>
Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Signed-off-by: NipaLocal <nipa@local>