selftests/bpf: merge most of test_btf into test_progs #59
Master branch: bf74a37 patch https://patchwork.ozlabs.org/project/netdev/patch/20200915014341.2949692-1-andriin@fb.com/ applied successfully
…xercised
regularly. Pretty-printing tests were left alone and renamed into
test_btf_pprint because they are very slow and were not even executed by
default with test_btf.
All the test_btf tests that were moved are modeled as proper sub-tests in
test_progs framework for ease of debugging and reporting.
No functional or behavioral changes were intended; I tried to preserve the
original behavior as closely as possible. `test_progs -v` will
activate the "always_log" flag to emit the BTF validation log.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
v1->v2:
- pretty-print BTF tests were renamed test_btf -> test_btf_pprint, which
allowed Git to detect that the majority of the test_btf code was moved into
prog_tests/btf.c, so the diff is much smaller;
tools/testing/selftests/bpf/.gitignore | 2 +-
.../bpf/{test_btf.c => prog_tests/btf.c} | 1069 +----------------
tools/testing/selftests/bpf/test_btf_pprint.c | 969 +++++++++++++++
3 files changed, 1033 insertions(+), 1007 deletions(-)
rename tools/testing/selftests/bpf/{test_btf.c => prog_tests/btf.c} (85%)
create mode 100644 tools/testing/selftests/bpf/test_btf_pprint.c
Master branch: d317b0a patch https://patchwork.ozlabs.org/project/netdev/patch/20200915014341.2949692-1-andriin@fb.com/ applied successfully
49583d8 to
6be61d6
Compare
In case of memory pressure the MPTCP xmit path keeps at most a single skb
in the tx cache, eventually freeing additional ones. The associated counter
for forward memory is not updated accordingly, and that causes the following
splat:

WARNING: CPU: 0 PID: 12 at net/core/stream.c:208 sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208
Modules linked in:
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.11.0-rc2 #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208
Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e 63 01 00 00 8b ab 00 01 00 00 e9 60 ff ff ff e8 2f 24 d3 fe 0f 0b eb 97 e8 26 24 d3 fe <0f> 0b eb a0 e8 1d 24 d3 fe 0f 0b e9 a5 fe ff ff 4c 89 e7 e8 0e d0
RSP: 0018:ffffc900000c7bc8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88810030ac40 RSI: ffffffff8262ca4a RDI: 0000000000000003
RBP: 0000000000000d00 R08: 0000000000000000 R09: ffffffff85095aa7
R10: ffffffff8262c9ea R11: 0000000000000001 R12: ffff888108908100
R13: ffffffff85095aa0 R14: ffffc900000c7c48 R15: 1ffff92000018f85
FS: 0000000000000000(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa7444baef8 CR3: 0000000035ee9005 CR4: 0000000000170ef0
Call Trace:
 __mptcp_destroy_sock+0x4a7/0x6c0 net/mptcp/protocol.c:2547
 mptcp_worker+0x7dd/0x1610 net/mptcp/protocol.c:2272
 process_one_work+0x896/0x1170 kernel/workqueue.c:2275
 worker_thread+0x605/0x1350 kernel/workqueue.c:2421
 kthread+0x344/0x410 kernel/kthread.c:292
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:296

At close time, as reported by syzkaller/Christoph.

This change addresses the issue by properly updating the fwd allocated memory
counter in the error path.

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: multipath-tcp/mptcp_net-next#136
Fixes: 724cfd2 ("mptcp: allocate TX skbs in msk context")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
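The accounting problem described in this commit message can be pictured with a toy model. The sketch below is not kernel code: the class, its fields and its methods are hypothetical names chosen for illustration, and it only assumes that every cached skb is charged to a forward-memory counter that has to be back at zero when the socket is torn down.

```python
class ToySocket:
    """Toy model of forward-memory accounting; all names are hypothetical."""

    def __init__(self):
        self.forward_alloc = 0  # bytes currently charged to the socket
        self.tx_cache = []      # sizes of the skbs held in the tx cache

    def cache_skb(self, size):
        # Charging happens whenever an skb enters the cache.
        self.tx_cache.append(size)
        self.forward_alloc += size

    def trim_cache(self, uncharge):
        # Under memory pressure keep at most one skb in the cache.
        # uncharge=False models the bug: skbs are freed but the counter
        # is left untouched.
        while len(self.tx_cache) > 1:
            size = self.tx_cache.pop()
            if uncharge:
                self.forward_alloc -= size

    def close(self):
        # Release whatever is still cached and report the leftover charge.
        # A non-zero result is the condition the splat warns about.
        while self.tx_cache:
            self.forward_alloc -= self.tx_cache.pop()
        return self.forward_alloc


def leftover_after_pressure(uncharge):
    sock = ToySocket()
    for _ in range(3):
        sock.cache_skb(100)
    sock.trim_cache(uncharge)
    return sock.close()
```

In this model `leftover_after_pressure(False)` leaves 200 bytes charged at close time, while `leftover_after_pressure(True)` returns the counter to zero.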
The rtla osnoise tool is an interface for the osnoise tracer. The
osnoise tracer dispatches a kernel thread per-cpu. These threads read
the time in a loop with preemption, softirqs and IRQs enabled,
thus allowing all the sources of osnoise during their execution. The
osnoise threads take note of the entry and exit point of any source
of interference, increasing a per-cpu interference counter. The
osnoise tracer also saves an interference counter for each source
of interference.
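The measuring idea (read the clock in a tight loop and treat unexpectedly large gaps between two reads as noise) can be sketched in user space. This is only an analogy under stated assumptions: the function name, the threshold parameter and the injected `clock` callable are hypothetical, and the real tracer runs in the kernel and additionally attributes each gap to its source.

```python
import time


def measure_noise(clock, iterations, threshold_ns):
    """Return (total_noise_ns, max_single_ns, interference_count).

    `clock` is any callable returning a monotonic timestamp in ns.
    A gap between two consecutive reads larger than `threshold_ns` is
    counted as one interference.
    """
    total = 0
    max_single = 0
    count = 0
    last = clock()
    for _ in range(iterations):
        now = clock()
        gap = now - last
        if gap > threshold_ns:
            total += gap
            max_single = max(max_single, gap)
            count += 1
        last = now
    return total, max_single, count


if __name__ == "__main__":
    # Real clock: the numbers depend entirely on the machine and its load.
    print(measure_noise(time.perf_counter_ns, 100_000, 5_000))
```

Passing the clock in as a parameter keeps the loop testable with a scripted sequence of timestamps.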
The rtla osnoise top mode displays information about the periodic
summary from the osnoise tracer.
One example of rtla osnoise top output is:
[root@alien ~]# rtla osnoise top -c 0-3 -d 1m -q -r 900000 -P F:1
Operating System Noise
duration: 0 00:01:00 | time is in us
CPU  Period     Runtime    Noise  % CPU Aval  Max Noise  Max Single   HW  NMI      IRQ  Softirq  Thread
  0     #58    52200000     1031    99.99802         91          60    0    0    52285        0     101
  1     #59    53100000        5    99.99999          5           5    0    9    53122        0      18
  2     #59    53100000        7    99.99998          7           7    0    8    53115        0      18
  3     #59    53100000     8274    99.98441        277          23    0    9    53778        0     660
"rtla osnoise top --help" works and provides information about the
available options.
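The "% CPU Aval" column in the sample output appears to follow from the Runtime and Noise columns. The commit message does not state the formula, so the sketch below is an inference from the four rows shown: it assumes the value is `(runtime - noise) / runtime` expressed as a percentage and truncated (not rounded) to five decimals. The function name is hypothetical.

```python
def cpu_available_pct(runtime_us, noise_us):
    """Percentage of runtime not lost to noise, truncated to 5 decimals.

    Integer arithmetic is used so that truncation is exact and no
    floating-point rounding can leak into the last digit.
    """
    scaled = (runtime_us - noise_us) * 10**7 // runtime_us
    return f"{scaled // 10**5}.{scaled % 10**5:05d}"


if __name__ == "__main__":
    # (runtime, noise) pairs copied from the sample output.
    for runtime, noise in [(52200000, 1031), (53100000, 5),
                           (53100000, 7), (53100000, 8274)]:
        print(cpu_available_pct(runtime, noise))
```

With rounding instead of truncation the rows for CPU 2 and CPU 3 would end in 9 and 2 rather than the printed 8 and 1, which is why truncation is assumed here.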
Link: https://lkml.kernel.org/r/0d796993abf587ae5a170bb8415c49368d4999e1.1639158831.git.bristot@kernel.org
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: linux-rt-users@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Add bpf trampoline support for arm64. Most of the logic is the same as x86.

fentry before bpf trampoline hooked:
 mov x9, x30
 nop

fentry after bpf trampoline hooked:
 mov x9, x30
 bl <bpf_trampoline>

Tested on qemu, result:
 #55 fentry_fexit:OK
 #56 fentry_test:OK
 #58 fexit_sleep:OK
 #59 fexit_stress:OK
 #60 fexit_test:OK
 #67 get_func_args_test:OK
 #68 get_func_ip_test:OK
 #101 modify_return:OK

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
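Hooking the trampoline amounts to replacing the `nop` at the fentry site with a `bl` to the trampoline. The sketch below shows how such a `bl` word is encoded on AArch64 (opcode `0x94000000` plus a 26-bit signed word offset, which limits the reach to +/-128 MiB from the call site). The helper names and the addresses are made up for illustration; the kernel performs the patching through its own instruction-patching code, which is not reproduced here.

```python
MOV_X9_X30 = 0xAA1E03E9  # encoding of "mov x9, x30"
NOP = 0xD503201F         # encoding of "nop"
BL_RANGE = 1 << 27       # bl reaches +/-128 MiB from the call site


def encode_bl(pc, target):
    """Encode "bl target" for an instruction located at address pc."""
    offset = target - pc
    if offset % 4:
        raise ValueError("branch offset must be a multiple of 4")
    if not -BL_RANGE <= offset < BL_RANGE:
        raise ValueError("target is out of range for a single bl")
    return 0x94000000 | ((offset >> 2) & 0x03FFFFFF)


def patch_fentry(site, site_pc, trampoline):
    """Return the patched two-instruction fentry site.

    `site` holds the two instruction words at `site_pc`; the nop is the
    second word, so the bl is encoded relative to `site_pc + 4`.
    """
    if list(site) != [MOV_X9_X30, NOP]:
        raise ValueError("unexpected instructions at patch site")
    return [MOV_X9_X30, encode_bl(site_pc + 4, trampoline)]


if __name__ == "__main__":
    # Hypothetical addresses, chosen only to keep the numbers readable.
    patched = patch_fentry([MOV_X9_X30, NOP], 0x2000, 0x3004)
    print([hex(word) for word in patched])
```

A target further away than the `bl` range cannot be reached with this single-instruction patch, which is the case the range check rejects.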
Add bpf trampoline support for arm64. Most of the logic is the same as x86.

fentry before bpf trampoline hooked:
 mov x9, x30
 nop

fentry after bpf trampoline hooked:
 mov x9, x30
 bl <bpf_trampoline>

Tested on qemu, result:
 #55 fentry_fexit:OK
 #56 fentry_test:OK
 #58 fexit_sleep:OK
 #59 fexit_stress:OK
 #60 fexit_test:OK
 #67 get_func_args_test:OK
 #68 get_func_ip_test:OK
 #101 modify_return:OK

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
Add bpf trampoline support for arm64. Most of the logic is the same as x86.

fentry before bpf trampoline hooked:
 mov x9, x30
 nop

fentry after bpf trampoline hooked:
 mov x9, x30
 bl <bpf_trampoline>

Tested on qemu, result:
 #18 bpf_tcp_ca:OK
 #51 dummy_st_ops:OK
 #55 fentry_fexit:OK
 #56 fentry_test:OK
 #57 fexit_bpf2bpf:OK
 #58 fexit_sleep:OK
 #59 fexit_stress:OK
 #60 fexit_test:OK
 #67 get_func_args_test:OK
 #68 get_func_ip_test:OK
 #101 modify_return:OK
 #233 xdp_bpf2bpf:OK

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
Add bpf trampoline support for arm64. Most of the logic is the same as x86.

Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump),
result:
 #9/1 bpf_cookie/kprobe:OK
 #9/2 bpf_cookie/multi_kprobe_link_api:FAIL
 #9/3 bpf_cookie/multi_kprobe_attach_api:FAIL
 #9/4 bpf_cookie/uprobe:OK
 #9/5 bpf_cookie/tracepoint:OK
 #9/6 bpf_cookie/perf_event:OK
 #9/7 bpf_cookie/trampoline:OK
 #9/8 bpf_cookie/lsm:OK
 #9 bpf_cookie:FAIL
 #18/1 bpf_tcp_ca/dctcp:OK
 #18/2 bpf_tcp_ca/cubic:OK
 #18/3 bpf_tcp_ca/invalid_license:OK
 #18/4 bpf_tcp_ca/dctcp_fallback:OK
 #18/5 bpf_tcp_ca/rel_setsockopt:OK
 #18 bpf_tcp_ca:OK
 #51/1 dummy_st_ops/dummy_st_ops_attach:OK
 #51/2 dummy_st_ops/dummy_init_ret_value:OK
 #51/3 dummy_st_ops/dummy_init_ptr_arg:OK
 #51/4 dummy_st_ops/dummy_multiple_args:OK
 #51 dummy_st_ops:OK
 #55 fentry_fexit:OK
 #56 fentry_test:OK
 #57/1 fexit_bpf2bpf/target_no_callees:OK
 #57/2 fexit_bpf2bpf/target_yes_callees:OK
 #57/3 fexit_bpf2bpf/func_replace:OK
 #57/4 fexit_bpf2bpf/func_replace_verify:OK
 #57/5 fexit_bpf2bpf/func_sockmap_update:OK
 #57/6 fexit_bpf2bpf/func_replace_return_code:OK
 #57/7 fexit_bpf2bpf/func_map_prog_compatibility:OK
 #57/8 fexit_bpf2bpf/func_replace_multi:OK
 #57/9 fexit_bpf2bpf/fmod_ret_freplace:OK
 #57 fexit_bpf2bpf:OK
 #58 fexit_sleep:OK
 #59 fexit_stress:OK
 #60 fexit_test:OK
 #67 get_func_args_test:OK
 #68 get_func_ip_test:OK
 #104 modify_return:OK
 #237 xdp_bpf2bpf:OK

bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api
failed due to lack of multi_kprobe on arm64.

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while
the virtio-net driver is still probing with rtnl_lock() held; this
causes a recursive mutex acquisition in netdev_notify_peers().

Fix it by temporarily saving the announce status while probing. Then, in
virtnet_open(), if a delayed announce work is pending, schedule
virtnet_config_changed_work().

Another possible solution is to check rtnl_is_locked() directly and call
__netdev_notify_peers(), but that means we would need to rely on
netdev_queue to schedule the arp packets after ndo_open(), which we thought
is not very intuitive.

We've observed a softlockup with Ubuntu 24.04, and it can be reproduced with
QEMU sending the announce_self rapidly while booting.

[ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds.
[ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu
[ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000
[ 494.168260] Call Trace:
[ 494.168329] <TASK>
[ 494.168389] __schedule+0x27c/0x6b0
[ 494.168495] schedule+0x33/0x110
[ 494.168585] schedule_preempt_disabled+0x15/0x30
[ 494.168709] __mutex_lock.constprop.0+0x42f/0x740
[ 494.168835] __mutex_lock_slowpath+0x13/0x20
[ 494.168949] mutex_lock+0x3c/0x50
[ 494.169039] rtnl_lock+0x15/0x20
[ 494.169128] netdev_notify_peers+0x12/0x30
[ 494.169240] virtnet_config_changed_work+0x152/0x1a0
[ 494.169377] virtnet_probe+0xa48/0xe00
[ 494.169484] ? vp_get+0x4d/0x100
[ 494.169574] virtio_dev_probe+0x1e9/0x310
[ 494.169682] really_probe+0x1c7/0x410
[ 494.169783] __driver_probe_device+0x8c/0x180
[ 494.169901] driver_probe_device+0x24/0xd0
[ 494.170011] __driver_attach+0x10b/0x210
[ 494.170117] ? __pfx___driver_attach+0x10/0x10
[ 494.170237] bus_for_each_dev+0x8d/0xf0
[ 494.170341] driver_attach+0x1e/0x30
[ 494.170440] bus_add_driver+0x14e/0x290
[ 494.170548] driver_register+0x5e/0x130
[ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10
[ 494.170788] register_virtio_driver+0x20/0x40
[ 494.170905] virtio_net_driver_init+0x97/0xb0
[ 494.171022] do_one_initcall+0x5e/0x340
[ 494.171128] do_initcalls+0x107/0x230
[ 494.171228] ? __pfx_kernel_init+0x10/0x10
[ 494.171340] kernel_init_freeable+0x134/0x210
[ 494.171462] kernel_init+0x1b/0x200
[ 494.171560] ret_from_fork+0x47/0x70
[ 494.171659] ? __pfx_kernel_init+0x10/0x10
[ 494.171769] ret_from_fork_asm+0x1b/0x30
[ 494.171875] </TASK>

Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Signed-off-by: NipaLocal <nipa@local>
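The locking problem and the deferral described in this commit message can be modelled in a few lines. This is a toy sketch with hypothetical names, not driver code: a non-reentrant `threading.Lock` stands in for the rtnl mutex, a non-blocking acquire stands in for the hang so the example terminates, and `open()` handles the pending announce directly where the kernel would schedule work.

```python
import threading


class ToyNetDriver:
    """Toy model of the announce-during-probe problem; names are hypothetical."""

    def __init__(self):
        self.rtnl = threading.Lock()  # non-reentrant, like a kernel mutex
        self.opened = False
        self.announce_pending = False
        self.announced = 0

    def notify_peers(self):
        # Takes the lock itself, so it must not be called with it held.
        if not self.rtnl.acquire(blocking=False):
            raise RuntimeError("would deadlock: lock is already held")
        try:
            self.announced += 1
        finally:
            self.rtnl.release()

    def config_changed(self, defer):
        # With defer=True, an announce that arrives before open() is only
        # recorded; with defer=False it is handled on the spot.
        if defer and not self.opened:
            self.announce_pending = True
            return
        self.notify_peers()

    def probe(self, announce_during_probe, defer):
        with self.rtnl:  # probing runs with the lock held
            if announce_during_probe:
                self.config_changed(defer)

    def open(self):
        self.opened = True
        if self.announce_pending:
            self.announce_pending = False
            self.config_changed(defer=True)


if __name__ == "__main__":
    fixed = ToyNetDriver()
    fixed.probe(announce_during_probe=True, defer=True)
    fixed.open()
    print(fixed.announced)
```

Without deferral, the announce inside `probe()` tries to take the lock that `probe()` already holds, which is the situation the hung-task trace shows.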
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() held; this causes a recursive mutex acquisition in netdev_notify_peers().

Fix it by temporarily saving the announce status while probing; then, in virtnet_open(), if a delayed announce is pending, schedule virtnet_config_changed_work().

Another possible solution is to check rtnl_is_locked() directly and call __netdev_notify_peers(), but that would mean relying on netdev_queue to schedule the ARP packets after ndo_open(), which we thought was not very intuitive.

We've observed a softlockup with Ubuntu 24.04, and it can be reproduced with QEMU sending announce_self rapidly while booting.

[ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds.
[ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu
[ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000
[ 494.168260] Call Trace:
[ 494.168329] <TASK>
[ 494.168389] __schedule+0x27c/0x6b0
[ 494.168495] schedule+0x33/0x110
[ 494.168585] schedule_preempt_disabled+0x15/0x30
[ 494.168709] __mutex_lock.constprop.0+0x42f/0x740
[ 494.168835] __mutex_lock_slowpath+0x13/0x20
[ 494.168949] mutex_lock+0x3c/0x50
[ 494.169039] rtnl_lock+0x15/0x20
[ 494.169128] netdev_notify_peers+0x12/0x30
[ 494.169240] virtnet_config_changed_work+0x152/0x1a0
[ 494.169377] virtnet_probe+0xa48/0xe00
[ 494.169484] ? vp_get+0x4d/0x100
[ 494.169574] virtio_dev_probe+0x1e9/0x310
[ 494.169682] really_probe+0x1c7/0x410
[ 494.169783] __driver_probe_device+0x8c/0x180
[ 494.169901] driver_probe_device+0x24/0xd0
[ 494.170011] __driver_attach+0x10b/0x210
[ 494.170117] ? __pfx___driver_attach+0x10/0x10
[ 494.170237] bus_for_each_dev+0x8d/0xf0
[ 494.170341] driver_attach+0x1e/0x30
[ 494.170440] bus_add_driver+0x14e/0x290
[ 494.170548] driver_register+0x5e/0x130
[ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10
[ 494.170788] register_virtio_driver+0x20/0x40
[ 494.170905] virtio_net_driver_init+0x97/0xb0
[ 494.171022] do_one_initcall+0x5e/0x340
[ 494.171128] do_initcalls+0x107/0x230
[ 494.171228] ? __pfx_kernel_init+0x10/0x10
[ 494.171340] kernel_init_freeable+0x134/0x210
[ 494.171462] kernel_init+0x1b/0x200
[ 494.171560] ret_from_fork+0x47/0x70
[ 494.171659] ? __pfx_kernel_init+0x10/0x10
[ 494.171769] ret_from_fork_asm+0x1b/0x30
[ 494.171875] </TASK>

Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Signed-off-by: NipaLocal <nipa@local>
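The deferral described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the actual patch: every name here (`struct model_dev`, `model_probe`, `model_open`, `model_run_work`, `run_scenario`) is hypothetical, the lock is modeled as a plain flag, and a "deadlock" is counted instead of actually blocking. It shows why notifying peers from the probe path self-deadlocks, and how saving the request and replaying it from a worker after open avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names come from the driver. */
struct model_dev {
    bool rtnl_held;        /* models the RTNL mutex being held */
    bool probing;          /* true while the probe path runs */
    bool announce_pending; /* announce request saved during probe */
    bool work_scheduled;   /* deferred work queued by the open path */
    int announces_sent;    /* peer notifications that went out */
    int deadlocks;         /* times the lock was requested while held */
};

/* Models a notifier that takes the lock itself (non-recursive). */
static void model_notify_peers(struct model_dev *d)
{
    if (d->rtnl_held) {
        d->deadlocks++;    /* a real task would block forever here */
        return;
    }
    d->rtnl_held = true;
    d->announces_sent++;
    d->rtnl_held = false;
}

/* Models the config-changed handler reacting to an announce request. */
static void model_config_changed(struct model_dev *d, bool defer_fix)
{
    if (defer_fix && d->probing) {
        d->announce_pending = true; /* save it instead of notifying now */
        return;
    }
    model_notify_peers(d);
}

/* Models probe: runs with the lock held, as in the trace above. */
static void model_probe(struct model_dev *d, bool announce, bool defer_fix)
{
    d->rtnl_held = true;
    d->probing = true;
    if (announce)
        model_config_changed(d, defer_fix);
    d->probing = false;
    d->rtnl_held = false;
}

/* Models open: assumed to run under the lock, so it only schedules work. */
static void model_open(struct model_dev *d)
{
    d->rtnl_held = true;
    if (d->announce_pending) {
        d->announce_pending = false;
        d->work_scheduled = true;
    }
    d->rtnl_held = false;
}

/* Models the worker: runs later, without the lock held. */
static void model_run_work(struct model_dev *d)
{
    if (!d->work_scheduled)
        return;
    d->work_scheduled = false;
    assert(!d->rtnl_held);
    model_config_changed(d, true);
}

/* Runs probe, open, and the worker once; returns announces sent. */
int run_scenario(bool defer_fix, int *deadlocks)
{
    struct model_dev d = {0};

    model_probe(&d, true, defer_fix);
    model_open(&d);
    model_run_work(&d);
    *deadlocks = d.deadlocks;
    return d.announces_sent;
}
```

Calling `run_scenario(false, &n)` models the pre-fix behavior (the announce is attempted during probe and hits the held lock), while `run_scenario(true, &n)` models the deferred path, where the announce is sent exactly once from the worker.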
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() held. netdev_notify_peers() then tries to take rtnl_lock() again, and the task deadlocks on the recursive acquisition.

Fix it by temporarily saving the announce status while probing. Then, in virtnet_open(), if a delayed announce is pending, schedule virtnet_config_changed_work().

Another possible solution is to check rtnl_is_locked() directly and call __netdev_notify_peers(), but that would mean relying on netdev_queue to schedule the ARP packets after ndo_open(), which we thought was not very intuitive.

We've observed a softlockup with Ubuntu 24.04; it can be reproduced with QEMU sending announce_self rapidly while booting.

[ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds.
[ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu
[ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000
[ 494.168260] Call Trace:
[ 494.168329] <TASK>
[ 494.168389] __schedule+0x27c/0x6b0
[ 494.168495] schedule+0x33/0x110
[ 494.168585] schedule_preempt_disabled+0x15/0x30
[ 494.168709] __mutex_lock.constprop.0+0x42f/0x740
[ 494.168835] __mutex_lock_slowpath+0x13/0x20
[ 494.168949] mutex_lock+0x3c/0x50
[ 494.169039] rtnl_lock+0x15/0x20
[ 494.169128] netdev_notify_peers+0x12/0x30
[ 494.169240] virtnet_config_changed_work+0x152/0x1a0
[ 494.169377] virtnet_probe+0xa48/0xe00
[ 494.169484] ? vp_get+0x4d/0x100
[ 494.169574] virtio_dev_probe+0x1e9/0x310
[ 494.169682] really_probe+0x1c7/0x410
[ 494.169783] __driver_probe_device+0x8c/0x180
[ 494.169901] driver_probe_device+0x24/0xd0
[ 494.170011] __driver_attach+0x10b/0x210
[ 494.170117] ? __pfx___driver_attach+0x10/0x10
[ 494.170237] bus_for_each_dev+0x8d/0xf0
[ 494.170341] driver_attach+0x1e/0x30
[ 494.170440] bus_add_driver+0x14e/0x290
[ 494.170548] driver_register+0x5e/0x130
[ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10
[ 494.170788] register_virtio_driver+0x20/0x40
[ 494.170905] virtio_net_driver_init+0x97/0xb0
[ 494.171022] do_one_initcall+0x5e/0x340
[ 494.171128] do_initcalls+0x107/0x230
[ 494.171228] ? __pfx_kernel_init+0x10/0x10
[ 494.171340] kernel_init_freeable+0x134/0x210
[ 494.171462] kernel_init+0x1b/0x200
[ 494.171560] ret_from_fork+0x47/0x70
[ 494.171659] ? __pfx_kernel_init+0x10/0x10
[ 494.171769] ret_from_fork_asm+0x1b/0x30
[ 494.171875] </TASK>

Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Signed-off-by: NipaLocal <nipa@local>
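The deferral described in this commit message can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the driver's code: every name here (`model_dev`, `model_probe`, `model_open`, `model_run_worker`, and the fields) is hypothetical, and the non-recursive rtnl mutex is modelled by a plain boolean so that a would-be deadlock shows up as an error return instead of a hang.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_dev {
    bool rtnl_held;        /* stands in for rtnl_lock() being held */
    bool probing;          /* true while the probe path is running */
    bool announce_pending; /* announce seen during probe, not yet handled */
    bool work_queued;      /* config-changed work scheduled for later */
    int announces_sent;    /* number of completed peer notifications */
};

/* Models netdev_notify_peers(): it takes the lock itself, so calling it
 * with the lock already held is the recursive acquisition from the trace. */
static int model_notify_peers(struct model_dev *d)
{
    if (d->rtnl_held)
        return -1; /* a real non-recursive mutex would block forever here */
    d->rtnl_held = true;
    d->announces_sent++;
    d->rtnl_held = false;
    return 0;
}

/* Models the config-changed handler: during probe, only record the request. */
static int model_config_changed(struct model_dev *d, bool announce)
{
    if (!announce)
        return 0;
    if (d->probing) {
        d->announce_pending = true; /* defer instead of notifying now */
        return 0;
    }
    return model_notify_peers(d);
}

/* Models probe running with the lock held while an announce arrives. */
static int model_probe(struct model_dev *d, bool announce_during_probe)
{
    int ret;

    d->rtnl_held = true;
    d->probing = true;
    ret = model_config_changed(d, announce_during_probe);
    d->probing = false;
    d->rtnl_held = false;
    return ret;
}

/* Models open: it also runs with the lock held, so it only schedules work. */
static void model_open(struct model_dev *d)
{
    d->rtnl_held = true;
    if (d->announce_pending)
        d->work_queued = true;
    d->rtnl_held = false;
}

/* Models the worker running later, outside the lock, where notifying is safe. */
static int model_run_worker(struct model_dev *d)
{
    if (!d->work_queued)
        return 0;
    d->work_queued = false;
    if (d->announce_pending) {
        d->announce_pending = false;
        return model_notify_peers(d);
    }
    return 0;
}
```

In this model the announce is never issued from the probe path. It is recorded, picked up by the open path, and delivered only once the worker runs without the lock held, which mirrors the ordering the commit message describes.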
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? 
__pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() held; this causes a recursive mutex acquisition in netdev_notify_peers().

Fix it by temporarily saving the announce status while probing; then, in virtnet_open(), if a delayed announce is pending, schedule virtnet_config_changed_work().

Another possible solution is to check rtnl_is_locked() directly and call __netdev_notify_peers(), but that would mean relying on netdev_queue to schedule the ARP packets after ndo_open(), which we thought was not very intuitive.

We've observed a softlockup with Ubuntu 24.04; it can be reproduced with QEMU sending announce_self rapidly while booting.

[ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds.
[ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu
[ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000
[ 494.168260] Call Trace:
[ 494.168329] <TASK>
[ 494.168389] __schedule+0x27c/0x6b0
[ 494.168495] schedule+0x33/0x110
[ 494.168585] schedule_preempt_disabled+0x15/0x30
[ 494.168709] __mutex_lock.constprop.0+0x42f/0x740
[ 494.168835] __mutex_lock_slowpath+0x13/0x20
[ 494.168949] mutex_lock+0x3c/0x50
[ 494.169039] rtnl_lock+0x15/0x20
[ 494.169128] netdev_notify_peers+0x12/0x30
[ 494.169240] virtnet_config_changed_work+0x152/0x1a0
[ 494.169377] virtnet_probe+0xa48/0xe00
[ 494.169484] ? vp_get+0x4d/0x100
[ 494.169574] virtio_dev_probe+0x1e9/0x310
[ 494.169682] really_probe+0x1c7/0x410
[ 494.169783] __driver_probe_device+0x8c/0x180
[ 494.169901] driver_probe_device+0x24/0xd0
[ 494.170011] __driver_attach+0x10b/0x210
[ 494.170117] ? __pfx___driver_attach+0x10/0x10
[ 494.170237] bus_for_each_dev+0x8d/0xf0
[ 494.170341] driver_attach+0x1e/0x30
[ 494.170440] bus_add_driver+0x14e/0x290
[ 494.170548] driver_register+0x5e/0x130
[ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10
[ 494.170788] register_virtio_driver+0x20/0x40
[ 494.170905] virtio_net_driver_init+0x97/0xb0
[ 494.171022] do_one_initcall+0x5e/0x340
[ 494.171128] do_initcalls+0x107/0x230
[ 494.171228] ? __pfx_kernel_init+0x10/0x10
[ 494.171340] kernel_init_freeable+0x134/0x210
[ 494.171462] kernel_init+0x1b/0x200
[ 494.171560] ret_from_fork+0x47/0x70
[ 494.171659] ? __pfx_kernel_init+0x10/0x10
[ 494.171769] ret_from_fork_asm+0x1b/0x30
[ 494.171875] </TASK>

Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Signed-off-by: NipaLocal <nipa@local>
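The deferral described in this commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: every name below (`struct model_dev`, `model_config_changed_work()`, `model_open()`, and so on) is hypothetical, the lock is modelled as a plain boolean, and the "workqueue" is a flag that the caller runs by hand. It shows the idea of remembering an announce request that arrives during probe and only acting on it once the work can run without the lock held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	bool rtnl_held;        /* models the caller holding rtnl_lock() */
	bool probing;          /* true until probe has finished */
	bool announce_pending; /* announce seen during probe, not yet handled */
	bool work_scheduled;   /* models the config-changed work being queued */
	int peers_notified;    /* number of successful peer notifications */
	int deadlocks;         /* number of times the lock would have been re-taken */
};

/* Models a notify helper that takes the lock itself: re-entry is a deadlock. */
static void model_notify_peers(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->rtnl_held) {
		dev->deadlocks++; /* the real task would block forever here */
		return;
	}
	dev->rtnl_held = true;
	dev->peers_notified++;
	dev->rtnl_held = false;
}

/* Models the config-changed handler for an announce request. */
static void model_config_changed_work(struct model_dev *dev)
{
	if (dev->probing) {
		dev->announce_pending = true; /* defer instead of notifying now */
		return;
	}
	model_notify_peers(dev);
}

/* Models the end of probe: the probe path drops the lock. */
static void model_finish_probe(struct model_dev *dev)
{
	dev->probing = false;
	dev->rtnl_held = false;
}

/* Models open: a saved announce only schedules the work, it does not run it. */
static void model_open(struct model_dev *dev)
{
	if (dev->announce_pending) {
		dev->announce_pending = false;
		dev->work_scheduled = true;
	}
}

/* Models the worker later running the scheduled work without the lock held. */
static void model_run_scheduled_work(struct model_dev *dev)
{
	if (dev->work_scheduled) {
		dev->work_scheduled = false;
		model_config_changed_work(dev);
	}
}
```

In this model, calling `model_config_changed_work()` on a device with `probing` and `rtnl_held` set only records the request; after `model_finish_probe()`, `model_open()` and `model_run_scheduled_work()`, the peers are notified once and the deadlock counter stays at zero.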
This will allow us to use common functions soon.

Note this generates the following warnings from scripts/checkpatch.pl --quiet:

WARNING: quoted string split across lines
#59: FILE: fs/smb/client/cifs_debug.c:481:
+ seq_printf(m, "\nDebug count_get_receive_buffer: %llu "
+ "count_put_receive_buffer: %llu count_send_empty: %llu",

WARNING: quoted string split across lines
#66: FILE: fs/smb/client/cifs_debug.c:486:
  seq_printf(m, "\nRead Queue "
+ "count_enqueue_reassembly_queue: %llu "

WARNING: quoted string split across lines
#67: FILE: fs/smb/client/cifs_debug.c:487:
+ "count_enqueue_reassembly_queue: %llu "
+ "count_dequeue_reassembly_queue: %llu "

total: 0 errors, 3 warnings, 83 lines checked
scripts/checkpatch.pl: FAILED

But I left them in there, because it matches the code around it...

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull request for series with
subject: selftests/bpf: merge most of test_btf into test_progs
version: 2
url: https://patchwork.ozlabs.org/project/netdev/list/?series=201720