-
Notifications
You must be signed in to change notification settings - Fork 151
bpf: Support multi-attach for freplace programs #61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Master branch: d317b0a |
In preparation for moving code around, change a bunch of references to env->log (and the verbose() logging helper) to use bpf_log() and a direct pointer to struct bpf_verifier_log. While we're touching the function signature, mark the 'prog' argument to bpf_check_type_match() as const. Also enhance the bpf_verifier_log_needed() check to handle NULL pointers for the log struct so we can re-use the code with logging disabled. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> --- include/linux/bpf.h | 2 +- include/linux/bpf_verifier.h | 5 +++- kernel/bpf/btf.c | 6 +++-- kernel/bpf/verifier.c | 48 +++++++++++++++++++++--------------------- 4 files changed, 31 insertions(+), 30 deletions(-)
The check_attach_btf_id() function really does three things: 1. It performs a bunch of checks on the program to ensure that the attachment is valid. 2. It stores a bunch of state about the attachment being requested in the verifier environment and struct bpf_prog objects. 3. It allocates a trampoline for the attachment. This patch splits out (1.) and (3.) into separate functions in preparation for reusing them when the actual attachment is happening (in the raw_tracepoint_open syscall operation), which will allow tracing programs to have multiple (compatible) attachments. No functional change is intended with this patch. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> --- include/linux/bpf.h | 7 + include/linux/bpf_verifier.h | 9 ++ kernel/bpf/trampoline.c | 20 ++++ kernel/bpf/verifier.c | 197 ++++++++++++++++++++++++------------------ 4 files changed, 149 insertions(+), 84 deletions(-)
In preparation for allowing multiple attachments of freplace programs, move the references to the target program and trampoline into the bpf_tracing_link structure when that is created. To do this atomically, introduce a new mutex in prog->aux to protect writing to the two pointers to target prog and trampoline, and rename the members to make it clear that they are related. With this change, it is no longer possible to attach the same tracing program multiple times (detaching in-between), since the reference from the tracing program to the target disappears on the first attach. However, since the next patch will let the caller supply an attach target, that will also make it possible to attach to the same place multiple times. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> --- include/linux/bpf.h | 15 +++++++++------ kernel/bpf/btf.c | 6 +++--- kernel/bpf/core.c | 9 ++++++--- kernel/bpf/syscall.c | 46 ++++++++++++++++++++++++++++++++++++++++------ kernel/bpf/trampoline.c | 12 ++++-------- kernel/bpf/verifier.c | 9 +++++---- 6 files changed, 67 insertions(+), 30 deletions(-)
This enables support for attaching freplace programs to multiple attach points. It does this by amending the UAPI for bpf_link_Create with a target btf ID that can be used to supply the new attachment point along with the target program fd. The target must be compatible with the target that was supplied at program load time. The implementation reuses the checks that were factored out of check_attach_btf_id() to ensure compatibility between the BTF types of the old and new attachment. If these match, a new bpf_tracing_link will be created for the new attach target, allowing multiple attachments to co-exist simultaneously. The code could theoretically support multiple-attach of other types of tracing programs as well, but since I don't have a use case for any of those, there is no API support for doing so. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> --- include/linux/bpf.h | 2 + include/uapi/linux/bpf.h | 2 + kernel/bpf/syscall.c | 92 ++++++++++++++++++++++++++++++++++------ kernel/bpf/verifier.c | 9 ++++ tools/include/uapi/linux/bpf.h | 2 + 5 files changed, 94 insertions(+), 13 deletions(-)
Eelco reported we can't properly access arguments if the tracing
program is attached to extension program.
Having following program:
SEC("classifier/test_pkt_md_access")
int test_pkt_md_access(struct __sk_buff *skb)
with its extension:
SEC("freplace/test_pkt_md_access")
int test_pkt_md_access_new(struct __sk_buff *skb)
and tracing that extension with:
SEC("fentry/test_pkt_md_access_new")
int BPF_PROG(fentry, struct sk_buff *skb)
It's not possible to access skb argument in the fentry program,
with following error from verifier:
; int BPF_PROG(fentry, struct sk_buff *skb)
0: (79) r1 = *(u64 *)(r1 +0)
invalid bpf_context access off=0 size=8
The problem is that btf_ctx_access gets the context type for the
traced program, which is in this case the extension.
But when we trace extension program, we want to get the context
type of the program that the extension is attached to, so we can
access the argument properly in the trace program.
This version of the patch is tweaked slightly from Jiri's original one,
since the refactoring in the previous patches means we have to get the
target prog type from the new variable in prog->aux instead of directly
from the target prog.
Reported-by: Eelco Chaudron <echaudro@redhat.com>
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
kernel/bpf/btf.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
This adds support for supplying a target btf ID for the bpf_link_create() operation, and adds a new bpf_program__attach_freplace() high-level API for attaching freplace functions with a target. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> --- tools/lib/bpf/bpf.c | 1 + tools/lib/bpf/bpf.h | 3 ++- tools/lib/bpf/libbpf.c | 24 ++++++++++++++++++------ tools/lib/bpf/libbpf.h | 3 +++ tools/lib/bpf/libbpf.map | 1 + 5 files changed, 25 insertions(+), 7 deletions(-)
This adds a selftest for attaching an freplace program to multiple targets simultaneously. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> --- .../selftests/bpf/prog_tests/fexit_bpf2bpf.c | 171 ++++++++++++++++---- .../selftests/bpf/progs/freplace_get_constant.c | 15 ++ 2 files changed, 154 insertions(+), 32 deletions(-) create mode 100644 tools/testing/selftests/bpf/progs/freplace_get_constant.c
Adding test that setup following program:
SEC("classifier/test_pkt_md_access")
int test_pkt_md_access(struct __sk_buff *skb)
with its extension:
SEC("freplace/test_pkt_md_access")
int test_pkt_md_access_new(struct __sk_buff *skb)
and tracing that extension with:
SEC("fentry/test_pkt_md_access_new")
int BPF_PROG(fentry, struct sk_buff *skb)
The test verifies that the tracing program can
dereference skb argument properly.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
tools/testing/selftests/bpf/prog_tests/trace_ext.c | 113 ++++++++++++++++++++
tools/testing/selftests/bpf/progs/test_trace_ext.c | 18 +++
.../selftests/bpf/progs/test_trace_ext_tracing.c | 25 ++++
3 files changed, 156 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_ext.c
create mode 100644 tools/testing/selftests/bpf/progs/test_trace_ext.c
create mode 100644 tools/testing/selftests/bpf/progs/test_trace_ext_tracing.c
f94162c to
a20fae3
Compare
|
Master branch: c64779e patch https://patchwork.ozlabs.org/project/netdev/patch/160017005805.98230.16775816864129373411.stgit@toke.dk/ applied successfully |
|
At least one diff in series https://patchwork.ozlabs.org/project/netdev/list/?series=201797 expired. Closing PR. |
The histogram logic was allowing events with char * pointers to be used as normal strings. But it was easy to crash the kernel with: # echo 'hist:keys=filename' > events/syscalls/sys_enter_openat/trigger And open some files, and boom! BUG: unable to handle page fault for address: 00007f2ced0c3280 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1173fa067 P4D 1173fa067 PUD 1171b6067 PMD 1171dd067 PTE 0 Oops: 0000 [#1] PREEMPT SMP CPU: 6 PID: 1810 Comm: cat Not tainted 5.13.0-rc5-test+ #61 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:strlen+0x0/0x20 Code: f6 82 80 2a 0b a9 20 74 11 0f b6 50 01 48 83 c0 01 f6 82 80 2a 0b a9 20 75 ef c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 RSP: 0018:ffffbdbf81567b50 EFLAGS: 00010246 RAX: 0000000000000003 RBX: ffff93815cdb3800 RCX: ffff9382401a22d0 RDX: 0000000000000100 RSI: 0000000000000000 RDI: 00007f2ced0c3280 RBP: 0000000000000100 R08: ffff9382409ff074 R09: ffffbdbf81567c98 R10: ffff9382409ff074 R11: 0000000000000000 R12: ffff9382409ff074 R13: 0000000000000001 R14: ffff93815a744f00 R15: 00007f2ced0c3280 FS: 00007f2ced0f8580(0000) GS:ffff93825a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2ced0c3280 CR3: 0000000107069005 CR4: 00000000001706e0 Call Trace: event_hist_trigger+0x463/0x5f0 ? find_held_lock+0x32/0x90 ? sched_clock_cpu+0xe/0xd0 ? lock_release+0x155/0x440 ? kernel_init_free_pages+0x6d/0x90 ? preempt_count_sub+0x9b/0xd0 ? kernel_init_free_pages+0x6d/0x90 ? get_page_from_freelist+0x12c4/0x1680 ? __rb_reserve_next+0xe5/0x460 ? ring_buffer_lock_reserve+0x12a/0x3f0 event_triggers_call+0x52/0xe0 ftrace_syscall_enter+0x264/0x2c0 syscall_trace_enter.constprop.0+0x1ee/0x210 do_syscall_64+0x1c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Where it triggered a fault on strlen(key) where key was the filename. The reason is that filename is a char * to user space, and the histogram code just blindly dereferenced it, with obvious bad results. I originally tried to use strncpy_from_user/kernel_nofault() but found that there's other places that its dereferenced and not worth the effort. Just do not allow "char *" to act like strings. Link: https://lkml.kernel.org/r/20210715000206.025df9d2@rorschach.local.home Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com> Cc: stable@vger.kernel.org Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Tom Zanussi <zanussi@kernel.org> Fixes: 79e577c ("tracing: Support string type key properly") Fixes: 5967bd5 ("tracing: Let filter_assign_type() detect FILTER_PTR_STRING") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Current imc-pmu code triggers a WARNING with CONFIG_DEBUG_ATOMIC_SLEEP
and CONFIG_PROVE_LOCKING enabled, while running a thread_imc event.
Command to trigger the warning:
# perf stat -e thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/ sleep 5
Performance counter stats for 'sleep 5':
0 thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/
5.002117947 seconds time elapsed
0.000131000 seconds user
0.001063000 seconds sys
Below is snippet of the warning in dmesg:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2869, name: perf-exec
preempt_count: 2, expected: 0
4 locks held by perf-exec/2869:
#0: c00000004325c540 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x64/0xa90
#1: c00000004325c5d8 (&sig->exec_update_lock){++++}-{3:3}, at: begin_new_exec+0x460/0xef0
#2: c0000003fa99d4e0 (&cpuctx_lock){-...}-{2:2}, at: perf_event_exec+0x290/0x510
#3: c000000017ab8418 (&ctx->lock){....}-{2:2}, at: perf_event_exec+0x29c/0x510
irq event stamp: 4806
hardirqs last enabled at (4805): [<c000000000f65b94>] _raw_spin_unlock_irqrestore+0x94/0xd0
hardirqs last disabled at (4806): [<c0000000003fae44>] perf_event_exec+0x394/0x510
softirqs last enabled at (0): [<c00000000013c404>] copy_process+0xc34/0x1ff0
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 36 PID: 2869 Comm: perf-exec Not tainted 6.2.0-rc2-00011-g1247637727f2 #61
Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV
Call Trace:
dump_stack_lvl+0x98/0xe0 (unreliable)
__might_resched+0x2f8/0x310
__mutex_lock+0x6c/0x13f0
thread_imc_event_add+0xf4/0x1b0
event_sched_in+0xe0/0x210
merge_sched_in+0x1f0/0x600
visit_groups_merge.isra.92.constprop.166+0x2bc/0x6c0
ctx_flexible_sched_in+0xcc/0x140
ctx_sched_in+0x20c/0x2a0
ctx_resched+0x104/0x1c0
perf_event_exec+0x340/0x510
begin_new_exec+0x730/0xef0
load_elf_binary+0x3f8/0x1e10
...
do not call blocking ops when !TASK_RUNNING; state=2001 set at [<00000000fd63e7cf>] do_nanosleep+0x60/0x1a0
WARNING: CPU: 36 PID: 2869 at kernel/sched/core.c:9912 __might_sleep+0x9c/0xb0
CPU: 36 PID: 2869 Comm: sleep Tainted: G W 6.2.0-rc2-00011-g1247637727f2 #61
Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV
NIP: c000000000194a1c LR: c000000000194a18 CTR: c000000000a78670
REGS: c00000004d2134e0 TRAP: 0700 Tainted: G W (6.2.0-rc2-00011-g1247637727f2)
MSR: 9000000000021033 <SF,HV,ME,IR,DR,RI,LE> CR: 48002824 XER: 00000000
CFAR: c00000000013fb64 IRQMASK: 1
The above warning triggered because the current imc-pmu code uses mutex
lock in interrupt disabled sections. The function mutex_lock()
internally calls __might_resched(), which will check if IRQs are
disabled and in case IRQs are disabled, it will trigger the warning.
Fix the issue by changing the mutex lock to spinlock.
Fixes: 8f95faa ("powerpc/powernv: Detect and create IMC device")
Reported-by: Michael Petlan <mpetlan@redhat.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
[mpe: Fix comments, trim oops in change log, add reported-by tags]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230106065157.182648-1-kjain@linux.ibm.com
Currently, test_progs outputs all stdout/stderr as it runs, and when it
is done, prints a summary.
It is non-trivial for tooling to parse that output and extract meaningful
information from it.
This change adds a new option, `--json-summary`/`-J` that let the caller
specify a file where `test_progs{,-no_alu32}` can write a summary of the
run in a json format that can later be parsed by tooling.
Currently, it creates a summary section with successes/skipped/failures
followed by a list of failed tests and subtests.
A test contains the following fields:
- name: the name of the test
- number: the number of the test
- message: the log message that was printed by the test.
- failed: A boolean indicating whether the test failed or not. Currently
we only output failed tests, but in the future, successful tests could
be added.
- subtests: A list of subtests associated with this test.
A subtest contains the following fields:
- name: same as above
- number: sanme as above
- message: the log message that was printed by the subtest.
- failed: same as above but for the subtest
An example run and json content below:
```
$ sudo ./test_progs -a $(grep -v '^#' ./DENYLIST.aarch64 | awk '{print
$1","}' | tr -d '\n') -j -J /tmp/test_progs.json
$ jq < /tmp/test_progs.json | head -n 30
{
"success": 29,
"success_subtest": 23,
"skipped": 3,
"failed": 28,
"results": [
{
"name": "bpf_cookie",
"number": 10,
"message": "test_bpf_cookie:PASS:skel_open 0 nsec\n",
"failed": true,
"subtests": [
{
"name": "multi_kprobe_link_api",
"number": 2,
"message": "kprobe_multi_link_api_subtest:PASS:load_kallsyms 0
nsec\nlibbpf: extern 'bpf_testmod_fentry_test1' (strong): not
resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed
to load BPF skeleton 'kprobe_multi':
-3\nkprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected
error: -3\n",
"failed": true
},
{
"name": "multi_kprobe_attach_api",
"number": 3,
"message": "libbpf: extern 'bpf_testmod_fentry_test1'
(strong): not resolved\nlibbpf: failed to load object
'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi':
-3\nkprobe_multi_attach_api_subtest:FAIL:fentry_raw_skel_load unexpected
error: -3\n",
"failed": true
},
{
"name": "lsm",
"number": 8,
"message": "lsm_subtest:PASS:lsm.link_create 0
nsec\nlsm_subtest:FAIL:stack_mprotect unexpected stack_mprotect: actual
0 != expected -1\n",
"failed": true
}
```
The file can then be used to print a summary of the test run and list of
failing tests/subtests:
```
$ jq -r < /tmp/test_progs.json '"Success:
\(.success)/\(.success_subtest), Skipped: \(.skipped), Failed:
\(.failed)"'
Success: 29/23, Skipped: 3, Failed: 28
$ jq -r < /tmp/test_progs.json '.results | map([
if .failed then "#\(.number) \(.name)" else empty end,
(
. as {name: $tname, number: $tnum} | .subtests | map(
if .failed then "#\($tnum)/\(.number) \($tname)/\(.name)"
else empty end
)
)
]) | flatten | .[]' | head -n 20
#10 bpf_cookie
#10/2 bpf_cookie/multi_kprobe_link_api
#10/3 bpf_cookie/multi_kprobe_attach_api
#10/8 bpf_cookie/lsm
#15 bpf_mod_race
#15/1 bpf_mod_race/ksym (used_btfs UAF)
#15/2 bpf_mod_race/kfunc (kfunc_btf_tab UAF)
#36 cgroup_hierarchical_stats
#61 deny_namespace
#61/1 deny_namespace/unpriv_userns_create_no_bpf
#73 fexit_stress
#83 get_func_ip_test
#99 kfunc_dynptr_param
#99/1 kfunc_dynptr_param/dynptr_data_null
#99/4 kfunc_dynptr_param/dynptr_data_null
#100 kprobe_multi_bench_attach
#100/1 kprobe_multi_bench_attach/kernel
#100/2 kprobe_multi_bench_attach/modules
#101 kprobe_multi_test
#101/1 kprobe_multi_test/skel_api
```
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Currently, test_progs outputs all stdout/stderr as it runs, and when it
is done, prints a summary.
It is non-trivial for tooling to parse that output and extract meaningful
information from it.
This change adds a new option, `--json-summary`/`-J` that let the caller
specify a file where `test_progs{,-no_alu32}` can write a summary of the
run in a json format that can later be parsed by tooling.
Currently, it creates a summary section with successes/skipped/failures
followed by a list of failed tests and subtests.
A test contains the following fields:
- name: the name of the test
- number: the number of the test
- message: the log message that was printed by the test.
- failed: A boolean indicating whether the test failed or not. Currently
we only output failed tests, but in the future, successful tests could
be added.
- subtests: A list of subtests associated with this test.
A subtest contains the following fields:
- name: same as above
- number: sanme as above
- message: the log message that was printed by the subtest.
- failed: same as above but for the subtest
An example run and json content below:
```
$ sudo ./test_progs -a $(grep -v '^#' ./DENYLIST.aarch64 | awk '{print
$1","}' | tr -d '\n') -j -J /tmp/test_progs.json
$ jq < /tmp/test_progs.json | head -n 30
{
"success": 29,
"success_subtest": 23,
"skipped": 3,
"failed": 28,
"results": [
{
"name": "bpf_cookie",
"number": 10,
"message": "test_bpf_cookie:PASS:skel_open 0 nsec\n",
"failed": true,
"subtests": [
{
"name": "multi_kprobe_link_api",
"number": 2,
"message": "kprobe_multi_link_api_subtest:PASS:load_kallsyms 0
nsec\nlibbpf: extern 'bpf_testmod_fentry_test1' (strong): not
resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed
to load BPF skeleton 'kprobe_multi':
-3\nkprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected
error: -3\n",
"failed": true
},
{
"name": "multi_kprobe_attach_api",
"number": 3,
"message": "libbpf: extern 'bpf_testmod_fentry_test1'
(strong): not resolved\nlibbpf: failed to load object
'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi':
-3\nkprobe_multi_attach_api_subtest:FAIL:fentry_raw_skel_load unexpected
error: -3\n",
"failed": true
},
{
"name": "lsm",
"number": 8,
"message": "lsm_subtest:PASS:lsm.link_create 0
nsec\nlsm_subtest:FAIL:stack_mprotect unexpected stack_mprotect: actual
0 != expected -1\n",
"failed": true
}
```
The file can then be used to print a summary of the test run and list of
failing tests/subtests:
```
$ jq -r < /tmp/test_progs.json '"Success:
\(.success)/\(.success_subtest), Skipped: \(.skipped), Failed:
\(.failed)"'
Success: 29/23, Skipped: 3, Failed: 28
$ jq -r < /tmp/test_progs.json '.results | map([
if .failed then "#\(.number) \(.name)" else empty end,
(
. as {name: $tname, number: $tnum} | .subtests | map(
if .failed then "#\($tnum)/\(.number) \($tname)/\(.name)"
else empty end
)
)
]) | flatten | .[]' | head -n 20
#10 bpf_cookie
#10/2 bpf_cookie/multi_kprobe_link_api
#10/3 bpf_cookie/multi_kprobe_attach_api
#10/8 bpf_cookie/lsm
#15 bpf_mod_race
#15/1 bpf_mod_race/ksym (used_btfs UAF)
#15/2 bpf_mod_race/kfunc (kfunc_btf_tab UAF)
#36 cgroup_hierarchical_stats
#61 deny_namespace
#61/1 deny_namespace/unpriv_userns_create_no_bpf
#73 fexit_stress
#83 get_func_ip_test
#99 kfunc_dynptr_param
#99/1 kfunc_dynptr_param/dynptr_data_null
#99/4 kfunc_dynptr_param/dynptr_data_null
#100 kprobe_multi_bench_attach
#100/1 kprobe_multi_bench_attach/kernel
#100/2 kprobe_multi_bench_attach/modules
#101 kprobe_multi_test
#101/1 kprobe_multi_test/skel_api
```
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Currently, test_progs outputs all stdout/stderr as it runs, and when it
is done, prints a summary.
It is non-trivial for tooling to parse that output and extract meaningful
information from it.
This change adds a new option, `--json-summary`/`-J` that let the caller
specify a file where `test_progs{,-no_alu32}` can write a summary of the
run in a json format that can later be parsed by tooling.
Currently, it creates a summary section with successes/skipped/failures
followed by a list of failed tests and subtests.
A test contains the following fields:
- name: the name of the test
- number: the number of the test
- message: the log message that was printed by the test.
- failed: A boolean indicating whether the test failed or not. Currently
we only output failed tests, but in the future, successful tests could
be added.
- subtests: A list of subtests associated with this test.
A subtest contains the following fields:
- name: same as above
- number: sanme as above
- message: the log message that was printed by the subtest.
- failed: same as above but for the subtest
An example run and json content below:
```
$ sudo ./test_progs -a $(grep -v '^#' ./DENYLIST.aarch64 | awk '{print
$1","}' | tr -d '\n') -j -J /tmp/test_progs.json
$ jq < /tmp/test_progs.json | head -n 30
{
"success": 29,
"success_subtest": 23,
"skipped": 3,
"failed": 28,
"results": [
{
"name": "bpf_cookie",
"number": 10,
"message": "test_bpf_cookie:PASS:skel_open 0 nsec\n",
"failed": true,
"subtests": [
{
"name": "multi_kprobe_link_api",
"number": 2,
"message": "kprobe_multi_link_api_subtest:PASS:load_kallsyms 0 nsec\nlibbpf: extern 'bpf_testmod_fentry_test1' (strong): not resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi': -3\nkprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3\n",
"failed": true
},
{
"name": "multi_kprobe_attach_api",
"number": 3,
"message": "libbpf: extern 'bpf_testmod_fentry_test1' (strong): not resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi': -3\nkprobe_multi_attach_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3\n",
"failed": true
},
{
"name": "lsm",
"number": 8,
"message": "lsm_subtest:PASS:lsm.link_create 0 nsec\nlsm_subtest:FAIL:stack_mprotect unexpected stack_mprotect: actual 0 != expected -1\n",
"failed": true
}
```
The file can then be used to print a summary of the test run and list of
failing tests/subtests:
```
$ jq -r < /tmp/test_progs.json '"Success: \(.success)/\(.success_subtest), Skipped: \(.skipped), Failed: \(.failed)"'
Success: 29/23, Skipped: 3, Failed: 28
$ jq -r < /tmp/test_progs.json '.results | map([
if .failed then "#\(.number) \(.name)" else empty end,
(
. as {name: $tname, number: $tnum} | .subtests | map(
if .failed then "#\($tnum)/\(.number) \($tname)/\(.name)" else empty end
)
)
]) | flatten | .[]' | head -n 20
#10 bpf_cookie
#10/2 bpf_cookie/multi_kprobe_link_api
#10/3 bpf_cookie/multi_kprobe_attach_api
#10/8 bpf_cookie/lsm
#15 bpf_mod_race
#15/1 bpf_mod_race/ksym (used_btfs UAF)
#15/2 bpf_mod_race/kfunc (kfunc_btf_tab UAF)
#36 cgroup_hierarchical_stats
#61 deny_namespace
#61/1 deny_namespace/unpriv_userns_create_no_bpf
#73 fexit_stress
#83 get_func_ip_test
#99 kfunc_dynptr_param
#99/1 kfunc_dynptr_param/dynptr_data_null
#99/4 kfunc_dynptr_param/dynptr_data_null
#100 kprobe_multi_bench_attach
#100/1 kprobe_multi_bench_attach/kernel
#100/2 kprobe_multi_bench_attach/modules
#101 kprobe_multi_test
#101/1 kprobe_multi_test/skel_api
```
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230317163256.3809328-1-chantr4@gmail.com
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Fix it by correctly modifying vid0 refcnt of real_dev when changing the NETIF_F_HW_VLAN_CTAG_FILTER flag during runtime. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Fix it by correctly modifying vid0 refcnt of real_dev when changing the NETIF_F_HW_VLAN_CTAG_FILTER flag during runtime. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Fix it by correctly modifying vid0 refcnt of real_dev when changing the NETIF_F_HW_VLAN_CTAG_FILTER flag during runtime. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime 8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: NipaLocal <nipa@local>
… runtime
Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.
The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:
# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1
When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].
Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //vlan_info is null
Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.
[1]
unreferenced object 0xffff8880068e3100 (size 256):
comm "ip", pid 384, jiffies 4296130254
hex dump (first 32 bytes):
00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 81ce31fa):
__kmalloc_cache_noprof+0x2b5/0x340
vlan_vid_add+0x434/0x940
vlan_device_event.cold+0x75/0xa8
notifier_call_chain+0xca/0x150
__dev_notify_flags+0xe3/0x250
rtnl_configure_link+0x193/0x260
rtnl_newlink_create+0x383/0x8e0
__rtnl_newlink+0x22c/0xa40
rtnl_newlink+0x627/0xb00
rtnetlink_rcv_msg+0x6fb/0xb70
netlink_rcv_skb+0x11f/0x350
netlink_unicast+0x426/0x710
netlink_sendmsg+0x75a/0xc20
__sock_sendmsg+0xc1/0x150
____sys_sendmsg+0x5aa/0x7b0
___sys_sendmsg+0xfc/0x180
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [kernel-patches#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 kernel-patches#61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Fixes: ad1afb0 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250716034504.2285203-2-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull request for series with
subject: bpf: Support multi-attach for freplace programs
version: 5
url: https://patchwork.ozlabs.org/project/netdev/list/?series=201797