Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
bpf: disable broken write protection on i386
Since d2852a2 ("arch: add ARCH_HAS_SET_MEMORY config") and 9d876e7 ("bpf: fix unlocking of jited image when module ronx not set") that uses the former, Fengguang reported random corruptions on his i386 test machine [1]. On i386 there is no JIT available, and since his kernel config doesn't have kernel modules enabled, there was also no DEBUG_SET_MODULE_RONX enabled before which would set interpreted bpf_prog image as read-only like we do in various other cases for quite some time now, e.g. x86_64, arm64, etc. Thus, the difference with above commits was that we now used set_memory_ro() and set_memory_rw() on i386, which resulted in these issues. When reproducing this with Fengguang's config and qemu image, I changed lib/test_bpf.c to be run during boot instead of relying on trinity to fiddle with cBPF. The issues I saw with the BPF test suite when set_memory_ro() and set_memory_rw() is used to write protect image on i386 is that after a number of tests I noticed a corruption happening in bpf_prog_realloc(). Specifically, fp_old's content gets corrupted right *after* the (unrelated) __vmalloc() call and contains only zeroes right after the call instead of the original prog data. fp_old should have been freed later on via __bpf_prog_free() *after* we copied all the data over to the newly allocated fp. Result looks like: [...] [ 13.107240] test_bpf: torvalds#249 JMP_JSET_X: if (0x3 & 0x2) return 1 jited:0 17 PASS [ 13.108182] test_bpf: torvalds#250 JMP_JSET_X: if (0x3 & 0xffffffff) return 1 jited:0 17 PASS [ 13.109206] test_bpf: torvalds#251 JMP_JA: Jump, gap, jump, ... jited:0 16 PASS [ 13.110493] test_bpf: torvalds#252 BPF_MAXINSNS: Maximum possible literals jited:0 12 PASS [ 13.111885] test_bpf: torvalds#253 BPF_MAXINSNS: Single literal jited:0 8 PASS [ 13.112804] test_bpf: torvalds#254 BPF_MAXINSNS: Run/add until end jited:0 6341 PASS [ 13.177195] test_bpf: torvalds#255 BPF_MAXINSNS: Too many instructions PASS [ 13.177689] test_bpf: torvalds#256 BPF_MAXINSNS: Very long jump jited:0 9 PASS [ 13.178611] test_bpf: torvalds#257 BPF_MAXINSNS: Ctx heavy transformations [ 13.178713] BUG: unable to handle kernel NULL pointer dereference at 00000034 [ 13.179740] IP: bpf_prog_realloc+0x5b/0x90 [ 13.180017] *pde = 00000000 [ 13.180017] [ 13.180017] Oops: 0002 [#1] DEBUG_PAGEALLOC [ 13.180017] CPU: 0 PID: 1 Comm: swapper Not tainted 4.10.0-57268-gd627975-dirty torvalds#50 [ 13.180017] task: 401ec000 task.stack: 401f2000 [ 13.180017] EIP: bpf_prog_realloc+0x5b/0x90 [ 13.180017] EFLAGS: 00210246 CPU: 0 [ 13.180017] EAX: 00000000 EBX: 57ae1000 ECX: 00000000 EDX: 57ae1000 [ 13.180017] ESI: 00000019 EDI: 57b07000 EBP: 401f3e74 ESP: 401f3e68 [ 13.180017] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 [ 13.180017] CR0: 80050033 CR2: 00000034 CR3: 12cb1000 CR4: 00000610 [ 13.180017] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 13.180017] DR6: fffe0ff0 DR7: 00000400 [ 13.180017] Call Trace: [ 13.180017] bpf_prepare_filter+0x317/0x3a0 [ 13.180017] bpf_prog_create+0x65/0xa0 [ 13.180017] test_bpf_init+0x1ca/0x628 [ 13.180017] ? test_hexdump_init+0xb5/0xb5 [ 13.180017] do_one_initcall+0x7c/0x11c [...] When using trinity from Fengguang's reproducer, the corruptions were at inconsistent places, presumably from code dealing with allocations and seeing similar effects as mentioned above. Not using set_memory_ro() and set_memory_rw() lets the test suite run just fine as expected, thus it looks like using set_memory_*() on i386 seems broken and mentioned commits just uncovered it. Also, for checking, I enabled DEBUG_RODATA_TEST for that kernel. Latter shows that memory protecting the kernel seems not working either on i386 (!). Test suite output: [...] [ 12.692836] Write protecting the kernel text: 13416k [ 12.693309] Write protecting the kernel read-only data: 5292k [ 12.693802] rodata_test: test data was not read only [...] Work-around to not enable ARCH_HAS_SET_MEMORY for i386 is not optimal as it doesn't fix the issue in presumably broken set_memory_*(), but it at least avoids people avoid having to deal with random corruptions that are hard to track down for the time being until a real fix can be found. [1] https://lkml.org/lkml/2017/3/2/648 Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Laura Abbott <labbott@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Alexei Starovoitov <ast@kernel.org>
- Loading branch information