Latest devel build update + reboot crashed host #18

Closed
sathnaga opened this issue Oct 25, 2017 · 2 comments

@sathnaga
Member

sathnaga commented Oct 25, 2017

<cde:info> Mirrored with LTC bug https://bugzilla.linux.ibm.com/show_bug.cgi?id=160569 </cde:info>

Action: yum update + reboot
https://ltc-jenkins.aus.stglabs.ibm.com/job/HostOS_CI/842/consoleText

         Stopping Replay Read-Ahead Data...
[  OK  ] Reached target Shutdown.
[119099.239708] Unable to handle kernel paging request for data at address 0x00000010
[119099.239794] Faulting instruction address: 0xd00000000730064c
cpu 0x0: Vector: 300 (Data Access) at [c0000007f86077d0]
    pc: d00000000730064c: bm_evict_inode+0x2c/0x80 [binfmt_misc]
    lr: c00000000039003c: evict+0xfc/0x260
    sp: c0000007f8607a50
   msr: 900000010280b033
   dar: 10
 dsisr: 40000000
  current = 0xc0000007f8580080
  paca    = 0xc00000000fd60000   softe: 0        irq_happened: 0x01
    pid   = 1, comm = systemd
Linux version 4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le (mockbuild@host-os-jenkins-slave03.aus.stglabs.ibm.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-17) (GCC)) #1 SMP Fri Oct 20 22:55:44 -02 2017
enter ? for help
[c0000007f8607a80] c00000000039003c evict+0xfc/0x260
[c0000007f8607ac0] c000000000389258 dentry_unlink_inode+0x148/0x1c0
[c0000007f8607af0] c00000000038ad58 __dentry_kill+0xe8/0x2a0
[c0000007f8607b30] c00000000038b634 shrink_dentry_list+0x1e4/0x4e0
[c0000007f8607ba0] c00000000038bb84 shrink_dcache_parent+0x54/0xb0
[c0000007f8607c00] c00000000038bc08 do_one_tree+0x28/0x60
[c0000007f8607c30] c00000000038ce4c shrink_dcache_for_umount+0x4c/0xc0
[c0000007f8607ca0] c00000000036a92c generic_shutdown_super+0x3c/0x190
[c0000007f8607d10] c00000000036af08 kill_litter_super+0x48/0x70
[c0000007f8607d40] c00000000036b45c deactivate_locked_super+0xac/0xf0
[c0000007f8607d70] c000000000397f94 cleanup_mnt+0x64/0xb0
[c0000007f8607da0] c0000000001287c0 task_work_run+0x140/0x1a0
[c0000007f8607e00] c00000000001ca70 do_notify_resume+0xf0/0x100
[c0000007f8607e30] c00000000000bec4 ret_from_except_lite+0x70/0x74
--- Exception: c00 (System Call) at 00007fff8c6a50a8
SP (7fffee70e770) is in userspace
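
For triage, the xmon backtrace above can be reduced to its chain of symbols. The following is a minimal sketch, assuming frames follow the `[sp] address symbol+0xoff/0xsize` layout seen in this log; `FRAME_RE`, `parse_frames` and `SAMPLE` are names made up for illustration and are not part of any kernel tool.

```python
import re

# Matches xmon backtrace frames such as:
#   [c0000007f8607a80] c00000000039003c evict+0xfc/0x260
# Lines without a symbol+offset (for example "(unreliable)") are skipped.
FRAME_RE = re.compile(
    r"^\[(?P<sp>[0-9a-f]{16})\]\s+"
    r"(?P<addr>[0-9a-f]{16})\s+"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(log_text):
    """Return (symbol, offset, size) tuples in the order they appear."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(
                (match["sym"], int(match["off"], 16), int(match["size"], 16))
            )
    return frames


# A few lines copied from the backtrace in this issue.
SAMPLE = """\
[c0000007f8607a80] c00000000039003c evict+0xfc/0x260
[c0000007f8607ac0] c000000000389258 dentry_unlink_inode+0x148/0x1c0
[c0000007f8607af0] c00000000038ad58 __dentry_kill+0xe8/0x2a0
--- Exception: c00 (System Call) at 00007fff8c6a50a8
"""

if __name__ == "__main__":
    for sym, off, size in parse_frames(SAMPLE):
        print(f"{sym} +{off:#x} of {size:#x}")
```

Comparing the symbol chains of two crashes this way makes it easier to tell whether a new report is the same bug or a different one.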
@sathnaga
Member Author

  1. Tried to recreate the crash with multiple reboots, but could not hit the issue again.
  2. Ran host stress and fs tests and hit a different host crash, reported as "hit with host crash while running host stress tests" (#20).

@cdeadmin

While trying to reproduce with host stress, I hit the host crash below during xfs stress tests:

enter ? for help
[link register   ] c0000000002543f0 irq_work_run+0x30/0x50
[c000000ffff53cc0] c000000ffff53cf0 (unreliable)
[c000000ffff53cf0] c0000000001b7ca0 flush_smp_call_function_queue+0xf0/0x200
[c000000ffff53d70] c0000000000477ec smp_ipi_demux_relaxed+0x9c/0x110
[c000000ffff53db0] c0000000000903d4 icp_native_ipi_action+0x64/0x80
[c000000ffff53dd0] c000000000179420 __handle_irq_event_percpu+0x90/0x2d0
[c000000ffff53e90] c000000000179698 handle_irq_event_percpu+0x38/0x90
[c000000ffff53ed0] c00000000017fcf4 handle_percpu_irq+0x84/0xd0
[c000000ffff53f00] c000000000177b7c generic_handle_irq+0x4c/0x80
[c000000ffff53f20] c0000000000165d4 __do_irq+0x94/0x200
[c000000ffff53f90] c000000000029fa4 call_do_irq+0x14/0x24
[c0000007f87f3a50] c0000000000167dc do_IRQ+0x9c/0x110
[c0000007f87f3aa0] c000000000008c58 hardware_interrupt_common+0x158/0x160
--- Exception: 501 (Hardware Interrupt) at c0000000008eb664 snooze_loop+0xa4/0x190
[c0000007f87f3d90] c0000007f87f3dc0 (unreliable)
[c0000007f87f3dd0] c0000000008e83a4 cpuidle_enter_state+0xc4/0x3d0
[c0000007f87f3e30] c00000000015f73c call_cpuidle+0x4c/0x80
[c0000007f87f3e50] c00000000015fbe0 do_idle+0x2b0/0x350
[c0000007f87f3ec0] c00000000015fe8c cpu_startup_entry+0x3c/0x50
[c0000007f87f3ef0] c000000000048a74 start_secondary+0x4e4/0x530
[c0000007f87f3f90] c00000000000b16c start_secondary_prolog+0x10/0x14
b:mon>

jenkins_job_log.txt
It looks like this patch fixes the issue: https://www.spinics.net/lists/linux-fsdevel/msg117031.html

malcolmcrossley pushed a commit to malcolmcrossley/linux that referenced this issue Jan 24, 2018
commit e39d200 upstream.

Reported by syzkaller:

  BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm]
  Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298

  CPU: 6 PID: 32298 Comm: syz-executor Tainted: G           OE    4.15.0-rc2+ open-power-host-os#18
  Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
  Call Trace:
   dump_stack+0xab/0xe1
   print_address_description+0x6b/0x290
   kasan_report+0x28a/0x370
   write_mmio+0x11e/0x270 [kvm]
   emulator_read_write_onepage+0x311/0x600 [kvm]
   emulator_read_write+0xef/0x240 [kvm]
   emulator_fix_hypercall+0x105/0x150 [kvm]
   em_hypercall+0x2b/0x80 [kvm]
   x86_emulate_insn+0x2b1/0x1640 [kvm]
   x86_emulate_instruction+0x39a/0xb90 [kvm]
   handle_exception+0x1b4/0x4d0 [kvm_intel]
   vcpu_enter_guest+0x15a0/0x2640 [kvm]
   kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm]
   kvm_vcpu_ioctl+0x479/0x880 [kvm]
   do_vfs_ioctl+0x142/0x9a0
   SyS_ioctl+0x74/0x80
   entry_SYSCALL_64_fastpath+0x23/0x9a

The path that patches vmmcall writes the 3-byte opcode 0F 01 C1 (vmcall)
to guest memory. However, the write_mmio tracepoint always prints 8 bytes
through *(u64 *)val, since kvm splits the mmio access into chunks of up to
8 bytes. This leaks 5 bytes from the kernel stack (CVE-2017-17741). This
patch fixes it by accessing only the bytes that are operated on.

Before patch:

syz-executor-5567  [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f

After patch:

syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
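
The leak described in the commit message above can be illustrated in user space. This is only a sketch of the idea, not the kernel patch: `trace_val_unbounded`, `trace_val_bounded` and `stack_slot` are hypothetical names, and the five trailing bytes are taken from the "before" trace line to stand in for stale stack content.

```python
def trace_val_unbounded(buf: bytes) -> int:
    """Mimic *(u64 *)val: always read 8 bytes, little-endian as on x86."""
    return int.from_bytes(buf[:8], "little")


def trace_val_bounded(buf: bytes, length: int) -> int:
    """Read only the bytes the access touched; the rest stays zero."""
    n = min(length, 8)
    return int.from_bytes(buf[:n], "little")


# First 3 bytes: the vmcall opcode 0F 01 C1 that was actually written.
# Last 5 bytes: whatever happened to be next to it on the stack
# (values copied from the "before patch" trace line for illustration).
stack_slot = bytes([0x0F, 0x01, 0xC1, 0x77, 0x00, 0xF1, 0xFF, 0x1F])

if __name__ == "__main__":
    print(hex(trace_val_unbounded(stack_slot)))   # 0x1ffff10077c1010f
    print(hex(trace_val_bounded(stack_slot, 3)))  # 0xc1010f
```

The two printed values correspond to the `val` fields in the "before patch" and "after patch" trace lines for a write of `len 3`.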
liyi-ibm referenced this issue in liyi-ibm/linux Dec 28, 2018
[ Upstream commit 373c83a ]

Using the driver built into the kernel image without a firmware in the filesystem
or in the kernel image can lead to a kernel NULL pointer dereference.
The watchdog needs to be stopped in brcmf_sdio_remove.

The system is going down NOW!
[ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual address 000002f8
Sent SIGTERM to all processes
[ 1348.121412] Mem abort info:
[ 1348.126962]   ESR = 0x96000004
[ 1348.130023]   Exception class = DABT (current EL), IL = 32 bits
[ 1348.135948]   SET = 0, FnV = 0
[ 1348.138997]   EA = 0, S1PTW = 0
[ 1348.142154] Data abort info:
[ 1348.145045]   ISV = 0, ISS = 0x00000004
[ 1348.148884]   CM = 0, WnR = 0
[ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____)
[ 1348.158475] [00000000000002f8] pgd=0000000000000000
[ 1348.163364] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 1348.168927] Modules linked in: ipv6
[ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted 4.17.0-rc5-next-20180517 #18
[ 1348.180757] Hardware name: Amarula A64-Relic (DT)
[ 1348.185455] pstate: 60000005 (nZCv daif -PAN -UAO)
[ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20
[ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290
[ 1348.200253] sp : ffff00000b85be30
[ 1348.203561] x29: ffff00000b85be30 x28: 0000000000000000
[ 1348.208868] x27: ffff00000b6cb918 x26: ffff80003b990638
[ 1348.214176] x25: ffff0000087b1a20 x24: ffff80003b94f800
[ 1348.219483] x23: ffff000008e620c8 x22: ffff000008f0b660
[ 1348.224790] x21: ffff000008c6a858 x20: 00000000fffffe00
[ 1348.230097] x19: ffff80003b94f800 x18: 0000000000000001
[ 1348.235404] x17: 0000ffffab2e8a74 x16: ffff0000080d7de8
[ 1348.240711] x15: 0000000000000000 x14: 0000000000000400
[ 1348.246018] x13: 0000000000000400 x12: 0000000000000001
[ 1348.251324] x11: 00000000000002c4 x10: 0000000000000a10
[ 1348.256631] x9 : ffff00000b85bc40 x8 : ffff80003be11870
[ 1348.261937] x7 : ffff80003dfc7308 x6 : 000000078ff08b55
[ 1348.267243] x5 : 00000139e1058400 x4 : 0000000000000000
[ 1348.272550] x3 : dead000000000100 x2 : 958f2788d6618100
[ 1348.277856] x1 : 00000000fffffe00 x0 : 0000000000000000

Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
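
The `ESR = 0x96000004` line in the arm64 oops above can be decoded by hand. The sketch below assumes the ARMv8-A ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0) and the data-abort ISS fields that the kernel also prints; `decode_esr` and `decode_data_abort_iss` are illustrative names, not kernel functions.

```python
def decode_esr(esr: int) -> dict:
    """Split an ESR_ELx value into its top-level fields."""
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class
        "il": (esr >> 25) & 0x1,    # 1 = 32-bit instruction
        "iss": esr & 0x1FFFFFF,     # instruction specific syndrome
    }


def decode_data_abort_iss(iss: int) -> dict:
    """Decode the ISS bits that the oops prints for a data abort."""
    return {
        "isv": (iss >> 24) & 1,
        "fnv": (iss >> 10) & 1,
        "ea": (iss >> 9) & 1,
        "cm": (iss >> 8) & 1,
        "s1ptw": (iss >> 7) & 1,
        "wnr": (iss >> 6) & 1,      # 0 = read, 1 = write
        "dfsc": iss & 0x3F,         # data fault status code
    }


if __name__ == "__main__":
    fields = decode_esr(0x96000004)
    print({name: hex(value) for name, value in fields.items()})
    print(decode_data_abort_iss(fields["iss"]))
```

For this oops, EC is 0x25 (data abort taken at the current exception level), IL is 1, WnR is 0 (a read), and DFSC is 0x04 (level 0 translation fault), which agrees with the `DABT (current EL)`, `IL = 32 bits`, `WnR = 0` and `pgd=0000000000000000` lines in the log.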