Skip to content

Conversation

@Leo-Yan
Copy link

@Leo-Yan Leo-Yan commented Jun 16, 2016

Current code use dev_err to output log if kernel cannot bind any one
sensor in initialization. This is a strictly checking to make sure all
four sensors have to be used for system.

After enable thermal power allocator, usually system only use one sensor
for multiple actors. This patch changes to use dev_dbg so that avoid
confusion in boot log.

Signed-off-by: Leo Yan leo.yan@linaro.org

Current code use dev_err to output log if kernel cannot bind any one
sensor in initialization. This is a strictly checking to make sure all
four sensors have to be used for system.

After enable thermal power allocator, usually system only use one sensor
for multiple actors. This patch changes to use dev_dbg so that avoid
confusion in boot log.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
if (ret)
dev_err(&pdev->dev,
dev_dbg(&pdev->dev,
"failed to register thermal sensor: %d\n", ret);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two dev_dbg() are printing duplicate messages. Can you delete one?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have fixed with new PR: #17

On 16 June 2016 at 16:24, Guodong Xu notifications@github.com wrote:

In drivers/thermal/hisi_thermal.c
#16 (comment):

@@ -351,7 +351,7 @@ static int hisi_thermal_probe(struct platform_device *pdev)
ret = hisi_thermal_register_sensor(pdev, data,
&data->sensors[i], i);
if (ret)

  •       dev_err(&pdev->dev,
    
  •       dev_dbg(&pdev->dev,
            "failed to register thermal sensor: %d\n", ret);
    

The two dev_dbg() are printing duplicate messages. Can you delete one?


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
https://github.com/96boards-hikey/linux/pull/16/files/ad4fff1acec3de7b6a12ed56e8931714def424ac#r67305005,
or mute the thread
https://github.com/notifications/unsubscribe/AKsXkpBJAFpp2DriURyUcxR8Q0-z1pcNks5qMQhAgaJpZM4I3GLP
.

@docularxu docularxu closed this Jun 17, 2016
gaozhangfei pushed a commit that referenced this pull request Jan 17, 2017
Andrey Konovalov reports that fuzz testing with syzkaller causes a
KASAN use-after-free bug report in gadgetfs:

BUG: KASAN: use-after-free in gadgetfs_setup+0x208a/0x20e0 at addr ffff88003dfe5bf2
Read of size 2 by task syz-executor0/22994
CPU: 3 PID: 22994 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #16
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 ffff88006df06a18 ffffffff81f96aba ffffffffe0528500 1ffff1000dbe0cd6
 ffffed000dbe0cce ffff88006df068f0 0000000041b58ab3 ffffffff8598b4c8
 ffffffff81f96828 1ffff1000dbe0ccd ffff88006df06708 ffff88006df06748
Call Trace:
 <IRQ> [  201.343209]  [<     inline     >] __dump_stack lib/dump_stack.c:15
 <IRQ> [  201.343209]  [<ffffffff81f96aba>] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [<ffffffff817e4dec>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:159
 [<     inline     >] print_address_description mm/kasan/report.c:197
 [<ffffffff817e5080>] kasan_report_error+0x1f0/0x4e0 mm/kasan/report.c:286
 [<     inline     >] kasan_report mm/kasan/report.c:306
 [<ffffffff817e562a>] __asan_report_load_n_noabort+0x3a/0x40 mm/kasan/report.c:337
 [<     inline     >] config_buf drivers/usb/gadget/legacy/inode.c:1298
 [<ffffffff8322c8fa>] gadgetfs_setup+0x208a/0x20e0 drivers/usb/gadget/legacy/inode.c:1368
 [<ffffffff830fdcd0>] dummy_timer+0x11f0/0x36d0 drivers/usb/gadget/udc/dummy_hcd.c:1858
 [<ffffffff814807c1>] call_timer_fn+0x241/0x800 kernel/time/timer.c:1308
 [<     inline     >] expire_timers kernel/time/timer.c:1348
 [<ffffffff81482de6>] __run_timers+0xa06/0xec0 kernel/time/timer.c:1641
 [<ffffffff814832c1>] run_timer_softirq+0x21/0x80 kernel/time/timer.c:1654
 [<ffffffff84f4af8b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284

The cause of the bug is subtle.  The dev_config() routine gets called
twice by the fuzzer.  The first time, the user data contains both a
full-speed configuration descriptor and a high-speed config
descriptor, causing dev->hs_config to be set.  But it also contains an
invalid device descriptor, so the buffer containing the descriptors is
deallocated and dev_config() returns an error.

The second time dev_config() is called, the user data contains only a
full-speed config descriptor.  But dev->hs_config still has the stale
pointer remaining from the first call, causing the routine to think
that there is a valid high-speed config.  Later on, when the driver
dereferences the stale pointer to copy that descriptor, we get a
use-after-free access.

The fix is simple: Clear dev->hs_config if the passed-in data does not
contain a high-speed config descriptor.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
docularxu pushed a commit that referenced this pull request Mar 15, 2017
Olga Kornievskaia says: "I ran into this oops in the nfsd (below)
(4.10-rc3 kernel). To trigger this I had a client (unsuccessfully) try
to mount the server with krb5 where the server doesn't have the
rpcsec_gss_krb5 module built."

The problem is that rsci.cred is copied from a svc_cred structure that
gss_proxy didn't properly initialize.  Fix that.

[120408.542387] general protection fault: 0000 [#1] SMP
...
[120408.565724] CPU: 0 PID: 3601 Comm: nfsd Not tainted 4.10.0-rc3+ #16
[120408.567037] Hardware name: VMware, Inc. VMware Virtual =
Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[120408.569225] task: ffff8800776f95c0 task.stack: ffffc90003d58000
[120408.570483] RIP: 0010:gss_mech_put+0xb/0x20 [auth_rpcgss]
...
[120408.584946]  ? rsc_free+0x55/0x90 [auth_rpcgss]
[120408.585901]  gss_proxy_save_rsc+0xb2/0x2a0 [auth_rpcgss]
[120408.587017]  svcauth_gss_proxy_init+0x3cc/0x520 [auth_rpcgss]
[120408.588257]  ? __enqueue_entity+0x6c/0x70
[120408.589101]  svcauth_gss_accept+0x391/0xb90 [auth_rpcgss]
[120408.590212]  ? try_to_wake_up+0x4a/0x360
[120408.591036]  ? wake_up_process+0x15/0x20
[120408.592093]  ? svc_xprt_do_enqueue+0x12e/0x2d0 [sunrpc]
[120408.593177]  svc_authenticate+0xe1/0x100 [sunrpc]
[120408.594168]  svc_process_common+0x203/0x710 [sunrpc]
[120408.595220]  svc_process+0x105/0x1c0 [sunrpc]
[120408.596278]  nfsd+0xe9/0x160 [nfsd]
[120408.597060]  kthread+0x101/0x140
[120408.597734]  ? nfsd_destroy+0x60/0x60 [nfsd]
[120408.598626]  ? kthread_park+0x90/0x90
[120408.599448]  ret_from_fork+0x22/0x30

Fixes: 1d65833 "SUNRPC: Add RPC based upcall mechanism for RPCGSS auth"
Cc: stable@vger.kernel.org
Cc: Simo Sorce <simo@redhat.com>
Reported-by: Olga Kornievskaia <kolga@netapp.com>
Tested-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
docularxu pushed a commit that referenced this pull request Apr 10, 2017
commit 034dd34 upstream.

Olga Kornievskaia says: "I ran into this oops in the nfsd (below)
(4.10-rc3 kernel). To trigger this I had a client (unsuccessfully) try
to mount the server with krb5 where the server doesn't have the
rpcsec_gss_krb5 module built."

The problem is that rsci.cred is copied from a svc_cred structure that
gss_proxy didn't properly initialize.  Fix that.

[120408.542387] general protection fault: 0000 [#1] SMP
...
[120408.565724] CPU: 0 PID: 3601 Comm: nfsd Not tainted 4.10.0-rc3+ #16
[120408.567037] Hardware name: VMware, Inc. VMware Virtual =
Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[120408.569225] task: ffff8800776f95c0 task.stack: ffffc90003d58000
[120408.570483] RIP: 0010:gss_mech_put+0xb/0x20 [auth_rpcgss]
...
[120408.584946]  ? rsc_free+0x55/0x90 [auth_rpcgss]
[120408.585901]  gss_proxy_save_rsc+0xb2/0x2a0 [auth_rpcgss]
[120408.587017]  svcauth_gss_proxy_init+0x3cc/0x520 [auth_rpcgss]
[120408.588257]  ? __enqueue_entity+0x6c/0x70
[120408.589101]  svcauth_gss_accept+0x391/0xb90 [auth_rpcgss]
[120408.590212]  ? try_to_wake_up+0x4a/0x360
[120408.591036]  ? wake_up_process+0x15/0x20
[120408.592093]  ? svc_xprt_do_enqueue+0x12e/0x2d0 [sunrpc]
[120408.593177]  svc_authenticate+0xe1/0x100 [sunrpc]
[120408.594168]  svc_process_common+0x203/0x710 [sunrpc]
[120408.595220]  svc_process+0x105/0x1c0 [sunrpc]
[120408.596278]  nfsd+0xe9/0x160 [nfsd]
[120408.597060]  kthread+0x101/0x140
[120408.597734]  ? nfsd_destroy+0x60/0x60 [nfsd]
[120408.598626]  ? kthread_park+0x90/0x90
[120408.599448]  ret_from_fork+0x22/0x30

Fixes: 1d65833 "SUNRPC: Add RPC based upcall mechanism for RPCGSS auth"
Cc: Simo Sorce <simo@redhat.com>
Reported-by: Olga Kornievskaia <kolga@netapp.com>
Tested-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
docularxu pushed a commit that referenced this pull request Apr 10, 2017
[ Upstream commit 45caeaa ]

As Eric Dumazet pointed out this also needs to be fixed in IPv6.
v2: Contains the IPv6 tcp/Ipv6 dccp patches as well.

We have seen a few incidents lately where a dst_enty has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A Common crashing back trace is:

 #8 [] page_fault at ffffffff8163e648
    [exception RIP: __tcp_ack_snd_check+74]
.
.
 #9 [] tcp_rcv_established at ffffffff81580b64
#10 [] tcp_v4_do_rcv at ffffffff8158b54a
#11 [] tcp_v4_rcv at ffffffff8158cd02
#12 [] ip_local_deliver_finish at ffffffff815668f4
#13 [] ip_local_deliver at ffffffff81566bd9
#14 [] ip_rcv_finish at ffffffff8156656d
#15 [] ip_rcv at ffffffff81566f06
#16 [] __netif_receive_skb_core at ffffffff8152b3a2
#17 [] __netif_receive_skb at ffffffff8152b608
#18 [] netif_receive_skb at ffffffff8152b690
#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3]
#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3]
#21 [] net_rx_action at ffffffff8152bac2
#22 [] __do_softirq at ffffffff81084b4f
#23 [] call_softirq at ffffffff8164845c
#24 [] do_softirq at ffffffff81016fc5
#25 [] irq_exit at ffffffff81084ee5
#26 [] do_IRQ at ffffffff81648ff8

Of course it may happen with other NIC drivers as well.

It's found the freed dst_entry here:

 224 static bool tcp_in_quickack_mode(struct sock *sk)↩
 225 {↩
 226 ▹       const struct inet_connection_sock *icsk = inet_csk(sk);↩
 227 ▹       const struct dst_entry *dst = __sk_dst_get(sk);↩
 228 ↩
 229 ▹       return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩
 230 ▹       ▹       (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩
 231 }↩

But there are other backtraces attributed to the same freed dst_entry in
netfilter code as well.

All the vmcores showed 2 significant clues:

- Remote hosts behind the default gateway had always been redirected to a
different gateway. A rtable/dst_entry will be added for that host. Making
more dst_entrys with lower reference counts. Making this more probable.

- All vmcores showed a postitive LockDroppedIcmps value, e.g:

LockDroppedIcmps                  267

A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the same dst_entry cached in sk->sk_dst_entry can be
decremented twice for the same socket via:

do_redirect()->__sk_dst_check()-> dst_release().

Which leads to the dst_entry being prematurely freed with another socket
pointing to it via sk->sk_dst_cache and a subsequent crash.

To fix this skip do_redirect() if usespace has the socket locked. Instead let
the redirect take place later when user space does not have the socket
locked.

The dccp/IPv6 code is very similar in this respect, so fixing it there too.

As Eric Garver pointed out the following commit now invalidates routes. Which
can set the dst->obsolete flag so that ipv4_dst_check() returns null and
triggers the dst_release().

Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.")
Cc: Eric Garver <egarver@redhat.com>
Cc: Hannes Sowa <hsowa@redhat.com>
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
docularxu pushed a commit that referenced this pull request Apr 10, 2017
commit 4dfce57 upstream.

There have been several reports over the years of NULL pointer
dereferences in xfs_trans_log_inode during xfs_fsr processes,
when the process is doing an fput and tearing down extents
on the temporary inode, something like:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
PID: 29439  TASK: ffff880550584fa0  CPU: 6   COMMAND: "xfs_fsr"
    [exception RIP: xfs_trans_log_inode+0x10]
 #9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs]
#10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs]
#11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs]
#12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs]
#13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs]
#14 [ffff8800a57bbe00] evict at ffffffff811e1b67
#15 [ffff8800a57bbe28] iput at ffffffff811e23a5
#16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8
#17 [ffff8800a57bbe88] dput at ffffffff811dd06c
#18 [ffff8800a57bbea8] __fput at ffffffff811c823b
#19 [ffff8800a57bbef0] ____fput at ffffffff811c846e
#20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27
#21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c
#22 [ffff8800a57bbf50] int_signal at ffffffff8161405d

As it turns out, this is because the i_itemp pointer, along
with the d_ops pointer, has been overwritten with zeros
when we tear down the extents during truncate.  When the in-core
inode fork on the temporary inode used by xfs_fsr was originally
set up during the extent swap, we mistakenly looked at di_nextents
to determine whether all extents fit inline, but this misses extents
generated by speculative preallocation; we should be using if_bytes
instead.

This mistake corrupts the in-memory inode, and code in
xfs_iext_remove_inline eventually gets bad inputs, causing
it to memmove and memset incorrect ranges; this became apparent
because the two values in ifp->if_u2.if_inline_ext[1] contained
what should have been in d_ops and i_itemp; they were memmoved due
to incorrect array indexing and then the original locations
were zeroed with memset, again due to an array overrun.

Fix this by properly using i_df.if_bytes to determine the number
of extents, not di_nextents.

Thanks to dchinner for looking at this with me and spotting the
root cause.

[nborisov: backported to 4.4]

Cc: stable@vger.kernel.org
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--
 fs/xfs/xfs_bmap_util.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
docularxu pushed a commit that referenced this pull request May 24, 2017
[ Upstream commit ddc665a ]

When the instruction right before the branch destination is
a 64 bit load immediate, we currently calculate the wrong
jump offset in the ctx->offset[] array as we only account
one instruction slot for the 64 bit load immediate although
it uses two BPF instructions. Fix it up by setting the offset
into the right slot after we incremented the index.

Before (ldimm64 test 1):

  [...]
  00000020:  52800007  mov w7, #0x0 // #0
  00000024:  d2800060  mov x0, #0x3 // #3
  00000028:  d2800041  mov x1, #0x2 // #2
  0000002c:  eb01001f  cmp x0, x1
  00000030:  54ffff82  b.cs 0x00000020
  00000034:  d29fffe7  mov x7, #0xffff // #65535
  00000038:  f2bfffe7  movk x7, #0xffff, lsl #16
  0000003c:  f2dfffe7  movk x7, #0xffff, lsl #32
  00000040:  f2ffffe7  movk x7, #0xffff, lsl #48
  00000044:  d29dddc7  mov x7, #0xeeee // #61166
  00000048:  f2bdddc7  movk x7, #0xeeee, lsl #16
  0000004c:  f2ddddc7  movk x7, #0xeeee, lsl #32
  00000050:  f2fdddc7  movk x7, #0xeeee, lsl #48
  [...]

After (ldimm64 test 1):

  [...]
  00000020:  52800007  mov w7, #0x0 // #0
  00000024:  d2800060  mov x0, #0x3 // #3
  00000028:  d2800041  mov x1, #0x2 // #2
  0000002c:  eb01001f  cmp x0, x1
  00000030:  540000a2  b.cs 0x00000044
  00000034:  d29fffe7  mov x7, #0xffff // #65535
  00000038:  f2bfffe7  movk x7, #0xffff, lsl #16
  0000003c:  f2dfffe7  movk x7, #0xffff, lsl #32
  00000040:  f2ffffe7  movk x7, #0xffff, lsl #48
  00000044:  d29dddc7  mov x7, #0xeeee // #61166
  00000048:  f2bdddc7  movk x7, #0xeeee, lsl #16
  0000004c:  f2ddddc7  movk x7, #0xeeee, lsl #32
  00000050:  f2fdddc7  movk x7, #0xeeee, lsl #48
  [...]

Also, add a couple of test cases to make sure JITs pass
this test. Tested on Cavium ThunderX ARMv8. The added
test cases all pass after the fix.

Fixes: 8eee539 ("arm64: bpf: fix out-of-bounds read in bpf2a64_offset()")
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
docularxu pushed a commit that referenced this pull request Jun 27, 2017
Andrey Konovalov reports that fuzz testing with syzkaller causes a
KASAN use-after-free bug report in gadgetfs:

BUG: KASAN: use-after-free in gadgetfs_setup+0x208a/0x20e0 at addr ffff88003dfe5bf2
Read of size 2 by task syz-executor0/22994
CPU: 3 PID: 22994 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #16
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 ffff88006df06a18 ffffffff81f96aba ffffffffe0528500 1ffff1000dbe0cd6
 ffffed000dbe0cce ffff88006df068f0 0000000041b58ab3 ffffffff8598b4c8
 ffffffff81f96828 1ffff1000dbe0ccd ffff88006df06708 ffff88006df06748
Call Trace:
 <IRQ> [  201.343209]  [<     inline     >] __dump_stack lib/dump_stack.c:15
 <IRQ> [  201.343209]  [<ffffffff81f96aba>] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [<ffffffff817e4dec>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:159
 [<     inline     >] print_address_description mm/kasan/report.c:197
 [<ffffffff817e5080>] kasan_report_error+0x1f0/0x4e0 mm/kasan/report.c:286
 [<     inline     >] kasan_report mm/kasan/report.c:306
 [<ffffffff817e562a>] __asan_report_load_n_noabort+0x3a/0x40 mm/kasan/report.c:337
 [<     inline     >] config_buf drivers/usb/gadget/legacy/inode.c:1298
 [<ffffffff8322c8fa>] gadgetfs_setup+0x208a/0x20e0 drivers/usb/gadget/legacy/inode.c:1368
 [<ffffffff830fdcd0>] dummy_timer+0x11f0/0x36d0 drivers/usb/gadget/udc/dummy_hcd.c:1858
 [<ffffffff814807c1>] call_timer_fn+0x241/0x800 kernel/time/timer.c:1308
 [<     inline     >] expire_timers kernel/time/timer.c:1348
 [<ffffffff81482de6>] __run_timers+0xa06/0xec0 kernel/time/timer.c:1641
 [<ffffffff814832c1>] run_timer_softirq+0x21/0x80 kernel/time/timer.c:1654
 [<ffffffff84f4af8b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284

The cause of the bug is subtle.  The dev_config() routine gets called
twice by the fuzzer.  The first time, the user data contains both a
full-speed configuration descriptor and a high-speed config
descriptor, causing dev->hs_config to be set.  But it also contains an
invalid device descriptor, so the buffer containing the descriptors is
deallocated and dev_config() returns an error.

The second time dev_config() is called, the user data contains only a
full-speed config descriptor.  But dev->hs_config still has the stale
pointer remaining from the first call, causing the routine to think
that there is a valid high-speed config.  Later on, when the driver
dereferences the stale pointer to copy that descriptor, we get a
use-after-free access.

The fix is simple: Clear dev->hs_config if the passed-in data does not
contain a high-speed config descriptor.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
docularxu pushed a commit that referenced this pull request Aug 7, 2017
Preempt can occur in the preemption timer expiration handler:

          CPU0                    CPU1

  preemption timer vmexit
  handle_preemption_timer(vCPU0)
    kvm_lapic_expired_hv_timer
      hv_timer_is_use == true
  sched_out
                           sched_in
                           kvm_arch_vcpu_load
                             kvm_lapic_restart_hv_timer
                               restart_apic_timer
                                 start_hv_timer
                                   already-expired timer or sw timer triggerd in the window
                                 start_sw_timer
                                   cancel_hv_timer
                           /* back in kvm_lapic_expired_hv_timer */
                           cancel_hv_timer
                             WARN_ON(!apic->lapic_timer.hv_timer_in_use);  ==> Oops

This can be reproduced if CONFIG_PREEMPT is enabled.

------------[ cut here ]------------
 WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1563 kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm]
 CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G           OE   4.13.0-rc2+ #16
 RIP: 0010:kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm]
Call Trace:
  handle_preemption_timer+0xe/0x20 [kvm_intel]
  vmx_handle_exit+0xb8/0xd70 [kvm_intel]
  kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm]
  ? kvm_arch_vcpu_load+0x47/0x230 [kvm]
  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
  kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? __fget+0xfc/0x210
  do_vfs_ioctl+0xa4/0x6a0
  ? __fget+0x11d/0x210
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x81/0x220
  entry_SYSCALL64_slow_path+0x25/0x25
 ------------[ cut here ]------------
 WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1498 cancel_hv_timer.isra.40+0x4f/0x60 [kvm]
 CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G        W  OE   4.13.0-rc2+ #16
 RIP: 0010:cancel_hv_timer.isra.40+0x4f/0x60 [kvm]
Call Trace:
  kvm_lapic_expired_hv_timer+0x3e/0xb0 [kvm]
  handle_preemption_timer+0xe/0x20 [kvm_intel]
  vmx_handle_exit+0xb8/0xd70 [kvm_intel]
  kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm]
  ? kvm_arch_vcpu_load+0x47/0x230 [kvm]
  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
  kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? __fget+0xfc/0x210
  do_vfs_ioctl+0xa4/0x6a0
  ? __fget+0x11d/0x210
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x81/0x220
  entry_SYSCALL64_slow_path+0x25/0x25

This patch fixes it by making the caller of cancel_hv_timer, start_hv_timer
and start_sw_timer be in preemption-disabled regions, which trivially
avoid any reentrancy issue with preempt notifier.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Add more WARNs. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
docularxu pushed a commit that referenced this pull request Aug 23, 2017
syzkaller got crashes with CONFIG_HARDENED_USERCOPY=y configs.

Issue here is that recvfrom() can be used with user buffer of Z bytes,
and SO_PEEK_OFF of X bytes, from a skb with Y bytes, and following
condition :

Z < X < Y

kernel BUG at mm/usercopy.c:72!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 2917 Comm: syzkaller842281 Not tainted 4.13.0-rc3+ #16
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
task: ffff8801d2fa40c0 task.stack: ffff8801d1fe8000
RIP: 0010:report_usercopy mm/usercopy.c:64 [inline]
RIP: 0010:__check_object_size+0x3ad/0x500 mm/usercopy.c:264
RSP: 0018:ffff8801d1fef8a8 EFLAGS: 00010286
RAX: 0000000000000078 RBX: ffffffff847102c0 RCX: 0000000000000000
RDX: 0000000000000078 RSI: 1ffff1003a3fded5 RDI: ffffed003a3fdf09
RBP: ffff8801d1fef998 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801d1ea480e
R13: fffffffffffffffa R14: ffffffff84710280 R15: dffffc0000000000
FS:  0000000001360880(0000) GS:ffff8801dc000000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000202ecfe4 CR3: 00000001d1ff8000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 check_object_size include/linux/thread_info.h:108 [inline]
 check_copy_size include/linux/thread_info.h:139 [inline]
 copy_to_iter include/linux/uio.h:105 [inline]
 copy_linear_skb include/net/udp.h:371 [inline]
 udpv6_recvmsg+0x1040/0x1af0 net/ipv6/udp.c:395
 inet_recvmsg+0x14c/0x5f0 net/ipv4/af_inet.c:793
 sock_recvmsg_nosec net/socket.c:792 [inline]
 sock_recvmsg+0xc9/0x110 net/socket.c:799
 SYSC_recvfrom+0x2d6/0x570 net/socket.c:1788
 SyS_recvfrom+0x40/0x50 net/socket.c:1760
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Fixes: b65ac44 ("udp: try to avoid 2 cache miss on dequeue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
@Leo-Yan Leo-Yan deleted the fix_thermal_sensor_err_log branch March 27, 2019 04:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants