-
Notifications
You must be signed in to change notification settings - Fork 55.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
s #20
s #20
Conversation
The warning below triggers on AMD MCM packages because physical package IDs on the cores of a _physical_ socket are the same. I.e., this field says which CPUs belong to the same physical package. However, the same two CPUs belong to two different internal, i.e. "logical" nodes in the same physical socket which is reflected in the CPU-to-node map on x86 with NUMA. Which makes this check wrong on the above topologies so circumvent it. [ 0.444413] Booting Node 0, Processors #1 #2 #3 #4 #5 Ok. [ 0.461388] ------------[ cut here ]------------ [ 0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81() [ 0.473960] Hardware name: Dinar [ 0.477170] sched: CPU torvalds#6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency. [ 0.486860] Booting Node 1, Processors torvalds#6 [ 0.491104] Modules linked in: [ 0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1 [ 0.499510] Call Trace: [ 0.501946] [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81 [ 0.508185] [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d [ 0.514163] [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48 [ 0.519881] [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81 [ 0.525943] [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371 [ 0.532004] [<ffffffff8144c4ee>] start_secondary+0x19a/0x218 [ 0.537729] ---[ end trace 4eaa2a86a8e2da22 ]--- [ 0.628197] torvalds#7 torvalds#8 torvalds#9 torvalds#10 torvalds#11 Ok. [ 0.807108] Booting Node 3, Processors torvalds#12 torvalds#13 torvalds#14 torvalds#15 torvalds#16 torvalds#17 Ok. [ 0.897587] Booting Node 2, Processors torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 Ok. [ 0.917443] Brought up 24 CPUs We ran a topology sanity check test we have here on it and it all looks ok... hopefully :). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
commit 543cc38 upstream. When hibernating ->resume may not be called by usb core, but disconnect and probe instead, so we do not increase the counter after decreasing it in ->supend. As a result we free memory early, and get crash when unplugging usb dongle. BUG: unable to handle kernel paging request at 6b6b6b9f IP: [<c06909b0>] driver_sysfs_remove+0x10/0x30 *pdpt = 0000000034f21001 *pde = 0000000000000000 Pid: 20, comm: khubd Not tainted 3.1.0-rc1-wl+ torvalds#20 LENOVO 6369CTO/6369CTO EIP: 0060:[<c06909b0>] EFLAGS: 00010202 CPU: 1 EIP is at driver_sysfs_remove+0x10/0x30 EAX: 6b6b6b6b EBX: f52bba34 ECX: 00000000 EDX: 6b6b6b6b ESI: 6b6b6b6b EDI: c0a0ea20 EBP: f61c9e68 ESP: f61c9e64 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process khubd (pid: 20, ti=f61c8000 task=f6138270 task.ti=f61c8000) Call Trace: [<c06909ef>] __device_release_driver+0x1f/0xa0 [<c0690b20>] device_release_driver+0x20/0x40 [<c068fd64>] bus_remove_device+0x84/0xe0 [<c068e12a>] ? device_remove_attrs+0x2a/0x80 [<c068e267>] device_del+0xe7/0x170 [<c06d93d4>] usb_disconnect+0xd4/0x180 [<c06d9d61>] hub_thread+0x691/0x1600 [<c0473260>] ? wake_up_bit+0x30/0x30 [<c0442a39>] ? complete+0x49/0x60 [<c06d96d0>] ? hub_disconnect+0xd0/0xd0 [<c06d96d0>] ? hub_disconnect+0xd0/0xd0 [<c0472eb4>] kthread+0x74/0x80 [<c0472e40>] ? kthread_worker_fn+0x150/0x150 [<c0809b3e>] kernel_thread_helper+0x6/0x10 Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit e44aabd upstream. Errata E20: UART: Character Timeout interrupt remains set under certain software conditions. Implication: The software servicing the UART can be trapped in an infinite loop. Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: http://bugs.launchpad.net/bugs/888042 commit 543cc38 upstream. When hibernating ->resume may not be called by usb core, but disconnect and probe instead, so we do not increase the counter after decreasing it in ->supend. As a result we free memory early, and get crash when unplugging usb dongle. BUG: unable to handle kernel paging request at 6b6b6b9f IP: [<c06909b0>] driver_sysfs_remove+0x10/0x30 *pdpt = 0000000034f21001 *pde = 0000000000000000 Pid: 20, comm: khubd Not tainted 3.1.0-rc1-wl+ torvalds#20 LENOVO 6369CTO/6369CTO EIP: 0060:[<c06909b0>] EFLAGS: 00010202 CPU: 1 EIP is at driver_sysfs_remove+0x10/0x30 EAX: 6b6b6b6b EBX: f52bba34 ECX: 00000000 EDX: 6b6b6b6b ESI: 6b6b6b6b EDI: c0a0ea20 EBP: f61c9e68 ESP: f61c9e64 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process khubd (pid: 20, ti=f61c8000 task=f6138270 task.ti=f61c8000) Call Trace: [<c06909ef>] __device_release_driver+0x1f/0xa0 [<c0690b20>] device_release_driver+0x20/0x40 [<c068fd64>] bus_remove_device+0x84/0xe0 [<c068e12a>] ? device_remove_attrs+0x2a/0x80 [<c068e267>] device_del+0xe7/0x170 [<c06d93d4>] usb_disconnect+0xd4/0x180 [<c06d9d61>] hub_thread+0x691/0x1600 [<c0473260>] ? wake_up_bit+0x30/0x30 [<c0442a39>] ? complete+0x49/0x60 [<c06d96d0>] ? hub_disconnect+0xd0/0xd0 [<c06d96d0>] ? hub_disconnect+0xd0/0xd0 [<c0472eb4>] kthread+0x74/0x80 [<c0472e40>] ? kthread_worker_fn+0x150/0x150 [<c0809b3e>] kernel_thread_helper+0x6/0x10 Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Brad Figg <brad.figg@canonical.com>
Since called by all CPUs, the statically allocated struct irqaction registration was wrong (link list looping) Also slab is already init by now so we can freely use request_irq() Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
@@ -1,5 +1,5 @@ | |||
# ========================================================================== | |||
# Building | |||
# Building I love linus tovarlds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
funny :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least fix that typo, "I love Linus Torvalds".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-rep
…d reasons We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 #24 [ffff8810343bfee8] kthread at ffffffff8108dd96 #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The sm_lock spinlock is taken in the process context by mlx4_ib_modify_device, and in the interrupt context by update_sm_ah, so we need to take that spinlock with irqsave, and release it with irqrestore. Lockdeps reports this as follows: [ INFO: inconsistent lock state ] 3.5.0+ #20 Not tainted inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes: (&(&ibdev->sm_lock)->rlock){?.+...}, at: [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib] {HARDIRQ-ON-W} state was registered at: [<ffffffff810b84a0>] mark_irqflags+0x120/0x190 [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0 [<ffffffff810b9f51>] lock_acquire+0xb1/0x150 [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50 [<ffffffffa028d563>] mlx4_ib_modify_device+0x63/0x240 [mlx4_ib] [<ffffffffa026d1fc>] ib_modify_device+0x1c/0x20 [ib_core] [<ffffffffa026c353>] set_node_desc+0x83/0xc0 [ib_core] [<ffffffff8136a150>] dev_attr_store+0x20/0x30 [<ffffffff81201fd6>] sysfs_write_file+0xe6/0x170 [<ffffffff8118da38>] vfs_write+0xc8/0x190 [<ffffffff8118dc01>] sys_write+0x51/0x90 [<ffffffff8155b869>] system_call_fastpath+0x16/0x1b ... *** DEADLOCK *** 1 lock held by swapper/0/0: stack backtrace: Pid: 0, comm: swapper/0 Not tainted 3.5.0+ #20 Call Trace: <IRQ> [<ffffffff810b7bea>] print_usage_bug+0x18a/0x190 [<ffffffff810b7370>] ? print_irq_inversion_bug+0x210/0x210 [<ffffffff810b7fb2>] mark_lock_irq+0xf2/0x280 [<ffffffff810b8290>] mark_lock+0x150/0x240 [<ffffffff810b84ef>] mark_irqflags+0x16f/0x190 [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0 [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffff810b9f51>] lock_acquire+0xb1/0x150 [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50 [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffffa026b2fa>] ? ib_create_ah+0x1a/0x40 [ib_core] [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffff810c27c3>] ? is_module_address+0x23/0x30 [<ffffffffa028b05b>] handle_port_mgmt_change_event+0xeb/0x150 [mlx4_ib] [<ffffffffa028c177>] mlx4_ib_event+0x117/0x160 [mlx4_ib] [<ffffffff81552501>] ? _raw_spin_lock_irqsave+0x61/0x70 [<ffffffffa022718c>] mlx4_dispatch_event+0x6c/0x90 [mlx4_core] [<ffffffffa0221b40>] mlx4_eq_int+0x500/0x950 [mlx4_core] Reported by: Or Gerlitz <ogerlitz@mellanox.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
Add 6 new devices and one modified device, based on information from laptop vendor Windows drivers. Sony provides a driver with two new devices using a Gobi 2k+ layout (1199:68a5 and 1199:68a9). The Sony driver also adds a non-standard QMI/net interface to the already supported 1199:9011 Gobi device. We do not know whether this is an alternate interface number or an additional interface which might be present, but that doesn't really matter. Lenovo provides a driver supporting 4 new devices: - MC7770 (1199:901b) with standard Gobi 2k+ layout - MC7700 (0f3d:68a2) with layout similar to MC7710 - MC7750 (114f:68a2) with layout similar to MC7710 - EM7700 (1199:901c) with layout similar to MC7710 Note regaring the three devices similar to MC7710: The Windows drivers only support interface #8 on these devices. The MC7710 can support QMI/net functions on interface #19 and #20 as well, and this driver is verified to work on interface #19 (a firmware bug is suspected to prevent #20 from working). We do not enable these additional interfaces until they either show up in a Windows driver or are verified to work in some other way. Therefore limiting the new devices to interface #8 for now. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton at redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust at netapp.com> Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit 9b469a6 upstream. Add 6 new devices and one modified device, based on information from laptop vendor Windows drivers. Sony provides a driver with two new devices using a Gobi 2k+ layout (1199:68a5 and 1199:68a9). The Sony driver also adds a non-standard QMI/net interface to the already supported 1199:9011 Gobi device. We do not know whether this is an alternate interface number or an additional interface which might be present, but that doesn't really matter. Lenovo provides a driver supporting 4 new devices: - MC7770 (1199:901b) with standard Gobi 2k+ layout - MC7700 (0f3d:68a2) with layout similar to MC7710 - MC7750 (114f:68a2) with layout similar to MC7710 - EM7700 (1199:901c) with layout similar to MC7710 Note regaring the three devices similar to MC7710: The Windows drivers only support interface #8 on these devices. The MC7710 can support QMI/net functions on interface #19 and #20 as well, and this driver is verified to work on interface #19 (a firmware bug is suspected to prevent #20 from working). We do not enable these additional interfaces until they either show up in a Windows driver or are verified to work in some other way. Therefore limiting the new devices to interface #8 for now. [bmork: backported to 3.4: use driver whitelisting] Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit 9b469a6 upstream. Add 6 new devices and one modified device, based on information from laptop vendor Windows drivers. Sony provides a driver with two new devices using a Gobi 2k+ layout (1199:68a5 and 1199:68a9). The Sony driver also adds a non-standard QMI/net interface to the already supported 1199:9011 Gobi device. We do not know whether this is an alternate interface number or an additional interface which might be present, but that doesn't really matter. Lenovo provides a driver supporting 4 new devices: - MC7770 (1199:901b) with standard Gobi 2k+ layout - MC7700 (0f3d:68a2) with layout similar to MC7710 - MC7750 (114f:68a2) with layout similar to MC7710 - EM7700 (1199:901c) with layout similar to MC7710 Note regaring the three devices similar to MC7710: The Windows drivers only support interface #8 on these devices. The MC7710 can support QMI/net functions on interface #19 and #20 as well, and this driver is verified to work on interface #19 (a firmware bug is suspected to prevent #20 from working). We do not enable these additional interfaces until they either show up in a Windows driver or are verified to work in some other way. Therefore limiting the new devices to interface #8 for now. [bmork: backported to 3.4: use driver whitelisting] Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Since called by all CPUs, the statically allocated struct irqaction registration was wrong (link list looping) Also slab is already init by now so we can freely use request_irq() Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
=========================>8============================= Linux version 3.6.0+ (vineetg@vineetg-Latitude) (gcc version 4.4.7 (ARCompact elf32 toolchain (built 20120928)) ) #20 Wed Oct 17 18:02:39 IST 2012 [plat-arcfpga]: registering early dev resources bootconsole [early_ARCuart0] enabled pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768 pcpu-alloc: [0] 0 Built 1 zonelists in Zone order, mobility grouping on. Total pages: 32624 Kernel command line: print-fatal-signals=1 PID hash table entries: 1024 (order: -1, 4096 bytes) Dentry cache hash table entries: 32768 (order: 4, 131072 bytes) Inode-cache hash table entries: 16384 (order: 3, 65536 bytes) Memory Available: 249M / 256M (1223K code, 408K data, 3904K init, 1400K reserv) SLUB: Genslabs=12, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1 NR_IRQS:16 Device [ARC Timer0] clockevent mode now [1] Device [ARC Timer0] clockevent mode now [2] Console: colour dummy device 80x25 Calibrating delay loop... 39.73 BogoMIPS (lpj=198656) pid_max: default: 4096 minimum: 301 Mount-cache hash table entries: 1024 devtmpfs: initialized [plat-arcfpga]: registering device resources bio: create slab <bio-0> at 0 Switching to clocksource ARC Timer1 io scheduler noop registered (default) Serial: ARC serial driver: platform register arc-uart.0: ttyARC0 at MMIO 0xc0fc1000 (irq = 5) is a arc-uart console [ttyARC0] enabled, bootconsole disabled console [ttyARC0] enabled, bootconsole disabled mousedev: PS/2 mouse device common for all mice Warning: unable to open an initial console. Freeing unused kernel memory: 3904k [80002000] to [803d2000] Mounting proc Mounting sysfs Mounting devpts Setting hostname to ARCLinux Starting System logger (syslogd) Bringing up loopback device ifconfig: socket: Function not implemented route: socket: Function not implemented Disk not detected ! Mounting tmpfs /etc/init.d/rcS: line 76: can't create /proc/sys/kernel/msgmni: nonexistent directory Please press Enter to activate this console. *********************************************************************** Welcome to ARCLinux *********************************************************************** [ARCLinux]$ =========================>8============================= Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
…d reasons BugLink: http://bugs.launchpad.net/bugs/1035435 commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit e44aabd upstream. Errata E20: UART: Character Timeout interrupt remains set under certain software conditions. Implication: The software servicing the UART can be trapped in an infinite loop. Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since called by all CPUs, the statically allocated struct irqaction registration was wrong (link list looping) Also slab is already init by now so we can freely use request_irq() Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
…s_lock [ Upstream commit be26ba9 ] For storing a value to a queue attribute, the queue_attr_store function first freezes the queue (->q_usage_counter(io)) and then acquire ->sysfs_lock. This seems not correct as the usual ordering should be to acquire ->sysfs_lock before freezing the queue. This incorrect ordering causes the following lockdep splat which we are able to reproduce always simply by accessing /sys/kernel/debug file using ls command: [ 57.597146] WARNING: possible circular locking dependency detected [ 57.597154] 6.12.0-10553-gb86545e02e8c torvalds#20 Tainted: G W [ 57.597162] ------------------------------------------------------ [ 57.597168] ls/4605 is trying to acquire lock: [ 57.597176] c00000003eb56710 (&mm->mmap_lock){++++}-{4:4}, at: __might_fault+0x58/0xc0 [ 57.597200] but task is already holding lock: [ 57.597207] c0000018e27c6810 (&sb->s_type->i_mutex_key#3){++++}-{4:4}, at: iterate_dir+0x94/0x1d4 [ 57.597226] which lock already depends on the new lock. [ 57.597233] the existing dependency chain (in reverse order) is: [ 57.597241] -> #5 (&sb->s_type->i_mutex_key#3){++++}-{4:4}: [ 57.597255] down_write+0x6c/0x18c [ 57.597264] start_creating+0xb4/0x24c [ 57.597274] debugfs_create_dir+0x2c/0x1e8 [ 57.597283] blk_register_queue+0xec/0x294 [ 57.597292] add_disk_fwnode+0x2e4/0x548 [ 57.597302] brd_alloc+0x2c8/0x338 [ 57.597309] brd_init+0x100/0x178 [ 57.597317] do_one_initcall+0x88/0x3e4 [ 57.597326] kernel_init_freeable+0x3cc/0x6e0 [ 57.597334] kernel_init+0x34/0x1cc [ 57.597342] ret_from_kernel_user_thread+0x14/0x1c [ 57.597350] -> #4 (&q->debugfs_mutex){+.+.}-{4:4}: [ 57.597362] __mutex_lock+0xfc/0x12a0 [ 57.597370] blk_register_queue+0xd4/0x294 [ 57.597379] add_disk_fwnode+0x2e4/0x548 [ 57.597388] brd_alloc+0x2c8/0x338 [ 57.597395] brd_init+0x100/0x178 [ 57.597402] do_one_initcall+0x88/0x3e4 [ 57.597410] kernel_init_freeable+0x3cc/0x6e0 [ 57.597418] kernel_init+0x34/0x1cc [ 57.597426] ret_from_kernel_user_thread+0x14/0x1c [ 57.597434] -> #3 (&q->sysfs_lock){+.+.}-{4:4}: [ 57.597446] __mutex_lock+0xfc/0x12a0 [ 57.597454] queue_attr_store+0x9c/0x110 [ 57.597462] sysfs_kf_write+0x70/0xb0 [ 57.597471] kernfs_fop_write_iter+0x1b0/0x2ac [ 57.597480] vfs_write+0x3dc/0x6e8 [ 57.597488] ksys_write+0x84/0x140 [ 57.597495] system_call_exception+0x130/0x360 [ 57.597504] system_call_common+0x160/0x2c4 [ 57.597516] -> #2 (&q->q_usage_counter(io)torvalds#21){++++}-{0:0}: [ 57.597530] __submit_bio+0x5ec/0x828 [ 57.597538] submit_bio_noacct_nocheck+0x1e4/0x4f0 [ 57.597547] iomap_readahead+0x2a0/0x448 [ 57.597556] xfs_vm_readahead+0x28/0x3c [ 57.597564] read_pages+0x88/0x41c [ 57.597571] page_cache_ra_unbounded+0x1ac/0x2d8 [ 57.597580] filemap_get_pages+0x188/0x984 [ 57.597588] filemap_read+0x13c/0x4bc [ 57.597596] xfs_file_buffered_read+0x88/0x17c [ 57.597605] xfs_file_read_iter+0xac/0x158 [ 57.597614] vfs_read+0x2d4/0x3b4 [ 57.597622] ksys_read+0x84/0x144 [ 57.597629] system_call_exception+0x130/0x360 [ 57.597637] system_call_common+0x160/0x2c4 [ 57.597647] -> #1 (mapping.invalidate_lock#2){++++}-{4:4}: [ 57.597661] down_read+0x6c/0x220 [ 57.597669] filemap_fault+0x870/0x100c [ 57.597677] xfs_filemap_fault+0xc4/0x18c [ 57.597684] __do_fault+0x64/0x164 [ 57.597693] __handle_mm_fault+0x1274/0x1dac [ 57.597702] handle_mm_fault+0x248/0x484 [ 57.597711] ___do_page_fault+0x428/0xc0c [ 57.597719] hash__do_page_fault+0x30/0x68 [ 57.597727] do_hash_fault+0x90/0x35c [ 57.597736] data_access_common_virt+0x210/0x220 [ 57.597745] _copy_from_user+0xf8/0x19c [ 57.597754] sel_write_load+0x178/0xd54 [ 57.597762] vfs_write+0x108/0x6e8 [ 57.597769] ksys_write+0x84/0x140 [ 57.597777] system_call_exception+0x130/0x360 [ 57.597785] system_call_common+0x160/0x2c4 [ 57.597794] -> #0 (&mm->mmap_lock){++++}-{4:4}: [ 57.597806] __lock_acquire+0x17cc/0x2330 [ 57.597814] lock_acquire+0x138/0x400 [ 57.597822] __might_fault+0x7c/0xc0 [ 57.597830] filldir64+0xe8/0x390 [ 57.597839] dcache_readdir+0x80/0x2d4 [ 57.597846] iterate_dir+0xd8/0x1d4 [ 57.597855] sys_getdents64+0x88/0x2d4 [ 57.597864] system_call_exception+0x130/0x360 [ 57.597872] system_call_common+0x160/0x2c4 [ 57.597881] other info that might help us debug this: [ 57.597888] Chain exists of: &mm->mmap_lock --> &q->debugfs_mutex --> &sb->s_type->i_mutex_key#3 [ 57.597905] Possible unsafe locking scenario: [ 57.597911] CPU0 CPU1 [ 57.597917] ---- ---- [ 57.597922] rlock(&sb->s_type->i_mutex_key#3); [ 57.597932] lock(&q->debugfs_mutex); [ 57.597940] lock(&sb->s_type->i_mutex_key#3); [ 57.597950] rlock(&mm->mmap_lock); [ 57.597958] *** DEADLOCK *** [ 57.597965] 2 locks held by ls/4605: [ 57.597971] #0: c0000000137c12f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0xcc/0x154 [ 57.597989] #1: c0000018e27c6810 (&sb->s_type->i_mutex_key#3){++++}-{4:4}, at: iterate_dir+0x94/0x1d4 Prevent the above lockdep warning by acquiring ->sysfs_lock before freezing the queue while storing a queue attribute in queue_attr_store function. Later, we also found[1] another function __blk_mq_update_nr_ hw_queues where we first freeze queue and then acquire the ->sysfs_lock. So we've also updated lock ordering in __blk_mq_update_nr_hw_queues function and ensured that in all code paths we follow the correct lock ordering i.e. acquire ->sysfs_lock before freezing the queue. [1] https://lore.kernel.org/all/CAFj5m9Ke8+EHKQBs_Nk6hqd=LGXtk4mUxZUN5==ZcCjnZSBwHw@mail.gmail.com/ Reported-by: kjain@linux.ibm.com Fixes: af28141 ("block: freeze the queue in queue_attr_store") Tested-by: kjain@linux.ibm.com Cc: hch@lst.de Cc: axboe@kernel.dk Cc: ritesh.list@gmail.com Cc: ming.lei@redhat.com Cc: gjoyce@linux.ibm.com Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241210144222.1066229-1-nilay@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
…s_lock [ Upstream commit be26ba9 ] For storing a value to a queue attribute, the queue_attr_store function first freezes the queue (->q_usage_counter(io)) and then acquire ->sysfs_lock. This seems not correct as the usual ordering should be to acquire ->sysfs_lock before freezing the queue. This incorrect ordering causes the following lockdep splat which we are able to reproduce always simply by accessing /sys/kernel/debug file using ls command: [ 57.597146] WARNING: possible circular locking dependency detected [ 57.597154] 6.12.0-10553-gb86545e02e8c torvalds#20 Tainted: G W [ 57.597162] ------------------------------------------------------ [ 57.597168] ls/4605 is trying to acquire lock: [ 57.597176] c00000003eb56710 (&mm->mmap_lock){++++}-{4:4}, at: __might_fault+0x58/0xc0 [ 57.597200] but task is already holding lock: [ 57.597207] c0000018e27c6810 (&sb->s_type->i_mutex_key#3){++++}-{4:4}, at: iterate_dir+0x94/0x1d4 [ 57.597226] which lock already depends on the new lock. [ 57.597233] the existing dependency chain (in reverse order) is: [ 57.597241] -> #5 (&sb->s_type->i_mutex_key#3){++++}-{4:4}: [ 57.597255] down_write+0x6c/0x18c [ 57.597264] start_creating+0xb4/0x24c [ 57.597274] debugfs_create_dir+0x2c/0x1e8 [ 57.597283] blk_register_queue+0xec/0x294 [ 57.597292] add_disk_fwnode+0x2e4/0x548 [ 57.597302] brd_alloc+0x2c8/0x338 [ 57.597309] brd_init+0x100/0x178 [ 57.597317] do_one_initcall+0x88/0x3e4 [ 57.597326] kernel_init_freeable+0x3cc/0x6e0 [ 57.597334] kernel_init+0x34/0x1cc [ 57.597342] ret_from_kernel_user_thread+0x14/0x1c [ 57.597350] -> #4 (&q->debugfs_mutex){+.+.}-{4:4}: [ 57.597362] __mutex_lock+0xfc/0x12a0 [ 57.597370] blk_register_queue+0xd4/0x294 [ 57.597379] add_disk_fwnode+0x2e4/0x548 [ 57.597388] brd_alloc+0x2c8/0x338 [ 57.597395] brd_init+0x100/0x178 [ 57.597402] do_one_initcall+0x88/0x3e4 [ 57.597410] kernel_init_freeable+0x3cc/0x6e0 [ 57.597418] kernel_init+0x34/0x1cc [ 57.597426] ret_from_kernel_user_thread+0x14/0x1c [ 57.597434] -> #3 (&q->sysfs_lock){+.+.}-{4:4}: [ 57.597446] __mutex_lock+0xfc/0x12a0 [ 57.597454] queue_attr_store+0x9c/0x110 [ 57.597462] sysfs_kf_write+0x70/0xb0 [ 57.597471] kernfs_fop_write_iter+0x1b0/0x2ac [ 57.597480] vfs_write+0x3dc/0x6e8 [ 57.597488] ksys_write+0x84/0x140 [ 57.597495] system_call_exception+0x130/0x360 [ 57.597504] system_call_common+0x160/0x2c4 [ 57.597516] -> #2 (&q->q_usage_counter(io)torvalds#21){++++}-{0:0}: [ 57.597530] __submit_bio+0x5ec/0x828 [ 57.597538] submit_bio_noacct_nocheck+0x1e4/0x4f0 [ 57.597547] iomap_readahead+0x2a0/0x448 [ 57.597556] xfs_vm_readahead+0x28/0x3c [ 57.597564] read_pages+0x88/0x41c [ 57.597571] page_cache_ra_unbounded+0x1ac/0x2d8 [ 57.597580] filemap_get_pages+0x188/0x984 [ 57.597588] filemap_read+0x13c/0x4bc [ 57.597596] xfs_file_buffered_read+0x88/0x17c [ 57.597605] xfs_file_read_iter+0xac/0x158 [ 57.597614] vfs_read+0x2d4/0x3b4 [ 57.597622] ksys_read+0x84/0x144 [ 57.597629] system_call_exception+0x130/0x360 [ 57.597637] system_call_common+0x160/0x2c4 [ 57.597647] -> #1 (mapping.invalidate_lock#2){++++}-{4:4}: [ 57.597661] down_read+0x6c/0x220 [ 57.597669] filemap_fault+0x870/0x100c [ 57.597677] xfs_filemap_fault+0xc4/0x18c [ 57.597684] __do_fault+0x64/0x164 [ 57.597693] __handle_mm_fault+0x1274/0x1dac [ 57.597702] handle_mm_fault+0x248/0x484 [ 57.597711] ___do_page_fault+0x428/0xc0c [ 57.597719] hash__do_page_fault+0x30/0x68 [ 57.597727] do_hash_fault+0x90/0x35c [ 57.597736] data_access_common_virt+0x210/0x220 [ 57.597745] _copy_from_user+0xf8/0x19c [ 57.597754] sel_write_load+0x178/0xd54 [ 57.597762] vfs_write+0x108/0x6e8 [ 57.597769] ksys_write+0x84/0x140 [ 57.597777] system_call_exception+0x130/0x360 [ 57.597785] system_call_common+0x160/0x2c4 [ 57.597794] -> #0 (&mm->mmap_lock){++++}-{4:4}: [ 57.597806] __lock_acquire+0x17cc/0x2330 [ 57.597814] lock_acquire+0x138/0x400 [ 57.597822] __might_fault+0x7c/0xc0 [ 57.597830] filldir64+0xe8/0x390 [ 57.597839] dcache_readdir+0x80/0x2d4 [ 57.597846] iterate_dir+0xd8/0x1d4 [ 57.597855] sys_getdents64+0x88/0x2d4 [ 57.597864] system_call_exception+0x130/0x360 [ 57.597872] system_call_common+0x160/0x2c4 [ 57.597881] other info that might help us debug this: [ 57.597888] Chain exists of: &mm->mmap_lock --> &q->debugfs_mutex --> &sb->s_type->i_mutex_key#3 [ 57.597905] Possible unsafe locking scenario: [ 57.597911] CPU0 CPU1 [ 57.597917] ---- ---- [ 57.597922] rlock(&sb->s_type->i_mutex_key#3); [ 57.597932] lock(&q->debugfs_mutex); [ 57.597940] lock(&sb->s_type->i_mutex_key#3); [ 57.597950] rlock(&mm->mmap_lock); [ 57.597958] *** DEADLOCK *** [ 57.597965] 2 locks held by ls/4605: [ 57.597971] #0: c0000000137c12f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0xcc/0x154 [ 57.597989] #1: c0000018e27c6810 (&sb->s_type->i_mutex_key#3){++++}-{4:4}, at: iterate_dir+0x94/0x1d4 Prevent the above lockdep warning by acquiring ->sysfs_lock before freezing the queue while storing a queue attribute in queue_attr_store function. Later, we also found[1] another function __blk_mq_update_nr_ hw_queues where we first freeze queue and then acquire the ->sysfs_lock. So we've also updated lock ordering in __blk_mq_update_nr_hw_queues function and ensured that in all code paths we follow the correct lock ordering i.e. acquire ->sysfs_lock before freezing the queue. [1] https://lore.kernel.org/all/CAFj5m9Ke8+EHKQBs_Nk6hqd=LGXtk4mUxZUN5==ZcCjnZSBwHw@mail.gmail.com/ Reported-by: kjain@linux.ibm.com Fixes: af28141 ("block: freeze the queue in queue_attr_store") Tested-by: kjain@linux.ibm.com Cc: hch@lst.de Cc: axboe@kernel.dk Cc: ritesh.list@gmail.com Cc: ming.lei@redhat.com Cc: gjoyce@linux.ibm.com Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241210144222.1066229-1-nilay@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Compat mode (CONFIG_COMPAT) allows application binaries built for a 32bit architecture to issue system calls to a 64bit kernel which implements such compatibility support. This work builds on the y2038 sanitization which eliminated the remaining assumptions about the size of native long integers in the same move. Although these changes enable armv7 binaries to call an armv8 kernel, most of them are architecture-independent, and follow this pattern: - .compat_ioctl and .compat_oob_ioctl handlers are added to all file operations structs, pointing at compat_ptr_ioctl() compat_ptr_oob_ioctl() respectively. - all pointers passed by applications within ioctl() argument structs are conveyed as generic 64bit values. These changes mandate upgrading the ABI revision to torvalds#20. Signed-off-by: Philippe Gerum <rpm@xenomai.org>
Compat mode (CONFIG_COMPAT) allows application binaries built for a 32bit architecture to issue system calls to a 64bit kernel which implements such compatibility support. This work builds on the y2038 sanitization which eliminated the remaining assumptions about the size of native long integers in the same move. Although these changes enable armv7 binaries to call an armv8 kernel, most of them are architecture-independent, and follow this pattern: - .compat_ioctl and .compat_oob_ioctl handlers are added to all file operations structs, pointing at compat_ptr_ioctl() compat_ptr_oob_ioctl() respectively. - all pointers passed by applications within ioctl() argument structs are conveyed as generic 64bit values. These changes mandate upgrading the ABI revision to torvalds#20. Signed-off-by: Philippe Gerum <rpm@xenomai.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Compat mode (CONFIG_COMPAT) allows application binaries built for a 32bit architecture to issue system calls to a 64bit kernel which implements such compatibility support. This work builds on the y2038 sanitization which eliminated the remaining assumptions about the size of native long integers in the same move. Although these changes enable armv7 binaries to call an armv8 kernel, most of them are architecture-independent, and follow this pattern: - .compat_ioctl and .compat_oob_ioctl handlers are added to all file operations structs, pointing at compat_ptr_ioctl() compat_ptr_oob_ioctl() respectively. - all pointers passed by applications within ioctl() argument structs are conveyed as generic 64bit values. These changes mandate upgrading the ABI revision to torvalds#20. Signed-off-by: Philippe Gerum <rpm@xenomai.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
fix UBSAN false positive
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
I found a NULL pointer dereference as followed: BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty torvalds#20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1. RIP: 0010:has_unmovable_pages+0x184/0x360 ... Call Trace: <TASK> set_migratetype_isolate+0xd1/0x180 start_isolate_page_range+0xd2/0x170 alloc_contig_range_noprof+0x101/0x660 alloc_contig_pages_noprof+0x238/0x290 alloc_gigantic_folio.isra.0+0xb6/0x1f0 only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60 alloc_pool_huge_folio+0x80/0xf0 set_max_huge_pages+0x211/0x490 __nr_hugepages_store_common+0x5f/0xe0 nr_hugepages_store+0x77/0x80 kernfs_fop_write_iter+0x118/0x200 vfs_write+0x23c/0x3f0 ksys_write+0x62/0xe0 do_syscall_64+0x5b/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there is a race to free the HugeTLB page between PageHuge() and folio_hstate(). There is no need to add hugetlb_lock here as the HugeTLB page can be freed in lot of places. So it's enough to unfold folio_hstate() and add a check to avoid NULL pointer dereference for hugepage_migration_supported(). Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com Fixes: 464c7ff ("mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
sup