-
Notifications
You must be signed in to change notification settings - Fork 57.1k
small cleanup in CREDITS #12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I believe pull requests like this are not used. https://github.com/torvalds/linux/blob/master/Documentation/HOWTO I recommend reading that. |
Thanks Gunni! Now that I have read my way through the documentation, three small questions before I close this:
|
This is considered a trivial patch. Please consult |
Well, according to https://github.com/torvalds/linux/blob/master/Documentation/SubmittingPatches#L200 you should mail Jiri Kosina trivial@kernel.org. https://github.com/torvalds/linux/blob/master/MAINTAINERS#L6515 However since kernel.org is down i can not say where this should be sent, maybe use vger as a replacement and note that you're mailing there because the correct mail is down. EDIT: yeah or use the mail sc68cal submitted: Jiri Kosina jkosina@suse.cz. |
Thanks a lot! I'll email my patch to Jiri Kosina. |
You also need to amend your commit message to add a sign off line. You need that in the commit message or your patch will not be accepted. Check the docs for more info |
When the cgroup base was allocated with kmalloc, it was necessary to annotate the variable with kmemleak_not_leak(). But because it has recently been changed to be allocated with alloc_page() (which skips kmemleak checks) causes a warning on boot up. I was triggering this output: allocated 8388608 bytes of page_cgroup please try 'cgroup_disable=memory' option if you don't want memory cgroups kmemleak: Trying to color unknown object at 0xf5840000 as Grey Pid: 0, comm: swapper Not tainted 3.0.0-test #12 Call Trace: [<c17e34e6>] ? printk+0x1d/0x1f^M [<c10e2941>] paint_ptr+0x4f/0x78 [<c178ab57>] kmemleak_not_leak+0x58/0x7d [<c108ae9f>] ? __rcu_read_unlock+0x9/0x7d [<c1cdb462>] kmemleak_init+0x19d/0x1e9 [<c1cbf771>] start_kernel+0x346/0x3ec [<c1cbf1b4>] ? loglevel+0x18/0x18 [<c1cbf0aa>] i386_start_kernel+0xaa/0xb0 After a bit of debugging I tracked the object 0xf840000 (and others) down to the cgroup code. The change from allocating base with kmalloc to alloc_page() has the base not calling kmemleak_alloc() which adds the pointer to the object_tree_root, but kmemleak_not_leak() adds it to the crt_early_log[] table. On kmemleak_init(), the entry is found in the early_log[] but not the object_tree_root, and this error message is displayed. If alloc_page() fails then it defaults back to vmalloc() which still uses the kmemleak_alloc() which makes us still need the kmemleak_not_leak() call. The solution is to call the kmemleak_alloc() directly if the alloc_page() succeeds. Reviewed-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Ingo Molnar <mingo@elte.hu> wrote: > The patch below addresses these concerns, serializes the output, tidies up the > printout, resulting in this new output: There's one bug remaining that my patch does not address: the vCPUs are not printed in order: # vCPU #0's dump: # vCPU #2's dump: # vCPU torvalds#24's dump: # vCPU #5's dump: # vCPU torvalds#39's dump: # vCPU torvalds#38's dump: # vCPU torvalds#51's dump: # vCPU torvalds#11's dump: # vCPU torvalds#10's dump: # vCPU torvalds#12's dump: This is undesirable as the order of printout is highly random, so successive dumps are difficult to compare. The patch below serializes the signalling itself. (this is on top of the previous patch) The patch also tweaks the vCPU printout line a bit so that it does not start with '#', which is discarded if such messages are pasted into Git commit messages. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Pekka Enberg <penberg@kernel.org>
If the pte mapping in generic_perform_write() is unmapped between iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the "copied" parameter to ->end_write can be zero. ext4 couldn't cope with it with delayed allocations enabled. This skips the i_disksize enlargement logic if copied is zero and no new data was appeneded to the inode. gdb> bt #0 0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\ 08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467 #1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\ xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512 #2 0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value o\ ptimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440 #3 generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, p\ os=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482 #4 0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\ xffff88001e26be40) at mm/filemap.c:2600 #5 0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimi\ zed out>, pos=<value optimized out>) at mm/filemap.c:2632 #6 0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\ t fs/ext4/file.c:136 #7 0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, \ ppos=0xffff88001e26bf48) at fs/read_write.c:406 #8 0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4\ 000, pos=0xffff88001e26bf48) at fs/read_write.c:435 #9 0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x\ 4000) at fs/read_write.c:487 #10 <signal handler called> #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ () #12 0x0000000000000000 in ?? () gdb> print offset $22 = 0xffffffffffffffff gdb> print idx $23 = 0xffffffff gdb> print inode->i_blkbits $24 = 0xc gdb> up #1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\ xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512 2512 if (ext4_da_should_update_i_disksize(page, end)) { gdb> print start $25 = 0x0 gdb> print end $26 = 0xffffffffffffffff gdb> print pos $27 = 0x108000 gdb> print new_i_size $28 = 0x108000 gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize $29 = 0xd9000 gdb> down 2467 for (i = 0; i < idx; i++) gdb> print i $30 = 0xd44acbee This is 100% reproducible with some autonuma development code tuned in a very aggressive manner (not normal way even for knumad) which does "exotic" changes to the ptes. It wouldn't normally trigger but I don't see why it can't happen normally if the page is added to swap cache in between the two faults leading to "copied" being zero (which then hangs in ext4). So it should be fixed. Especially possible with lumpy reclaim (albeit disabled if compaction is enabled) as that would ignore the young bits in the ptes. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
Overly indented code should be refactored. Suggest refactoring excessive indentation of of if/else/for/do/while/switch statements. For example: $ cat t.c #include <stdio.h> #include <stdlib.h> int main(int argc, char **argv) { if (1) if (2) if (3) if (4) if (5) if (6) if (7) if (8) ; return 0; } $ ./scripts/checkpatch.pl -f t.c WARNING: Too many leading tabs - consider code refactoring #12: FILE: t.c:12: + if (6) WARNING: Too many leading tabs - consider code refactoring #13: FILE: t.c:13: + if (7) WARNING: Too many leading tabs - consider code refactoring #14: FILE: t.c:14: + if (8) total: 0 errors, 3 warnings, 17 lines checked t.c has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not update the real num tx queues. netdev_queue_update_kobjects() is already called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when upper layer driver, e.g., FCoE protocol stack is monitoring the netdev event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove extra queues allocated for FCoE, the associated txq sysfs kobjects are already removed, and trying to update the real num queues would cause something like below: ... PID: 25138 TASK: ffff88021e64c440 CPU: 3 COMMAND: "kworker/3:3" #0 [ffff88021f007760] machine_kexec at ffffffff810226d9 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d #2 [ffff88021f0078a0] oops_end at ffffffff813bca78 #3 [ffff88021f0078d0] no_context at ffffffff81029e72 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e torvalds#6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e torvalds#7 [ffff88021f007b10] page_fault at ffffffff813bc045 [exception RIP: sysfs_find_dirent+17] RIP: ffffffff81178611 RSP: ffff88021f007bc0 RFLAGS: 00010246 RAX: ffff88021e64c440 RBX: ffffffff8156cc63 RCX: 0000000000000004 RDX: ffffffff8156cc63 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88021f007be0 R8: 0000000000000004 R9: 0000000000000008 R10: ffffffff816fed00 R11: 0000000000000004 R12: 0000000000000000 R13: ffffffff8156cc63 R14: 0000000000000000 R15: ffff8802222a0000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 torvalds#8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07 torvalds#9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27 torvalds#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9 torvalds#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38 torvalds#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe] torvalds#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe] torvalds#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe] torvalds#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q] torvalds#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe] torvalds#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe] torvalds#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca torvalds#19 [ffff88021f007e68] worker_thread at ffffffff81060513 torvalds#20 [ffff88021f007ee8] kthread at ffffffff810648b6 torvalds#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4 Signed-off-by: Yi Zou <yi.zou@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not update the real num tx queues. netdev_queue_update_kobjects() is already called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when upper layer driver, e.g., FCoE protocol stack is monitoring the netdev event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove extra queues allocated for FCoE, the associated txq sysfs kobjects are already removed, and trying to update the real num queues would cause something like below: ... PID: 25138 TASK: ffff88021e64c440 CPU: 3 COMMAND: "kworker/3:3" #0 [ffff88021f007760] machine_kexec at ffffffff810226d9 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d #2 [ffff88021f0078a0] oops_end at ffffffff813bca78 #3 [ffff88021f0078d0] no_context at ffffffff81029e72 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e torvalds#6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e torvalds#7 [ffff88021f007b10] page_fault at ffffffff813bc045 [exception RIP: sysfs_find_dirent+17] RIP: ffffffff81178611 RSP: ffff88021f007bc0 RFLAGS: 00010246 RAX: ffff88021e64c440 RBX: ffffffff8156cc63 RCX: 0000000000000004 RDX: ffffffff8156cc63 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88021f007be0 R8: 0000000000000004 R9: 0000000000000008 R10: ffffffff816fed00 R11: 0000000000000004 R12: 0000000000000000 R13: ffffffff8156cc63 R14: 0000000000000000 R15: ffff8802222a0000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 torvalds#8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07 torvalds#9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27 torvalds#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9 torvalds#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38 torvalds#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe] torvalds#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe] torvalds#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe] torvalds#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q] torvalds#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe] torvalds#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe] torvalds#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca torvalds#19 [ffff88021f007e68] worker_thread at ffffffff81060513 torvalds#20 [ffff88021f007ee8] kthread at ffffffff810648b6 torvalds#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4 Signed-off-by: Yi Zou <yi.zou@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 torvalds#6 [d72d3cb4] isolate_migratepages at c030b15a torvalds#7 [d72d3d14] zone_watermark_ok at c02d26cb torvalds#8 [d72d3d2c] compact_zone at c030b8d torvalds#9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 torvalds#6 [d72d3cb4] isolate_migratepages at c030b15a torvalds#7 [d72d3d14] zone_watermark_ok at c02d26cb torvalds#8 [d72d3d2c] compact_zone at c030b8d torvalds#9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
BugLink: http://bugs.launchpad.net/bugs/907778 commit ea51d13 upstream. If the pte mapping in generic_perform_write() is unmapped between iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the "copied" parameter to ->end_write can be zero. ext4 couldn't cope with it with delayed allocations enabled. This skips the i_disksize enlargement logic if copied is zero and no new data was appeneded to the inode. gdb> bt #0 0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\ 08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467 #1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\ xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512 #2 0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value o\ ptimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440 #3 generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, p\ os=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482 #4 0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\ xffff88001e26be40) at mm/filemap.c:2600 #5 0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimi\ zed out>, pos=<value optimized out>) at mm/filemap.c:2632 torvalds#6 0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\ t fs/ext4/file.c:136 torvalds#7 0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, \ ppos=0xffff88001e26bf48) at fs/read_write.c:406 torvalds#8 0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4\ 000, pos=0xffff88001e26bf48) at fs/read_write.c:435 torvalds#9 0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x\ 4000) at fs/read_write.c:487 torvalds#10 <signal handler called> torvalds#11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ () torvalds#12 0x0000000000000000 in ?? () gdb> print offset $22 = 0xffffffffffffffff gdb> print idx $23 = 0xffffffff gdb> print inode->i_blkbits $24 = 0xc gdb> up #1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\ xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512 2512 if (ext4_da_should_update_i_disksize(page, end)) { gdb> print start $25 = 0x0 gdb> print end $26 = 0xffffffffffffffff gdb> print pos $27 = 0x108000 gdb> print new_i_size $28 = 0x108000 gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize $29 = 0xd9000 gdb> down 2467 for (i = 0; i < idx; i++) gdb> print i $30 = 0xd44acbee This is 100% reproducible with some autonuma development code tuned in a very aggressive manner (not normal way even for knumad) which does "exotic" changes to the ptes. It wouldn't normally trigger but I don't see why it can't happen normally if the page is added to swap cache in between the two faults leading to "copied" being zero (which then hangs in ext4). So it should be fixed. Especially possible with lumpy reclaim (albeit disabled if compaction is enabled) as that would ignore the young bits in the ptes. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Brad Figg <brad.figg@canonical.com>
…S block during isolation for migration BugLink: http://bugs.launchpad.net/bugs/931719 commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 torvalds#6 [d72d3cb4] isolate_migratepages at c030b15a torvalds#7 [d72d3d14] zone_watermark_ok at c02d26cb torvalds#8 [d72d3d2c] compact_zone at c030b8d torvalds#9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…S block during isolation for migration commit 0bf380b upstream. When isolating for migration, migration starts at the start of a zone which is not necessarily pageblock aligned. Further, it stops isolating when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally not aligned. This allows isolate_migratepages() to call pfn_to_page() on an invalid PFN which can result in a crash. This was originally reported against a 3.0-based kernel with the following trace in a crash dump. PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s" #0 [d72d3ad0] crash_kexec at c028cfdb #1 [d72d3b24] oops_end at c05c5322 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60 #3 [d72d3bec] bad_area at c0227fb6 #4 [d72d3c00] do_page_fault at c05c72e #5 [d72d3c80] error_code (via page_fault) at c05c47a4 EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000 DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50 CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002 #6 [d72d3cb4] isolate_migratepages at c030b15a #7 [d72d3d14] zone_watermark_ok at c02d26cb #8 [d72d3d2c] compact_zone at c030b8d #9 [d72d3d68] compact_zone_order at c030bba1 torvalds#10 [d72d3db4] try_to_compact_pages at c030bc84 torvalds#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7 torvalds#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7 torvalds#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97 torvalds#14 [d72d3eb8] alloc_pages_vma at c030a845 torvalds#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb torvalds#16 [d72d3f00] handle_mm_fault at c02f36c6 torvalds#17 [d72d3f30] do_page_fault at c05c70ed torvalds#18 [d72d3fb] error_code (via page_fault) at c05c47a4 EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431 DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788 SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50 CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202 It was also reported by Herbert van den Bergh against 3.1-based kernel with the following snippet from the console log. BUG: unable to handle kernel paging request at 01c00008 IP: [<c0522399>] isolate_migratepages+0x119/0x390 *pdpt = 000000002f7ce001 *pde = 0000000000000000 It is expected that it also affects 3.2.x and current mainline. The problem is that pfn_valid is only called on the first PFN being checked and that PFN is not necessarily aligned. Lets say we have a case like this H = MAX_ORDER_NR_PAGES boundary | = pageblock boundary m = cc->migrate_pfn f = cc->free_pfn o = memory hole H------|------H------|----m-Hoooooo|ooooooH-f----|------H The migrate_pfn is just below a memory hole and the free scanner is beyond the hole. When isolate_migratepages started, it scans from migrate_pfn to migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks pfn_valid() on the first PFN but then scans into the hole where there are not necessarily valid struct pages. This patch ensures that isolate_migratepages calls pfn_valid when necessary. Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
BugLink: https://bugs.launchpad.net/bugs/2106770 [ Upstream commit cb358ff ] syzbot presented an use-after-free report [0] regarding ipvlan and linkwatch. ipvlan does not hold a refcnt of the lower device unlike vlan and macvlan. If the linkwatch work is triggered for the ipvlan dev, the lower dev might have already been freed, resulting in UAF of ipvlan->phy_dev in ipvlan_get_iflink(). We can delay the lower dev unregistration like vlan and macvlan by holding the lower dev's refcnt in dev->netdev_ops->ndo_init() and releasing it in dev->priv_destructor(). Jakub pointed out calling .ndo_XXX after unregister_netdevice() has returned is error prone and suggested [1] addressing this UAF in the core by taking commit 750e516 ("net: avoid potential UAF in default_operstate()") further. Let's assume unregistering devices DOWN and use RCU protection in default_operstate() not to race with the device unregistration. [0]: BUG: KASAN: slab-use-after-free in ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353 Read of size 4 at addr ffff0000d768c0e0 by task kworker/u8:35/6944 CPU: 0 UID: 0 PID: 6944 Comm: kworker/u8:35 Not tainted 6.13.0-rc2-g9bc5c9515b48 torvalds#12 4c3cb9e8b4565456f6a355f312ff91f4f29b3c47 Hardware name: linux,dummy-virt (DT) Workqueue: events_unbound linkwatch_event Call trace: show_stack+0x38/0x50 arch/arm64/kernel/stacktrace.c:484 (C) __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xbc/0x108 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x16c/0x6f0 mm/kasan/report.c:489 kasan_report+0xc0/0x120 mm/kasan/report.c:602 __asan_report_load4_noabort+0x20/0x30 mm/kasan/report_generic.c:380 ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353 dev_get_iflink+0x7c/0xd8 net/core/dev.c:674 default_operstate net/core/link_watch.c:45 [inline] rfc2863_policy+0x144/0x360 net/core/link_watch.c:72 linkwatch_do_dev+0x60/0x228 net/core/link_watch.c:175 __linkwatch_run_queue+0x2f4/0x5b8 net/core/link_watch.c:239 linkwatch_event+0x64/0xa8 net/core/link_watch.c:282 process_one_work+0x700/0x1398 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x8c4/0xe10 kernel/workqueue.c:3391 kthread+0x2b0/0x360 kernel/kthread.c:389 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862 Allocated by task 9303: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x30/0x68 mm/kasan/common.c:68 kasan_save_alloc_info+0x44/0x58 mm/kasan/generic.c:568 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x84/0xa0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __do_kmalloc_node mm/slub.c:4283 [inline] __kmalloc_node_noprof+0x2a0/0x560 mm/slub.c:4289 __kvmalloc_node_noprof+0x9c/0x230 mm/util.c:650 alloc_netdev_mqs+0xb4/0x1118 net/core/dev.c:11209 rtnl_create_link+0x2b8/0xb60 net/core/rtnetlink.c:3595 rtnl_newlink_create+0x19c/0x868 net/core/rtnetlink.c:3771 __rtnl_newlink net/core/rtnetlink.c:3896 [inline] rtnl_newlink+0x122c/0x15c0 net/core/rtnetlink.c:4011 rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901 netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542 rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928 netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline] netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347 netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] __sys_sendto+0x2ec/0x438 net/socket.c:2197 __do_sys_sendto net/socket.c:2204 [inline] __se_sys_sendto net/socket.c:2200 [inline] __arm64_sys_sendto+0xe4/0x110 net/socket.c:2200 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151 el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762 el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600 Freed by task 10200: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x30/0x68 mm/kasan/common.c:68 kasan_save_free_info+0x58/0x70 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x48/0x68 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free mm/slub.c:4598 [inline] kfree+0x140/0x420 mm/slub.c:4746 kvfree+0x4c/0x68 mm/util.c:693 netdev_release+0x94/0xc8 net/core/net-sysfs.c:2034 device_release+0x98/0x1c0 kobject_cleanup lib/kobject.c:689 [inline] kobject_release lib/kobject.c:720 [inline] kref_put include/linux/kref.h:65 [inline] kobject_put+0x2b0/0x438 lib/kobject.c:737 netdev_run_todo+0xdd8/0xf48 net/core/dev.c:10924 rtnl_unlock net/core/rtnetlink.c:152 [inline] rtnl_net_unlock net/core/rtnetlink.c:209 [inline] rtnl_dellink+0x484/0x680 net/core/rtnetlink.c:3526 rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901 netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542 rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928 netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline] netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347 netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] ____sys_sendmsg+0x410/0x708 net/socket.c:2583 ___sys_sendmsg+0x178/0x1d8 net/socket.c:2637 __sys_sendmsg net/socket.c:2669 [inline] __do_sys_sendmsg net/socket.c:2674 [inline] __se_sys_sendmsg net/socket.c:2672 [inline] __arm64_sys_sendmsg+0x12c/0x1c8 net/socket.c:2672 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151 el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762 el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600 The buggy address belongs to the object at ffff0000d768c000 which belongs to the cache kmalloc-cg-4k of size 4096 The buggy address is located 224 bytes inside of freed 4096-byte region [ffff0000d768c000, ffff0000d768d000) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x117688 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff0000c77ef981 flags: 0xbfffe0000000040(head|node=0|zone=2|lastcpupid=0x1ffff) page_type: f5(slab) raw: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122 raw: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981 head: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122 head: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981 head: 0bfffe0000000003 fffffdffc35da201 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff0000d768bf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff0000d768c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff0000d768c080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff0000d768c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff0000d768c180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: 8c55fac ("net: linkwatch: only report IF_OPER_LOWERLAYERDOWN if iflink is actually down") Reported-by: syzkaller <syzkaller@googlegroups.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20250102174400.085fd8ac@kernel.org/ [1] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250106071911.64355-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> CVE-2025-21652 Signed-off-by: Manuel Diewald <manuel.diewald@canonical.com> Signed-off-by: Mehmet Basaran <mehmet.basaran@canonical.com>
Symbolize stack traces by creating a live machine. Add this functionality to dump_stack and switch dump_stack users to use it. Switch TUI to use it. Add stack traces to the child test function which can be useful to diagnose blocked code. Example output: ``` $ perf test -vv PERF_RECORD_ ... 7: PERF_RECORD_* events & perf_sample fields: 7: PERF_RECORD_* events & perf_sample fields : Running (1 active) ^C Signal (2) while running tests. Terminating tests with the same signal Internal test harness failure. Completing any started tests: : 7: PERF_RECORD_* events & perf_sample fields: ---- unexpected signal (2) ---- #0 0x5628ad5570a3 in child_test_sig_handler builtin-test.c:0 #1 0x7f561de49df0 in __restore_rt libc_sigaction.c:0 #2 0x7f561de99687 in __internal_syscall_cancel cancellation.c:64 #3 0x7f561dee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72 #4 0x7f561def1393 in __nanosleep nanosleep.c:26 #5 0x7f561df02d68 in __sleep sleep.c:55 torvalds#6 0x5628ad5679ab in test__PERF_RECORD perf-record.c:0 torvalds#7 0x5628ad556fb0 in run_test_child builtin-test.c:0 torvalds#8 0x5628ad4f318d in start_command run-command.c:127 torvalds#9 0x5628ad557ef3 in __cmd_test builtin-test.c:0 torvalds#10 0x5628ad5585bf in cmd_test ??:0 torvalds#11 0x5628ad4e5bb0 in run_builtin perf.c:0 torvalds#12 0x5628ad4e5ecb in handle_internal_command perf.c:0 torvalds#13 0x5628ad461383 in main ??:0 torvalds#14 0x7f561de33ca8 in __libc_start_call_main libc_start_call_main.h:74 torvalds#15 0x7f561de33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128 torvalds#16 0x5628ad4619d1 in _start ??:0 ---- unexpected signal (2) ---- #0 0x5628ad5570a3 in child_test_sig_handler builtin-test.c:0 #1 0x7f561de49df0 in __restore_rt libc_sigaction.c:0 #2 0x7f561dea3a14 in pthread_sigmask@GLIBC_2.2.5 pthread_sigmask.c:45 #3 0x7f561de49fd9 in __GI___sigprocmask sigprocmask.c:26 #4 0x7f561df2601b in __longjmp_chk longjmp.c:36 #5 0x5628ad5570c0 in print_test_result.isra.0 builtin-test.c:0 torvalds#6 0x7f561de49df0 in __restore_rt libc_sigaction.c:0 torvalds#7 0x7f561de99687 in __internal_syscall_cancel cancellation.c:64 torvalds#8 0x7f561dee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72 torvalds#9 0x7f561def1393 in __nanosleep nanosleep.c:26 torvalds#10 0x7f561df02d68 in __sleep sleep.c:55 torvalds#11 0x5628ad5679ab in test__PERF_RECORD perf-record.c:0 torvalds#12 0x5628ad556fb0 in run_test_child builtin-test.c:0 torvalds#13 0x5628ad4f318d in start_command run-command.c:127 torvalds#14 0x5628ad557ef3 in __cmd_test builtin-test.c:0 torvalds#15 0x5628ad5585bf in cmd_test ??:0 torvalds#16 0x5628ad4e5bb0 in run_builtin perf.c:0 torvalds#17 0x5628ad4e5ecb in handle_internal_command perf.c:0 torvalds#18 0x5628ad461383 in main ??:0 torvalds#19 0x7f561de33ca8 in __libc_start_call_main libc_start_call_main.h:74 torvalds#20 0x7f561de33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128 torvalds#21 0x5628ad4619d1 in _start ??:0 7: PERF_RECORD_* events & perf_sample fields : Skip (permissions) ``` Signed-off-by: Ian Rogers <irogers@google.com>
In probe appletb_kbd_probe() a "struct appletb_kbd *kbd" is allocated via devm_kzalloc() to store touch bar keyboard related data. Later on if backlight_device_get_by_name() finds a backlight device with name "appletb_backlight" a timer (kbd->inactivity_timer) is setup with appletb_inactivity_timer() and the timer is armed to run after appletb_tb_dim_timeout (60) seconds. A use-after-free is triggered when failure occurs after the timer is armed. This ultimately means probe failure occurs and as a result the "struct appletb_kbd *kbd" which is device managed memory is freed. After 60 seconds the timer will have expired and __run_timers will attempt to access the timer (kbd->inactivity_timer) however the kdb structure has been freed causing a use-after free. [ 71.636938] ================================================================== [ 71.637915] BUG: KASAN: slab-use-after-free in __run_timers+0x7ad/0x890 [ 71.637915] Write of size 8 at addr ffff8881178c5958 by task swapper/1/0 [ 71.637915] [ 71.637915] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.16.0-rc2-00318-g739a6c93cc75-dirty torvalds#12 PREEMPT(voluntary) [ 71.637915] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 71.637915] Call Trace: [ 71.637915] <IRQ> [ 71.637915] dump_stack_lvl+0x53/0x70 [ 71.637915] print_report+0xce/0x670 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] kasan_report+0xce/0x100 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] __run_timers+0x7ad/0x890 [ 71.637915] ? __pfx___run_timers+0x10/0x10 [ 71.637915] ? update_process_times+0xfc/0x190 [ 71.637915] ? __pfx_update_process_times+0x10/0x10 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 71.637915] run_timer_softirq+0x141/0x240 [ 71.637915] ? __pfx_run_timer_softirq+0x10/0x10 [ 71.637915] ? __pfx___hrtimer_run_queues+0x10/0x10 [ 71.637915] ? kvm_clock_get_cycles+0x18/0x30 [ 71.637915] ? ktime_get+0x60/0x140 [ 71.637915] handle_softirqs+0x1b8/0x5c0 [ 71.637915] ? __pfx_handle_softirqs+0x10/0x10 [ 71.637915] irq_exit_rcu+0xaf/0xe0 [ 71.637915] sysvec_apic_timer_interrupt+0x6c/0x80 [ 71.637915] </IRQ> [ 71.637915] [ 71.637915] Allocated by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] __kasan_kmalloc+0x8f/0xa0 [ 71.637915] __kmalloc_node_track_caller_noprof+0x195/0x420 [ 71.637915] devm_kmalloc+0x74/0x1e0 [ 71.637915] appletb_kbd_probe+0x37/0x3c0 [ 71.637915] hid_device_probe+0x2d1/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] [ 71.637915] [ 71.637915] Freed by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] kasan_save_free_info+0x3b/0x60 [ 71.637915] __kasan_slab_free+0x37/0x50 [ 71.637915] kfree+0xcf/0x360 [ 71.637915] devres_release_group+0x1f8/0x3c0 [ 71.637915] hid_device_probe+0x315/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] The root cause of the issue is that the timer is not disarmed on failure paths leading to it remaining active and accessing freed memory. To fix this call timer_delete_sync() to deactivate the timer. Fixes: 93a0fc4 ("HID: hid-appletb-kbd: add support for automatic brightness control while using the touchbar") Cc: stable@vger.kernel.org Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
In probe appletb_kbd_probe() a "struct appletb_kbd *kbd" is allocated via devm_kzalloc() to store touch bar keyboard related data. Later on if backlight_device_get_by_name() finds a backlight device with name "appletb_backlight" a timer (kbd->inactivity_timer) is setup with appletb_inactivity_timer() and the timer is armed to run after appletb_tb_dim_timeout (60) seconds. A use-after-free is triggered when failure occurs after the timer is armed. This ultimately means probe failure occurs and as a result the "struct appletb_kbd *kbd" which is device managed memory is freed. After 60 seconds the timer will have expired and __run_timers will attempt to access the timer (kbd->inactivity_timer) however the kdb structure has been freed causing a use-after free. [ 71.636938] ================================================================== [ 71.637915] BUG: KASAN: slab-use-after-free in __run_timers+0x7ad/0x890 [ 71.637915] Write of size 8 at addr ffff8881178c5958 by task swapper/1/0 [ 71.637915] [ 71.637915] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.16.0-rc2-00318-g739a6c93cc75-dirty torvalds#12 PREEMPT(voluntary) [ 71.637915] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 71.637915] Call Trace: [ 71.637915] <IRQ> [ 71.637915] dump_stack_lvl+0x53/0x70 [ 71.637915] print_report+0xce/0x670 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] kasan_report+0xce/0x100 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] __run_timers+0x7ad/0x890 [ 71.637915] ? __pfx___run_timers+0x10/0x10 [ 71.637915] ? update_process_times+0xfc/0x190 [ 71.637915] ? __pfx_update_process_times+0x10/0x10 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 71.637915] run_timer_softirq+0x141/0x240 [ 71.637915] ? __pfx_run_timer_softirq+0x10/0x10 [ 71.637915] ? __pfx___hrtimer_run_queues+0x10/0x10 [ 71.637915] ? kvm_clock_get_cycles+0x18/0x30 [ 71.637915] ? ktime_get+0x60/0x140 [ 71.637915] handle_softirqs+0x1b8/0x5c0 [ 71.637915] ? __pfx_handle_softirqs+0x10/0x10 [ 71.637915] irq_exit_rcu+0xaf/0xe0 [ 71.637915] sysvec_apic_timer_interrupt+0x6c/0x80 [ 71.637915] </IRQ> [ 71.637915] [ 71.637915] Allocated by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] __kasan_kmalloc+0x8f/0xa0 [ 71.637915] __kmalloc_node_track_caller_noprof+0x195/0x420 [ 71.637915] devm_kmalloc+0x74/0x1e0 [ 71.637915] appletb_kbd_probe+0x37/0x3c0 [ 71.637915] hid_device_probe+0x2d1/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] [ 71.637915] [ 71.637915] Freed by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] kasan_save_free_info+0x3b/0x60 [ 71.637915] __kasan_slab_free+0x37/0x50 [ 71.637915] kfree+0xcf/0x360 [ 71.637915] devres_release_group+0x1f8/0x3c0 [ 71.637915] hid_device_probe+0x315/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] The root cause of the issue is that the timer is not disarmed on failure paths leading to it remaining active and accessing freed memory. To fix this call timer_delete_sync() to deactivate the timer. Another small issue is that timer_delete_sync is called unconditionally in appletb_kbd_remove(), fix this by checking for a valid kbd->backlight_dev before calling timer_delete_sync. Fixes: 93a0fc4 ("HID: hid-appletb-kbd: add support for automatic brightness control while using the touchbar") Cc: stable@vger.kernel.org Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Without the change `perf `hangs up on charaster devices. On my system it's enough to run system-wide sampler for a few seconds to get the hangup: $ perf record -a -g --call-graph=dwarf $ perf report # hung `strace` shows that hangup happens on reading on a character device `/dev/dri/renderD128` $ strace -y -f -p 2780484 strace: Process 2780484 attached pread64(101</dev/dri/renderD128>, strace: Process 2780484 detached It's call trace descends into `elfutils`: $ gdb -p 2780484 (gdb) bt #0 0x00007f5e508f04b7 in __libc_pread64 (fd=101, buf=0x7fff9df7edb0, count=0, offset=0) at ../sysdeps/unix/sysv/linux/pread64.c:25 #1 0x00007f5e52b79515 in read_file () from /<<NIX>>/elfutils-0.192/lib/libelf.so.1 #2 0x00007f5e52b25666 in libdw_open_elf () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 #3 0x00007f5e52b25907 in __libdw_open_file () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 #4 0x00007f5e52b120a9 in dwfl_report_elf@@ELFUTILS_0.156 () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 #5 0x000000000068bf20 in __report_module (al=al@entry=0x7fff9df80010, ip=ip@entry=139803237033216, ui=ui@entry=0x5369b5e0) at util/dso.h:537 torvalds#6 0x000000000068c3d1 in report_module (ip=139803237033216, ui=0x5369b5e0) at util/unwind-libdw.c:114 torvalds#7 frame_callback (state=0x535aef10, arg=0x5369b5e0) at util/unwind-libdw.c:242 torvalds#8 0x00007f5e52b261d3 in dwfl_thread_getframes () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 torvalds#9 0x00007f5e52b25bdb in get_one_thread_cb () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 torvalds#10 0x00007f5e52b25faa in dwfl_getthreads () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 torvalds#11 0x00007f5e52b26514 in dwfl_getthread_frames () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1 torvalds#12 0x000000000068c6ce in unwind__get_entries (cb=cb@entry=0x5d4620 <unwind_entry>, arg=arg@entry=0x10cd5fa0, thread=thread@entry=0x1076a290, data=data@entry=0x7fff9df80540, max_stack=max_stack@entry=127, best_effort=best_effort@entry=false) at util/thread.h:152 torvalds#13 0x00000000005dae95 in thread__resolve_callchain_unwind (evsel=0x106006d0, thread=0x1076a290, cursor=0x10cd5fa0, sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2939 torvalds#14 thread__resolve_callchain_unwind (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2920 torvalds#15 __thread__resolve_callchain (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, evsel@entry=0x7fff9df80440, sample=0x7fff9df80540, parent=parent@entry=0x7fff9df804a0, root_al=root_al@entry=0x7fff9df80440, max_stack=127, symbols=true) at util/machine.c:2970 torvalds#16 0x00000000005d0cb2 in thread__resolve_callchain (thread=<optimized out>, cursor=<optimized out>, evsel=0x7fff9df80440, sample=<optimized out>, parent=0x7fff9df804a0, root_al=0x7fff9df80440, max_stack=127) at util/machine.h:198 torvalds#17 sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fff9df804a0, evsel=evsel@entry=0x106006d0, al=al@entry=0x7fff9df80440, max_stack=max_stack@entry=127) at util/callchain.c:1127 torvalds#18 0x0000000000617e08 in hist_entry_iter__add (iter=iter@entry=0x7fff9df80480, al=al@entry=0x7fff9df80440, max_stack_depth=127, arg=arg@entry=0x7fff9df81ae0) at util/hist.c:1255 torvalds#19 0x000000000045d2d0 in process_sample_event (tool=0x7fff9df81ae0, event=<optimized out>, sample=0x7fff9df80540, evsel=0x106006d0, machine=<optimized out>) at builtin-report.c:334 torvalds#20 0x00000000005e3bb1 in perf_session__deliver_event (session=0x105ff2c0, event=0x7f5c7d735ca0, tool=0x7fff9df81ae0, file_offset=2914716832, file_path=0x105ffbf0 "perf.data") at util/session.c:1367 torvalds#21 0x00000000005e8d93 in do_flush (oe=0x105ffa50, show_progress=false) at util/ordered-events.c:245 torvalds#22 __ordered_events__flush (oe=0x105ffa50, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324 torvalds#23 0x00000000005e1f64 in perf_session__process_user_event (session=0x105ff2c0, event=0x7f5c7d752b18, file_offset=2914835224, file_path=0x105ffbf0 "perf.data") at util/session.c:1419 torvalds#24 0x00000000005e47c7 in reader__read_event (rd=rd@entry=0x7fff9df81260, session=session@entry=0x105ff2c0, --Type <RET> for more, q to quit, c to continue without paging-- quit prog=prog@entry=0x7fff9df81220) at util/session.c:2132 torvalds#25 0x00000000005e4b37 in reader__process_events (rd=0x7fff9df81260, session=0x105ff2c0, prog=0x7fff9df81220) at util/session.c:2181 torvalds#26 __perf_session__process_events (session=0x105ff2c0) at util/session.c:2226 torvalds#27 perf_session__process_events (session=session@entry=0x105ff2c0) at util/session.c:2390 torvalds#28 0x0000000000460add in __cmd_report (rep=0x7fff9df81ae0) at builtin-report.c:1076 torvalds#29 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1827 torvalds#30 0x00000000004c5a40 in run_builtin (p=p@entry=0xd8f7f8 <commands+312>, argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:351 torvalds#31 0x00000000004c5d63 in handle_internal_command (argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:404 torvalds#32 0x0000000000442de3 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:448 torvalds#33 main (argc=<optimized out>, argv=0x7fff9df844b0) at perf.c:556 The hangup happens because nothing in` perf` or `elfutils` checks if a mapped file is easily readable. The change conservatively skips all non-regular files. Signed-off-by: Sergei Trofimovich <slyich@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250505174419.2814857-1-slyich@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Symbolize stack traces by creating a live machine. Add this functionality to dump_stack and switch dump_stack users to use it. Switch TUI to use it. Add stack traces to the child test function which can be useful to diagnose blocked code. Example output: ``` $ perf test -vv PERF_RECORD_ ... 7: PERF_RECORD_* events & perf_sample fields: 7: PERF_RECORD_* events & perf_sample fields : Running (1 active) ^C Signal (2) while running tests. Terminating tests with the same signal Internal test harness failure. Completing any started tests: : 7: PERF_RECORD_* events & perf_sample fields: ---- unexpected signal (2) ---- #0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0 #1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0 #2 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64 #3 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72 #4 0x7fc12fef1393 in __nanosleep nanosleep.c:26 #5 0x7fc12ff02d68 in __sleep sleep.c:55 torvalds#6 0x55788c63196b in test__PERF_RECORD perf-record.c:0 torvalds#7 0x55788c620fb0 in run_test_child builtin-test.c:0 torvalds#8 0x55788c5bd18d in start_command run-command.c:127 torvalds#9 0x55788c621ef3 in __cmd_test builtin-test.c:0 torvalds#10 0x55788c6225bf in cmd_test ??:0 torvalds#11 0x55788c5afbd0 in run_builtin perf.c:0 torvalds#12 0x55788c5afeeb in handle_internal_command perf.c:0 torvalds#13 0x55788c52b383 in main ??:0 torvalds#14 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74 torvalds#15 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128 torvalds#16 0x55788c52b9d1 in _start ??:0 ---- unexpected signal (2) ---- #0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0 #1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0 #2 0x7fc12fea3a14 in pthread_sigmask@GLIBC_2.2.5 pthread_sigmask.c:45 #3 0x7fc12fe49fd9 in __GI___sigprocmask sigprocmask.c:26 #4 0x7fc12ff2601b in __longjmp_chk longjmp.c:36 #5 0x55788c6210c0 in print_test_result.isra.0 builtin-test.c:0 torvalds#6 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0 torvalds#7 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64 torvalds#8 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72 torvalds#9 0x7fc12fef1393 in __nanosleep nanosleep.c:26 torvalds#10 0x7fc12ff02d68 in __sleep sleep.c:55 torvalds#11 0x55788c63196b in test__PERF_RECORD perf-record.c:0 torvalds#12 0x55788c620fb0 in run_test_child builtin-test.c:0 torvalds#13 0x55788c5bd18d in start_command run-command.c:127 torvalds#14 0x55788c621ef3 in __cmd_test builtin-test.c:0 torvalds#15 0x55788c6225bf in cmd_test ??:0 torvalds#16 0x55788c5afbd0 in run_builtin perf.c:0 torvalds#17 0x55788c5afeeb in handle_internal_command perf.c:0 torvalds#18 0x55788c52b383 in main ??:0 torvalds#19 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74 torvalds#20 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128 torvalds#21 0x55788c52b9d1 in _start ??:0 7: PERF_RECORD_* events & perf_sample fields : Skip (permissions) ``` Signed-off-by: Ian Rogers <irogers@google.com>
The M100PFSEVP can boot from the SD-Card or the eMMC depending on the level of GPIO torvalds#12. We re-introduce a device tree source for both cases, "mpfs-m100pfsevp-sdcard.dts" and "mpfs-m100pfsevp-sdcard.dts". These include the common "mpfs-m100pfsevp.dtsi". Signed-off-by: Wolfgang Grandegger <wg@aries-embedded.de> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Symbolize stack traces by creating a live machine. Add this functionality to dump_stack and switch dump_stack users to use it. Switch TUI to use it. Add stack traces to the child test function which can be useful to diagnose blocked code. Example output: ``` $ perf test -vv PERF_RECORD_ ... 7: PERF_RECORD_* events & perf_sample fields: 7: PERF_RECORD_* events & perf_sample fields : Running (1 active) ^C Signal (2) while running tests. Terminating tests with the same signal Internal test harness failure. Completing any started tests: : 7: PERF_RECORD_* events & perf_sample fields: ---- unexpected signal (2) ---- #0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0 #1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0 #2 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64 #3 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72 #4 0x7fc12fef1393 in __nanosleep nanosleep.c:26 #5 0x7fc12ff02d68 in __sleep sleep.c:55 torvalds#6 0x55788c63196b in test__PERF_RECORD perf-record.c:0 torvalds#7 0x55788c620fb0 in run_test_child builtin-test.c:0 torvalds#8 0x55788c5bd18d in start_command run-command.c:127 torvalds#9 0x55788c621ef3 in __cmd_test builtin-test.c:0 torvalds#10 0x55788c6225bf in cmd_test ??:0 torvalds#11 0x55788c5afbd0 in run_builtin perf.c:0 torvalds#12 0x55788c5afeeb in handle_internal_command perf.c:0 torvalds#13 0x55788c52b383 in main ??:0 torvalds#14 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74 torvalds#15 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128 torvalds#16 0x55788c52b9d1 in _start ??:0 ---- unexpected signal (2) ---- #0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0 #1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0 #2 0x7fc12fea3a14 in pthread_sigmask@GLIBC_2.2.5 pthread_sigmask.c:45 #3 0x7fc12fe49fd9 in __GI___sigprocmask sigprocmask.c:26 #4 0x7fc12ff2601b in __longjmp_chk longjmp.c:36 #5 0x55788c6210c0 in print_test_result.isra.0 builtin-test.c:0 torvalds#6 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0 torvalds#7 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64 torvalds#8 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72 torvalds#9 0x7fc12fef1393 in __nanosleep nanosleep.c:26 torvalds#10 0x7fc12ff02d68 in __sleep sleep.c:55 torvalds#11 0x55788c63196b in test__PERF_RECORD perf-record.c:0 torvalds#12 0x55788c620fb0 in run_test_child builtin-test.c:0 torvalds#13 0x55788c5bd18d in start_command run-command.c:127 torvalds#14 0x55788c621ef3 in __cmd_test builtin-test.c:0 torvalds#15 0x55788c6225bf in cmd_test ??:0 torvalds#16 0x55788c5afbd0 in run_builtin perf.c:0 torvalds#17 0x55788c5afeeb in handle_internal_command perf.c:0 torvalds#18 0x55788c52b383 in main ??:0 torvalds#19 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74 torvalds#20 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128 torvalds#21 0x55788c52b9d1 in _start ??:0 7: PERF_RECORD_* events & perf_sample fields : Skip (permissions) ``` Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624210500.2121303-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
commit 54ec8b0 upstream. echo_skb_max should define the supported upper limit of echo_skb[] allocated inside the netdevice's priv. The corresponding size value provided by this driver to alloc_candev() is KVASER_PCIEFD_CAN_TX_MAX_COUNT which is 17. But later echo_skb_max is rounded up to the nearest power of two (for the max case, that would be 32) and the tx/ack indices calculated further during tx/rx may exceed the upper array boundary. Kasan reported this for the ack case inside kvaser_pciefd_handle_ack_packet(), though the xmit function has actually caught the same thing earlier. BUG: KASAN: slab-out-of-bounds in kvaser_pciefd_handle_ack_packet+0x2d7/0x92a drivers/net/can/kvaser_pciefd.c:1528 Read of size 8 at addr ffff888105e4f078 by task swapper/4/0 CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Not tainted 6.15.0 torvalds#12 PREEMPT(voluntary) Call Trace: <IRQ> dump_stack_lvl lib/dump_stack.c:122 print_report mm/kasan/report.c:521 kasan_report mm/kasan/report.c:634 kvaser_pciefd_handle_ack_packet drivers/net/can/kvaser_pciefd.c:1528 kvaser_pciefd_read_packet drivers/net/can/kvaser_pciefd.c:1605 kvaser_pciefd_read_buffer drivers/net/can/kvaser_pciefd.c:1656 kvaser_pciefd_receive_irq drivers/net/can/kvaser_pciefd.c:1684 kvaser_pciefd_irq_handler drivers/net/can/kvaser_pciefd.c:1733 __handle_irq_event_percpu kernel/irq/handle.c:158 handle_irq_event kernel/irq/handle.c:210 handle_edge_irq kernel/irq/chip.c:833 __common_interrupt arch/x86/kernel/irq.c:296 common_interrupt arch/x86/kernel/irq.c:286 </IRQ> Tx max count definitely matters for kvaser_pciefd_tx_avail(), but for seq numbers' generation that's not the case - we're free to calculate them as would be more convenient, not taking tx max count into account. The only downside is that the size of echo_skb[] should correspond to the max seq number (not tx max count), so in some situations a bit more memory would be consumed than could be. Thus make the size of the underlying echo_skb[] sufficient for the rounded max tx value. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 8256e0c ("can: kvaser_pciefd: Fix echo_skb race") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Link: https://patch.msgid.link/20250528192713.63894-1-pchelkin@ispras.ru Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eedf3e3 ] ACPICA commit 1c28da2242783579d59767617121035dafba18c3 This was originally done in NetBSD: NetBSD/src@b69d1ac and is the correct alternative to the smattering of `memcpy`s I previously contributed to this repository. This also sidesteps the newly strict checks added in UBSAN: llvm/llvm-project@7926744 Before this change we see the following UBSAN stack trace in Fuchsia: #0 0x000021afcfdeca5e in acpi_rs_get_address_common(struct acpi_resource*, union aml_resource*) ../../third_party/acpica/source/components/resources/rsaddr.c:329 <platform-bus-x86.so>+0x6aca5e #1.2 0x000021982bc4af3c in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x41f3c #1.1 0x000021982bc4af3c in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x41f3c #1 0x000021982bc4af3c in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:395 <libclang_rt.asan.so>+0x41f3c #2 0x000021982bc4bb6f in handletype_mismatch_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:137 <libclang_rt.asan.so>+0x42b6f #3 0x000021982bc4b723 in __ubsan_handle_type_mismatch_v1 compiler-rt/lib/ubsan/ubsan_handlers.cpp:142 <libclang_rt.asan.so>+0x42723 #4 0x000021afcfdeca5e in acpi_rs_get_address_common(struct acpi_resource*, union aml_resource*) ../../third_party/acpica/source/components/resources/rsaddr.c:329 <platform-bus-x86.so>+0x6aca5e #5 0x000021afcfdf2089 in acpi_rs_convert_aml_to_resource(struct acpi_resource*, union aml_resource*, struct acpi_rsconvert_info*) ../../third_party/acpica/source/components/resources/rsmisc.c:355 <platform-bus-x86.so>+0x6b2089 #6 0x000021afcfded169 in acpi_rs_convert_aml_to_resources(u8*, u32, u32, u8, void**) ../../third_party/acpica/source/components/resources/rslist.c:137 <platform-bus-x86.so>+0x6ad169 #7 0x000021afcfe2d24a in acpi_ut_walk_aml_resources(struct acpi_walk_state*, u8*, acpi_size, acpi_walk_aml_callback, void**) ../../third_party/acpica/source/components/utilities/utresrc.c:237 <platform-bus-x86.so>+0x6ed24a #8 0x000021afcfde66b7 in acpi_rs_create_resource_list(union acpi_operand_object*, struct acpi_buffer*) ../../third_party/acpica/source/components/resources/rscreate.c:199 <platform-bus-x86.so>+0x6a66b7 #9 0x000021afcfdf6979 in acpi_rs_get_method_data(acpi_handle, const char*, struct acpi_buffer*) ../../third_party/acpica/source/components/resources/rsutils.c:770 <platform-bus-x86.so>+0x6b6979 torvalds#10 0x000021afcfdf708f in acpi_walk_resources(acpi_handle, char*, acpi_walk_resource_callback, void*) ../../third_party/acpica/source/components/resources/rsxface.c:731 <platform-bus-x86.so>+0x6b708f torvalds#11 0x000021afcfa95dcf in acpi::acpi_impl::walk_resources(acpi::acpi_impl*, acpi_handle, const char*, acpi::Acpi::resources_callable) ../../src/devices/board/lib/acpi/acpi-impl.cc:41 <platform-bus-x86.so>+0x355dcf torvalds#12 0x000021afcfaa8278 in acpi::device_builder::gather_resources(acpi::device_builder*, acpi::Acpi*, fidl::any_arena&, acpi::Manager*, acpi::device_builder::gather_resources_callback) ../../src/devices/board/lib/acpi/device-builder.cc:84 <platform-bus-x86.so>+0x368278 torvalds#13 0x000021afcfbddb87 in acpi::Manager::configure_discovered_devices(acpi::Manager*) ../../src/devices/board/lib/acpi/manager.cc:75 <platform-bus-x86.so>+0x49db87 torvalds#14 0x000021afcf99091d in publish_acpi_devices(acpi::Manager*, zx_device_t*, zx_device_t*) ../../src/devices/board/drivers/x86/acpi-nswalk.cc:95 <platform-bus-x86.so>+0x25091d torvalds#15 0x000021afcf9c1d4e in x86::X86::do_init(x86::X86*) ../../src/devices/board/drivers/x86/x86.cc:60 <platform-bus-x86.so>+0x281d4e torvalds#16 0x000021afcf9e33ad in λ(x86::X86::ddk_init::(anon class)*) ../../src/devices/board/drivers/x86/x86.cc:77 <platform-bus-x86.so>+0x2a33ad torvalds#17 0x000021afcf9e313e in fit::internal::target<(lambda at../../src/devices/board/drivers/x86/x86.cc:76:19), false, false, std::__2::allocator<std::byte>, void>::invoke(void*) ../../sdk/lib/fit/include/lib/fit/internal/function.h:183 <platform-bus-x86.so>+0x2a313e torvalds#18 0x000021afcfbab4c7 in fit::internal::function_base<16UL, false, void(), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<16UL, false, void (), std::__2::allocator<std::byte> >*) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <platform-bus-x86.so>+0x46b4c7 torvalds#19 0x000021afcfbab342 in fit::function_impl<16UL, false, void(), std::__2::allocator<std::byte>>::operator()(const fit::function_impl<16UL, false, void (), std::__2::allocator<std::byte> >*) ../../sdk/lib/fit/include/lib/fit/function.h:315 <platform-bus-x86.so>+0x46b342 torvalds#20 0x000021afcfcd98c3 in async::internal::retained_task::Handler(async_dispatcher_t*, async_task_t*, zx_status_t) ../../sdk/lib/async/task.cc:24 <platform-bus-x86.so>+0x5998c3 torvalds#21 0x00002290f9924616 in λ(const driver_runtime::Dispatcher::post_task::(anon class)*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, zx_status_t) ../../src/devices/bin/driver_runtime/dispatcher.cc:789 <libdriver_runtime.so>+0x10a616 torvalds#22 0x00002290f9924323 in fit::internal::target<(lambda at../../src/devices/bin/driver_runtime/dispatcher.cc:788:7), true, false, std::__2::allocator<std::byte>, void, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int>::invoke(void*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/internal/function.h:128 <libdriver_runtime.so>+0x10a323 torvalds#23 0x00002290f9904b76 in fit::internal::function_base<24UL, true, void(std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<24UL, true, void (std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <libdriver_runtime.so>+0xeab76 torvalds#24 0x00002290f9904831 in fit::callback_impl<24UL, true, void(std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int), std::__2::allocator<std::byte>>::operator()(fit::callback_impl<24UL, true, void (std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/function.h:471 <libdriver_runtime.so>+0xea831 torvalds#25 0x00002290f98d5adc in driver_runtime::callback_request::Call(driver_runtime::callback_request*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, zx_status_t) ../../src/devices/bin/driver_runtime/callback_request.h:74 <libdriver_runtime.so>+0xbbadc torvalds#26 0x00002290f98e1e58 in driver_runtime::Dispatcher::dispatch_callback(driver_runtime::Dispatcher*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >) ../../src/devices/bin/driver_runtime/dispatcher.cc:1248 <libdriver_runtime.so>+0xc7e58 torvalds#27 0x00002290f98e4159 in driver_runtime::Dispatcher::dispatch_callbacks(driver_runtime::Dispatcher*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.cc:1308 <libdriver_runtime.so>+0xca159 torvalds#28 0x00002290f9918414 in λ(const driver_runtime::Dispatcher::create_with_adder::(anon class)*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.cc:353 <libdriver_runtime.so>+0xfe414 torvalds#29 0x00002290f991812d in fit::internal::target<(lambda at../../src/devices/bin/driver_runtime/dispatcher.cc:351:7), true, false, std::__2::allocator<std::byte>, void, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>>::invoke(void*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/internal/function.h:128 <libdriver_runtime.so>+0xfe12d torvalds#30 0x00002290f9906fc7 in fit::internal::function_base<8UL, true, void(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<8UL, true, void (std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <libdriver_runtime.so>+0xecfc7 torvalds#31 0x00002290f9906c66 in fit::function_impl<8UL, true, void(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte>>::operator()(const fit::function_impl<8UL, true, void (std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/function.h:315 <libdriver_runtime.so>+0xecc66 torvalds#32 0x00002290f98e73d9 in driver_runtime::Dispatcher::event_waiter::invoke_callback(driver_runtime::Dispatcher::event_waiter*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.h:543 <libdriver_runtime.so>+0xcd3d9 torvalds#33 0x00002290f98e700d in driver_runtime::Dispatcher::event_waiter::handle_event(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, async_dispatcher_t*, async::wait_base*, zx_status_t, zx_packet_signal_t const*) ../../src/devices/bin/driver_runtime/dispatcher.cc:1442 <libdriver_runtime.so>+0xcd00d torvalds#34 0x00002290f9918983 in async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>::handle_event(async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>*, async_dispatcher_t*, async::wait_base*, zx_status_t, zx_packet_signal_t const*) ../../src/devices/bin/driver_runtime/async_loop_owned_event_handler.h:59 <libdriver_runtime.so>+0xfe983 torvalds#35 0x00002290f9918b9e in async::wait_method<async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>, &async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>::handle_event>::call_handler(async_dispatcher_t*, async_wait_t*, zx_status_t, zx_packet_signal_t const*) ../../sdk/lib/async/include/lib/async/cpp/wait.h:201 <libdriver_runtime.so>+0xfeb9e torvalds#36 0x00002290f99bf509 in async_loop_dispatch_wait(async_loop_t*, async_wait_t*, zx_status_t, zx_packet_signal_t const*) ../../sdk/lib/async-loop/loop.c:394 <libdriver_runtime.so>+0x1a5509 torvalds#37 0x00002290f99b9958 in async_loop_run_once(async_loop_t*, zx_time_t) ../../sdk/lib/async-loop/loop.c:343 <libdriver_runtime.so>+0x19f958 torvalds#38 0x00002290f99b9247 in async_loop_run(async_loop_t*, zx_time_t, _Bool) ../../sdk/lib/async-loop/loop.c:301 <libdriver_runtime.so>+0x19f247 torvalds#39 0x00002290f99ba962 in async_loop_run_thread(void*) ../../sdk/lib/async-loop/loop.c:860 <libdriver_runtime.so>+0x1a0962 torvalds#40 0x000041afd176ef30 in start_c11(void*) ../../zircon/third_party/ulib/musl/pthread/pthread_create.c:63 <libc.so>+0x84f30 torvalds#41 0x000041afd18a448d in thread_trampoline(uintptr_t, uintptr_t) ../../zircon/system/ulib/runtime/thread.cc:100 <libc.so>+0x1ba48d Link: acpica/acpica@1c28da22 Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/4664267.LvFx2qVVIh@rjwysocki.net Signed-off-by: Tamir Duberstein <tamird@gmail.com> [ rjw: Pick up the tag from Tamir ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Calling perf top with branch filters enabled on Intel CPU's with branch counters logging (A.K.A LBR event logging [1]) support results in a segfault. $ perf top -e '{cpu_core/cpu-cycles/,cpu_core/event=0xc6,umask=0x3,frontend=0x11,name=frontend_retired_dsb_miss/}' -j any,counter ... Thread 27 "perf" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffafff76c0 (LWP 949003)] perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653 653 *width = env->cpu_pmu_caps ? env->br_cntr_width : (gdb) bt #0 perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653 #1 0x00000000005b1599 in symbol__account_br_cntr (branch=0x7fffcc3db580, evsel=0xfea2d0, offset=12, br_cntr=8) at util/annotate.c:345 #2 0x00000000005b17fb in symbol__account_cycles (addr=5658172, start=5658160, sym=0x7fffcc0ee420, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:389 #3 0x00000000005b1976 in addr_map_symbol__account_cycles (ams=0x7fffcd7b01d0, start=0x7fffcd7b02b0, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:422 #4 0x000000000068d57f in hist__account_cycles (bs=0x110d288, al=0x7fffafff6540, sample=0x7fffafff6760, nonany_branch_mode=false, total_cycles=0x0, evsel=0xfea2d0) at util/hist.c:2850 #5 0x0000000000446216 in hist_iter__top_callback (iter=0x7fffafff6590, al=0x7fffafff6540, single=true, arg=0x7fffffff9e00) at builtin-top.c:737 torvalds#6 0x0000000000689787 in hist_entry_iter__add (iter=0x7fffafff6590, al=0x7fffafff6540, max_stack_depth=127, arg=0x7fffffff9e00) at util/hist.c:1359 torvalds#7 0x0000000000446710 in perf_event__process_sample (tool=0x7fffffff9e00, event=0x110d250, evsel=0xfea2d0, sample=0x7fffafff6760, machine=0x108c968) at builtin-top.c:845 torvalds#8 0x0000000000447735 in deliver_event (qe=0x7fffffffa120, qevent=0x10fc200) at builtin-top.c:1211 torvalds#9 0x000000000064ccae in do_flush (oe=0x7fffffffa120, show_progress=false) at util/ordered-events.c:245 torvalds#10 0x000000000064d005 in __ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324 torvalds#11 0x000000000064d0ef in ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP) at util/ordered-events.c:342 torvalds#12 0x00000000004472a9 in process_thread (arg=0x7fffffff9e00) at builtin-top.c:1120 torvalds#13 0x00007ffff6e7dba8 in start_thread (arg=<optimized out>) at pthread_create.c:448 torvalds#14 0x00007ffff6f01b8c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78 The cause is that perf_env__find_br_cntr_info tries to access a null pointer pmu_caps in the perf_env struct. A similar issue exists for homogeneous core systems which use the cpu_pmu_caps structure. Fix this by populating cpu_pmu_caps and pmu_caps structures with values from sysfs when calling perf top with branch stack sampling enabled. [1], LBR event logging introduced here: https://lore.kernel.org/all/20231025201626.3000228-5-kan.liang@linux.intel.com/ Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250612163659.1357950-2-thomas.falcon@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Here is the first fix for torvalds#12, which fix the limited cpufreq for schedutil. But it still tend to stay at max cpufreq.
In probe appletb_kbd_probe() a "struct appletb_kbd *kbd" is allocated via devm_kzalloc() to store touch bar keyboard related data. Later on if backlight_device_get_by_name() finds a backlight device with name "appletb_backlight" a timer (kbd->inactivity_timer) is setup with appletb_inactivity_timer() and the timer is armed to run after appletb_tb_dim_timeout (60) seconds. A use-after-free is triggered when failure occurs after the timer is armed. This ultimately means probe failure occurs and as a result the "struct appletb_kbd *kbd" which is device managed memory is freed. After 60 seconds the timer will have expired and __run_timers will attempt to access the timer (kbd->inactivity_timer) however the kdb structure has been freed causing a use-after free. [ 71.636938] ================================================================== [ 71.637915] BUG: KASAN: slab-use-after-free in __run_timers+0x7ad/0x890 [ 71.637915] Write of size 8 at addr ffff8881178c5958 by task swapper/1/0 [ 71.637915] [ 71.637915] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.16.0-rc2-00318-g739a6c93cc75-dirty torvalds#12 PREEMPT(voluntary) [ 71.637915] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 71.637915] Call Trace: [ 71.637915] <IRQ> [ 71.637915] dump_stack_lvl+0x53/0x70 [ 71.637915] print_report+0xce/0x670 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] kasan_report+0xce/0x100 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] __run_timers+0x7ad/0x890 [ 71.637915] ? __pfx___run_timers+0x10/0x10 [ 71.637915] ? update_process_times+0xfc/0x190 [ 71.637915] ? __pfx_update_process_times+0x10/0x10 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 71.637915] run_timer_softirq+0x141/0x240 [ 71.637915] ? __pfx_run_timer_softirq+0x10/0x10 [ 71.637915] ? __pfx___hrtimer_run_queues+0x10/0x10 [ 71.637915] ? kvm_clock_get_cycles+0x18/0x30 [ 71.637915] ? ktime_get+0x60/0x140 [ 71.637915] handle_softirqs+0x1b8/0x5c0 [ 71.637915] ? __pfx_handle_softirqs+0x10/0x10 [ 71.637915] irq_exit_rcu+0xaf/0xe0 [ 71.637915] sysvec_apic_timer_interrupt+0x6c/0x80 [ 71.637915] </IRQ> [ 71.637915] [ 71.637915] Allocated by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] __kasan_kmalloc+0x8f/0xa0 [ 71.637915] __kmalloc_node_track_caller_noprof+0x195/0x420 [ 71.637915] devm_kmalloc+0x74/0x1e0 [ 71.637915] appletb_kbd_probe+0x37/0x3c0 [ 71.637915] hid_device_probe+0x2d1/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] [ 71.637915] [ 71.637915] Freed by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] kasan_save_free_info+0x3b/0x60 [ 71.637915] __kasan_slab_free+0x37/0x50 [ 71.637915] kfree+0xcf/0x360 [ 71.637915] devres_release_group+0x1f8/0x3c0 [ 71.637915] hid_device_probe+0x315/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] The root cause of the issue is that the timer is not disarmed on failure paths leading to it remaining active and accessing freed memory. To fix this call timer_delete_sync() to deactivate the timer. Another small issue is that timer_delete_sync is called unconditionally in appletb_kbd_remove(), fix this by checking for a valid kbd->backlight_dev before calling timer_delete_sync. Fixes: 93a0fc4 ("HID: hid-appletb-kbd: add support for automatic brightness control while using the touchbar") Cc: stable@vger.kernel.org Signed-off-by: Qasim Ijaz <qasdev00@gmail.com> Reviewed-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
commit 54ec8b0 upstream. echo_skb_max should define the supported upper limit of echo_skb[] allocated inside the netdevice's priv. The corresponding size value provided by this driver to alloc_candev() is KVASER_PCIEFD_CAN_TX_MAX_COUNT which is 17. But later echo_skb_max is rounded up to the nearest power of two (for the max case, that would be 32) and the tx/ack indices calculated further during tx/rx may exceed the upper array boundary. Kasan reported this for the ack case inside kvaser_pciefd_handle_ack_packet(), though the xmit function has actually caught the same thing earlier. BUG: KASAN: slab-out-of-bounds in kvaser_pciefd_handle_ack_packet+0x2d7/0x92a drivers/net/can/kvaser_pciefd.c:1528 Read of size 8 at addr ffff888105e4f078 by task swapper/4/0 CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Not tainted 6.15.0 torvalds#12 PREEMPT(voluntary) Call Trace: <IRQ> dump_stack_lvl lib/dump_stack.c:122 print_report mm/kasan/report.c:521 kasan_report mm/kasan/report.c:634 kvaser_pciefd_handle_ack_packet drivers/net/can/kvaser_pciefd.c:1528 kvaser_pciefd_read_packet drivers/net/can/kvaser_pciefd.c:1605 kvaser_pciefd_read_buffer drivers/net/can/kvaser_pciefd.c:1656 kvaser_pciefd_receive_irq drivers/net/can/kvaser_pciefd.c:1684 kvaser_pciefd_irq_handler drivers/net/can/kvaser_pciefd.c:1733 __handle_irq_event_percpu kernel/irq/handle.c:158 handle_irq_event kernel/irq/handle.c:210 handle_edge_irq kernel/irq/chip.c:833 __common_interrupt arch/x86/kernel/irq.c:296 common_interrupt arch/x86/kernel/irq.c:286 </IRQ> Tx max count definitely matters for kvaser_pciefd_tx_avail(), but for seq numbers' generation that's not the case - we're free to calculate them as would be more convenient, not taking tx max count into account. The only downside is that the size of echo_skb[] should correspond to the max seq number (not tx max count), so in some situations a bit more memory would be consumed than could be. Thus make the size of the underlying echo_skb[] sufficient for the rounded max tx value. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 8256e0c ("can: kvaser_pciefd: Fix echo_skb race") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Link: https://patch.msgid.link/20250528192713.63894-1-pchelkin@ispras.ru Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eedf3e3 ] ACPICA commit 1c28da2242783579d59767617121035dafba18c3 This was originally done in NetBSD: NetBSD/src@b69d1ac and is the correct alternative to the smattering of `memcpy`s I previously contributed to this repository. This also sidesteps the newly strict checks added in UBSAN: llvm/llvm-project@7926744 Before this change we see the following UBSAN stack trace in Fuchsia: #0 0x000021afcfdeca5e in acpi_rs_get_address_common(struct acpi_resource*, union aml_resource*) ../../third_party/acpica/source/components/resources/rsaddr.c:329 <platform-bus-x86.so>+0x6aca5e #1.2 0x000021982bc4af3c in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x41f3c #1.1 0x000021982bc4af3c in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x41f3c #1 0x000021982bc4af3c in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:395 <libclang_rt.asan.so>+0x41f3c #2 0x000021982bc4bb6f in handletype_mismatch_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:137 <libclang_rt.asan.so>+0x42b6f #3 0x000021982bc4b723 in __ubsan_handle_type_mismatch_v1 compiler-rt/lib/ubsan/ubsan_handlers.cpp:142 <libclang_rt.asan.so>+0x42723 #4 0x000021afcfdeca5e in acpi_rs_get_address_common(struct acpi_resource*, union aml_resource*) ../../third_party/acpica/source/components/resources/rsaddr.c:329 <platform-bus-x86.so>+0x6aca5e #5 0x000021afcfdf2089 in acpi_rs_convert_aml_to_resource(struct acpi_resource*, union aml_resource*, struct acpi_rsconvert_info*) ../../third_party/acpica/source/components/resources/rsmisc.c:355 <platform-bus-x86.so>+0x6b2089 torvalds#6 0x000021afcfded169 in acpi_rs_convert_aml_to_resources(u8*, u32, u32, u8, void**) ../../third_party/acpica/source/components/resources/rslist.c:137 <platform-bus-x86.so>+0x6ad169 torvalds#7 0x000021afcfe2d24a in acpi_ut_walk_aml_resources(struct acpi_walk_state*, u8*, acpi_size, acpi_walk_aml_callback, void**) ../../third_party/acpica/source/components/utilities/utresrc.c:237 <platform-bus-x86.so>+0x6ed24a torvalds#8 0x000021afcfde66b7 in acpi_rs_create_resource_list(union acpi_operand_object*, struct acpi_buffer*) ../../third_party/acpica/source/components/resources/rscreate.c:199 <platform-bus-x86.so>+0x6a66b7 torvalds#9 0x000021afcfdf6979 in acpi_rs_get_method_data(acpi_handle, const char*, struct acpi_buffer*) ../../third_party/acpica/source/components/resources/rsutils.c:770 <platform-bus-x86.so>+0x6b6979 torvalds#10 0x000021afcfdf708f in acpi_walk_resources(acpi_handle, char*, acpi_walk_resource_callback, void*) ../../third_party/acpica/source/components/resources/rsxface.c:731 <platform-bus-x86.so>+0x6b708f torvalds#11 0x000021afcfa95dcf in acpi::acpi_impl::walk_resources(acpi::acpi_impl*, acpi_handle, const char*, acpi::Acpi::resources_callable) ../../src/devices/board/lib/acpi/acpi-impl.cc:41 <platform-bus-x86.so>+0x355dcf torvalds#12 0x000021afcfaa8278 in acpi::device_builder::gather_resources(acpi::device_builder*, acpi::Acpi*, fidl::any_arena&, acpi::Manager*, acpi::device_builder::gather_resources_callback) ../../src/devices/board/lib/acpi/device-builder.cc:84 <platform-bus-x86.so>+0x368278 torvalds#13 0x000021afcfbddb87 in acpi::Manager::configure_discovered_devices(acpi::Manager*) ../../src/devices/board/lib/acpi/manager.cc:75 <platform-bus-x86.so>+0x49db87 torvalds#14 0x000021afcf99091d in publish_acpi_devices(acpi::Manager*, zx_device_t*, zx_device_t*) ../../src/devices/board/drivers/x86/acpi-nswalk.cc:95 <platform-bus-x86.so>+0x25091d torvalds#15 0x000021afcf9c1d4e in x86::X86::do_init(x86::X86*) ../../src/devices/board/drivers/x86/x86.cc:60 <platform-bus-x86.so>+0x281d4e torvalds#16 0x000021afcf9e33ad in λ(x86::X86::ddk_init::(anon class)*) ../../src/devices/board/drivers/x86/x86.cc:77 <platform-bus-x86.so>+0x2a33ad torvalds#17 0x000021afcf9e313e in fit::internal::target<(lambda at../../src/devices/board/drivers/x86/x86.cc:76:19), false, false, std::__2::allocator<std::byte>, void>::invoke(void*) ../../sdk/lib/fit/include/lib/fit/internal/function.h:183 <platform-bus-x86.so>+0x2a313e torvalds#18 0x000021afcfbab4c7 in fit::internal::function_base<16UL, false, void(), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<16UL, false, void (), std::__2::allocator<std::byte> >*) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <platform-bus-x86.so>+0x46b4c7 torvalds#19 0x000021afcfbab342 in fit::function_impl<16UL, false, void(), std::__2::allocator<std::byte>>::operator()(const fit::function_impl<16UL, false, void (), std::__2::allocator<std::byte> >*) ../../sdk/lib/fit/include/lib/fit/function.h:315 <platform-bus-x86.so>+0x46b342 torvalds#20 0x000021afcfcd98c3 in async::internal::retained_task::Handler(async_dispatcher_t*, async_task_t*, zx_status_t) ../../sdk/lib/async/task.cc:24 <platform-bus-x86.so>+0x5998c3 torvalds#21 0x00002290f9924616 in λ(const driver_runtime::Dispatcher::post_task::(anon class)*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, zx_status_t) ../../src/devices/bin/driver_runtime/dispatcher.cc:789 <libdriver_runtime.so>+0x10a616 torvalds#22 0x00002290f9924323 in fit::internal::target<(lambda at../../src/devices/bin/driver_runtime/dispatcher.cc:788:7), true, false, std::__2::allocator<std::byte>, void, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int>::invoke(void*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/internal/function.h:128 <libdriver_runtime.so>+0x10a323 torvalds#23 0x00002290f9904b76 in fit::internal::function_base<24UL, true, void(std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<24UL, true, void (std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <libdriver_runtime.so>+0xeab76 torvalds#24 0x00002290f9904831 in fit::callback_impl<24UL, true, void(std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int), std::__2::allocator<std::byte>>::operator()(fit::callback_impl<24UL, true, void (std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/function.h:471 <libdriver_runtime.so>+0xea831 torvalds#25 0x00002290f98d5adc in driver_runtime::callback_request::Call(driver_runtime::callback_request*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, zx_status_t) ../../src/devices/bin/driver_runtime/callback_request.h:74 <libdriver_runtime.so>+0xbbadc torvalds#26 0x00002290f98e1e58 in driver_runtime::Dispatcher::dispatch_callback(driver_runtime::Dispatcher*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >) ../../src/devices/bin/driver_runtime/dispatcher.cc:1248 <libdriver_runtime.so>+0xc7e58 torvalds#27 0x00002290f98e4159 in driver_runtime::Dispatcher::dispatch_callbacks(driver_runtime::Dispatcher*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.cc:1308 <libdriver_runtime.so>+0xca159 torvalds#28 0x00002290f9918414 in λ(const driver_runtime::Dispatcher::create_with_adder::(anon class)*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.cc:353 <libdriver_runtime.so>+0xfe414 torvalds#29 0x00002290f991812d in fit::internal::target<(lambda at../../src/devices/bin/driver_runtime/dispatcher.cc:351:7), true, false, std::__2::allocator<std::byte>, void, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>>::invoke(void*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/internal/function.h:128 <libdriver_runtime.so>+0xfe12d torvalds#30 0x00002290f9906fc7 in fit::internal::function_base<8UL, true, void(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<8UL, true, void (std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <libdriver_runtime.so>+0xecfc7 torvalds#31 0x00002290f9906c66 in fit::function_impl<8UL, true, void(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte>>::operator()(const fit::function_impl<8UL, true, void (std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/function.h:315 <libdriver_runtime.so>+0xecc66 torvalds#32 0x00002290f98e73d9 in driver_runtime::Dispatcher::event_waiter::invoke_callback(driver_runtime::Dispatcher::event_waiter*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.h:543 <libdriver_runtime.so>+0xcd3d9 torvalds#33 0x00002290f98e700d in driver_runtime::Dispatcher::event_waiter::handle_event(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, async_dispatcher_t*, async::wait_base*, zx_status_t, zx_packet_signal_t const*) ../../src/devices/bin/driver_runtime/dispatcher.cc:1442 <libdriver_runtime.so>+0xcd00d torvalds#34 0x00002290f9918983 in async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>::handle_event(async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>*, async_dispatcher_t*, async::wait_base*, zx_status_t, zx_packet_signal_t const*) ../../src/devices/bin/driver_runtime/async_loop_owned_event_handler.h:59 <libdriver_runtime.so>+0xfe983 torvalds#35 0x00002290f9918b9e in async::wait_method<async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>, &async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>::handle_event>::call_handler(async_dispatcher_t*, async_wait_t*, zx_status_t, zx_packet_signal_t const*) ../../sdk/lib/async/include/lib/async/cpp/wait.h:201 <libdriver_runtime.so>+0xfeb9e torvalds#36 0x00002290f99bf509 in async_loop_dispatch_wait(async_loop_t*, async_wait_t*, zx_status_t, zx_packet_signal_t const*) ../../sdk/lib/async-loop/loop.c:394 <libdriver_runtime.so>+0x1a5509 torvalds#37 0x00002290f99b9958 in async_loop_run_once(async_loop_t*, zx_time_t) ../../sdk/lib/async-loop/loop.c:343 <libdriver_runtime.so>+0x19f958 torvalds#38 0x00002290f99b9247 in async_loop_run(async_loop_t*, zx_time_t, _Bool) ../../sdk/lib/async-loop/loop.c:301 <libdriver_runtime.so>+0x19f247 torvalds#39 0x00002290f99ba962 in async_loop_run_thread(void*) ../../sdk/lib/async-loop/loop.c:860 <libdriver_runtime.so>+0x1a0962 torvalds#40 0x000041afd176ef30 in start_c11(void*) ../../zircon/third_party/ulib/musl/pthread/pthread_create.c:63 <libc.so>+0x84f30 torvalds#41 0x000041afd18a448d in thread_trampoline(uintptr_t, uintptr_t) ../../zircon/system/ulib/runtime/thread.cc:100 <libc.so>+0x1ba48d Link: acpica/acpica@1c28da22 Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/4664267.LvFx2qVVIh@rjwysocki.net Signed-off-by: Tamir Duberstein <tamird@gmail.com> [ rjw: Pick up the tag from Tamir ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 38224c4 upstream. In probe appletb_kbd_probe() a "struct appletb_kbd *kbd" is allocated via devm_kzalloc() to store touch bar keyboard related data. Later on if backlight_device_get_by_name() finds a backlight device with name "appletb_backlight" a timer (kbd->inactivity_timer) is setup with appletb_inactivity_timer() and the timer is armed to run after appletb_tb_dim_timeout (60) seconds. A use-after-free is triggered when failure occurs after the timer is armed. This ultimately means probe failure occurs and as a result the "struct appletb_kbd *kbd" which is device managed memory is freed. After 60 seconds the timer will have expired and __run_timers will attempt to access the timer (kbd->inactivity_timer) however the kdb structure has been freed causing a use-after free. [ 71.636938] ================================================================== [ 71.637915] BUG: KASAN: slab-use-after-free in __run_timers+0x7ad/0x890 [ 71.637915] Write of size 8 at addr ffff8881178c5958 by task swapper/1/0 [ 71.637915] [ 71.637915] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.16.0-rc2-00318-g739a6c93cc75-dirty torvalds#12 PREEMPT(voluntary) [ 71.637915] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 71.637915] Call Trace: [ 71.637915] <IRQ> [ 71.637915] dump_stack_lvl+0x53/0x70 [ 71.637915] print_report+0xce/0x670 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] kasan_report+0xce/0x100 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] __run_timers+0x7ad/0x890 [ 71.637915] ? __pfx___run_timers+0x10/0x10 [ 71.637915] ? update_process_times+0xfc/0x190 [ 71.637915] ? __pfx_update_process_times+0x10/0x10 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 71.637915] run_timer_softirq+0x141/0x240 [ 71.637915] ? __pfx_run_timer_softirq+0x10/0x10 [ 71.637915] ? __pfx___hrtimer_run_queues+0x10/0x10 [ 71.637915] ? kvm_clock_get_cycles+0x18/0x30 [ 71.637915] ? ktime_get+0x60/0x140 [ 71.637915] handle_softirqs+0x1b8/0x5c0 [ 71.637915] ? __pfx_handle_softirqs+0x10/0x10 [ 71.637915] irq_exit_rcu+0xaf/0xe0 [ 71.637915] sysvec_apic_timer_interrupt+0x6c/0x80 [ 71.637915] </IRQ> [ 71.637915] [ 71.637915] Allocated by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] __kasan_kmalloc+0x8f/0xa0 [ 71.637915] __kmalloc_node_track_caller_noprof+0x195/0x420 [ 71.637915] devm_kmalloc+0x74/0x1e0 [ 71.637915] appletb_kbd_probe+0x37/0x3c0 [ 71.637915] hid_device_probe+0x2d1/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] [ 71.637915] [ 71.637915] Freed by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] kasan_save_free_info+0x3b/0x60 [ 71.637915] __kasan_slab_free+0x37/0x50 [ 71.637915] kfree+0xcf/0x360 [ 71.637915] devres_release_group+0x1f8/0x3c0 [ 71.637915] hid_device_probe+0x315/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] The root cause of the issue is that the timer is not disarmed on failure paths leading to it remaining active and accessing freed memory. To fix this call timer_delete_sync() to deactivate the timer. Another small issue is that timer_delete_sync is called unconditionally in appletb_kbd_remove(), fix this by checking for a valid kbd->backlight_dev before calling timer_delete_sync. Fixes: 93a0fc4 ("HID: hid-appletb-kbd: add support for automatic brightness control while using the touchbar") Cc: stable@vger.kernel.org Signed-off-by: Qasim Ijaz <qasdev00@gmail.com> Reviewed-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 38224c4 upstream. In probe appletb_kbd_probe() a "struct appletb_kbd *kbd" is allocated via devm_kzalloc() to store touch bar keyboard related data. Later on if backlight_device_get_by_name() finds a backlight device with name "appletb_backlight" a timer (kbd->inactivity_timer) is setup with appletb_inactivity_timer() and the timer is armed to run after appletb_tb_dim_timeout (60) seconds. A use-after-free is triggered when failure occurs after the timer is armed. This ultimately means probe failure occurs and as a result the "struct appletb_kbd *kbd" which is device managed memory is freed. After 60 seconds the timer will have expired and __run_timers will attempt to access the timer (kbd->inactivity_timer) however the kdb structure has been freed causing a use-after free. [ 71.636938] ================================================================== [ 71.637915] BUG: KASAN: slab-use-after-free in __run_timers+0x7ad/0x890 [ 71.637915] Write of size 8 at addr ffff8881178c5958 by task swapper/1/0 [ 71.637915] [ 71.637915] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.16.0-rc2-00318-g739a6c93cc75-dirty torvalds#12 PREEMPT(voluntary) [ 71.637915] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 71.637915] Call Trace: [ 71.637915] <IRQ> [ 71.637915] dump_stack_lvl+0x53/0x70 [ 71.637915] print_report+0xce/0x670 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] kasan_report+0xce/0x100 [ 71.637915] ? __run_timers+0x7ad/0x890 [ 71.637915] __run_timers+0x7ad/0x890 [ 71.637915] ? __pfx___run_timers+0x10/0x10 [ 71.637915] ? update_process_times+0xfc/0x190 [ 71.637915] ? __pfx_update_process_times+0x10/0x10 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? _raw_spin_lock_irq+0x80/0xe0 [ 71.637915] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 71.637915] run_timer_softirq+0x141/0x240 [ 71.637915] ? __pfx_run_timer_softirq+0x10/0x10 [ 71.637915] ? __pfx___hrtimer_run_queues+0x10/0x10 [ 71.637915] ? kvm_clock_get_cycles+0x18/0x30 [ 71.637915] ? ktime_get+0x60/0x140 [ 71.637915] handle_softirqs+0x1b8/0x5c0 [ 71.637915] ? __pfx_handle_softirqs+0x10/0x10 [ 71.637915] irq_exit_rcu+0xaf/0xe0 [ 71.637915] sysvec_apic_timer_interrupt+0x6c/0x80 [ 71.637915] </IRQ> [ 71.637915] [ 71.637915] Allocated by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] __kasan_kmalloc+0x8f/0xa0 [ 71.637915] __kmalloc_node_track_caller_noprof+0x195/0x420 [ 71.637915] devm_kmalloc+0x74/0x1e0 [ 71.637915] appletb_kbd_probe+0x37/0x3c0 [ 71.637915] hid_device_probe+0x2d1/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] [ 71.637915] [ 71.637915] Freed by task 39: [ 71.637915] kasan_save_stack+0x33/0x60 [ 71.637915] kasan_save_track+0x14/0x30 [ 71.637915] kasan_save_free_info+0x3b/0x60 [ 71.637915] __kasan_slab_free+0x37/0x50 [ 71.637915] kfree+0xcf/0x360 [ 71.637915] devres_release_group+0x1f8/0x3c0 [ 71.637915] hid_device_probe+0x315/0x680 [ 71.637915] really_probe+0x1c3/0x690 [ 71.637915] __driver_probe_device+0x247/0x300 [ 71.637915] driver_probe_device+0x49/0x210 [...] The root cause of the issue is that the timer is not disarmed on failure paths leading to it remaining active and accessing freed memory. To fix this call timer_delete_sync() to deactivate the timer. Another small issue is that timer_delete_sync is called unconditionally in appletb_kbd_remove(), fix this by checking for a valid kbd->backlight_dev before calling timer_delete_sync. Fixes: 93a0fc4 ("HID: hid-appletb-kbd: add support for automatic brightness control while using the touchbar") Cc: stable@vger.kernel.org Signed-off-by: Qasim Ijaz <qasdev00@gmail.com> Reviewed-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
BugLink: https://bugs.launchpad.net/bugs/2111953 [ Upstream commit c7b87ce ] libtraceevent parses and returns an array of argument fields, sometimes larger than RAW_SYSCALL_ARGS_NUM (6) because it includes "__syscall_nr", idx will traverse to index 6 (7th element) whereas sc->fmt->arg holds 6 elements max, creating an out-of-bounds access. This runtime error is found by UBsan. The error message: $ sudo UBSAN_OPTIONS=print_stacktrace=1 ./perf trace -a --max-events=1 builtin-trace.c:1966:35: runtime error: index 6 out of bounds for type 'syscall_arg_fmt [6]' #0 0x5c04956be5fe in syscall__alloc_arg_fmts /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:1966 #1 0x5c04956c0510 in trace__read_syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2110 #2 0x5c04956c372b in trace__syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2436 #3 0x5c04956d2f39 in trace__init_syscalls_bpf_prog_array_maps /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:3897 #4 0x5c04956d6d25 in trace__run /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:4335 #5 0x5c04956e112e in cmd_trace /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:5502 torvalds#6 0x5c04956eda7d in run_builtin /home/howard/hw/linux-perf/tools/perf/perf.c:351 torvalds#7 0x5c04956ee0a8 in handle_internal_command /home/howard/hw/linux-perf/tools/perf/perf.c:404 torvalds#8 0x5c04956ee37f in run_argv /home/howard/hw/linux-perf/tools/perf/perf.c:448 torvalds#9 0x5c04956ee8e9 in main /home/howard/hw/linux-perf/tools/perf/perf.c:556 torvalds#10 0x79eb3622a3b7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#11 0x79eb3622a47a in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#12 0x5c04955422d4 in _start (/home/howard/hw/linux-perf/tools/perf/perf+0x4e02d4) (BuildId: 5b6cab2d59e96a4341741765ad6914a4d784dbc6) 0.000 ( 0.014 ms): Chrome_ChildIO/117244 write(fd: 238, buf: !, count: 1) = 1 Fixes: 5e58fcf ("perf trace: Allow allocating sc->arg_fmt even without the syscall tracepoint") Signed-off-by: Howard Chu <howardchu95@gmail.com> Link: https://lore.kernel.org/r/20250122025519.361873-1-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Manuel Diewald <manuel.diewald@canonical.com> Signed-off-by: Mehmet Basaran <mehmet.basaran@canonical.com>
Hi,
Here is what I did to the
CREDITS
file:Sorry for the microscopic commit.
Sincerely,
Jan.