Boot failure with ppc64_guest_defconfig after LLVM commit 7763119c6eb0976e4836f81c9876c49a36d46d73 #2072

Closed
@nathanchance

I am seeing a boot failure with ppc64_guest_defconfig after llvm/llvm-project@7763119 (which also causes #2070), but the change that resolves that issue does not fix this one.

$ make -skj"$(nproc)" ARCH=powerpc LLVM=1 mrproper ppc64_guest_defconfig vmlinux

$ qemu-system-ppc64 \
	-display none \
	-nodefaults \
	-cpu power8 \
	-machine pseries \
	-vga none \
	-kernel vmlinux \
	-initrd rootfs.cpio \
	-m 1G \
	-serial mon:stdio
...
[    0.000000][    T0] Linux version 6.14.0-rc3-00012-g2408a807bfc3 (nathan@ax162) (ClangBuiltLinux clang version 21.0.0git (https://github.com/llvm/llvm-project.git 7763119c6eb0976e4836f81c9876c49a36d46d73), ClangBuiltLinux LLD 21.0.0 (https://github.com/llvm/llvm-project.git 7763119c6eb0976e4836f81c9876c49a36d46d73)) #1 SMP Tue Feb 18 12:35:22 MST 2025
...
[    0.000000][    T0] Kernel command line:
[    0.000000][    T0] printk: log buffer data + meta data: 262144 + 917504 = 1179648 bytes
[    0.000000][    T0] Dentry cache hash table entries: 131072 (order: 4, 1048576 bytes, linear)
[    0.000000][    T0] Inode-cache hash table entries: 65536 (order: 3, 524288 bytes, linear)
[    0.000000][    T0] Fallback order for Node 0: 0
[    0.000000][    T0] Built 1 zonelists, mobility grouping off.  Total pages: 0
[    0.000000][    T0] Policy zone: Normal
[    0.000000][    T0] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
[    0.000000][    T0] SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000][    T0] ftrace: allocating 47180 entries in 12 pages
[    0.000000][    T0] ftrace: allocated 12 pages with 2 groups
[    0.000000][    T0] rcu: Hierarchical RCU implementation.
[    0.000000][    T0] rcu:     RCU event tracing is enabled.
[    0.000000][    T0] rcu:     RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=1.
[    0.000000][    T0]  Rude variant of Tasks RCU enabled.
[    0.000000][    T0]  Tracing variant of Tasks RCU enabled.
[    0.000000][    T0] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000][    T0] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000][    T0] RCU Tasks Rude: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1 rcu_task_cpu_ids=1.
[    0.000000][    T0] RCU Tasks Trace: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1 rcu_task_cpu_ids=1.
[    0.000000][    T0] NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16

At the parent commit (llvm/llvm-project@f6e3d33), the kernel boots without issue.

[    0.000000][    T0] Linux version 6.14.0-rc3-00012-g2408a807bfc3 (nathan@ax162) (ClangBuiltLinux clang version 21.0.0git (https://github.com/llvm/llvm-project.git f6e3d33c009cada0437c11d3fd1beace74c5dcfa), ClangBuiltLinux LLD 21.0.0 (https://github.com/llvm/llvm-project.git f6e3d33c009cada0437c11d3fd1beace74c5dcfa)) #1 SMP Tue Feb 18 12:33:45 MST 2025
...
[    0.000000][    T0] Kernel command line:
[    0.000000][    T0] printk: log buffer data + meta data: 262144 + 917504 = 1179648 bytes
[    0.000000][    T0] Dentry cache hash table entries: 131072 (order: 4, 1048576 bytes, linear)
[    0.000000][    T0] Inode-cache hash table entries: 65536 (order: 3, 524288 bytes, linear)
[    0.000000][    T0] Fallback order for Node 0: 0
[    0.000000][    T0] Built 1 zonelists, mobility grouping on.  Total pages: 16384
[    0.000000][    T0] Policy zone: Normal
[    0.000000][    T0] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
[    0.000000][    T0] SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000][    T0] ftrace: allocating 47180 entries in 12 pages
[    0.000000][    T0] ftrace: allocated 12 pages with 2 groups
[    0.000000][    T0] rcu: Hierarchical RCU implementation.
[    0.000000][    T0] rcu:     RCU event tracing is enabled.
[    0.000000][    T0] rcu:     RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=1.
[    0.000000][    T0]  Rude variant of Tasks RCU enabled.
[    0.000000][    T0]  Tracing variant of Tasks RCU enabled.
[    0.000000][    T0] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000][    T0] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000][    T0] RCU Tasks Rude: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1 rcu_task_cpu_ids=1.
[    0.000000][    T0] RCU Tasks Trace: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1 rcu_task_cpu_ids=1.
[    0.000000][    T0] NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
[    0.000000][    T0] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000233][    T0] clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
[    0.000899][    T0] clocksource: timebase mult[1f40000] shift[24] registered
[    0.006915][    T0] Console: colour dummy device 80x25
[    0.007805][    T0] printk: legacy console [hvc0] enabled
[    0.007805][    T0] printk: legacy console [hvc0] enabled
[    0.008247][    T0] printk: legacy bootconsole [udbg0] disabled
[    0.008247][    T0] printk: legacy bootconsole [udbg0] disabled
...

Based on a little bit of gdb debugging, the kernel appears to enter lib/maple_tree.c via the IRQ subsystem and never return. Linking lib/maple_tree.o from a tree built with the good compiler into a tree built with the bad compiler lets the boot hobble along further, but it still never reaches userspace, so that is probably not the only translation unit with a problem.

Diffing the disassembly of lib/maple_tree.o between the good and bad revisions, I see code generation changes in three functions (everything else seems to be fallout, such as different addresses caused by the larger code size from these changes):

mab_mas_cp()

@@ -66,15 +66,16 @@ e9 08 ff f8     ld 8, -8(8)
 78 a7 06 20    clrldi  7, 5, 56
 7c 85 23 78    mr  5, 4
 7c 04 40 40    cmplw   4, 8
-41 81 00 08    bt  1, 0xe770 <mab_mas_cp+0x100>
+41 81 00 08    bt  1, 0xe790 <mab_mas_cp+0x100>
 7d 05 43 78    mr  5, 8
-7d 44 38 50    sub 10, 7, 4
-78 88 1f 48    rldic 8, 4, 3, 29
 3b 20 00 00    li 25, 0
+7c e4 38 10    subc    7, 7, 4
+78 88 1f 48    rldic 8, 4, 3, 29
+7d 39 01 94    addze 9, 25
+2c 09 ff ff    cmpwi   9, -1
 39 20 00 00    li 9, 0
-7c 2a 38 40    cmpld   10, 7
-41 81 00 08    bt  1, 0xe78c <mab_mas_cp+0x11c>
-7d 49 53 78    mr  9, 10
+40 82 00 08    bf  2, 0xe7b0 <mab_mas_cp+0x120>
+7c e9 3b 78    mr  9, 7
 7f c7 f3 78    mr  7, 30
 3a c5 00 01    addi 22, 5, 1
 38 a4 ff ff    addi 5, 4, -1
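
One rough way to sanity-check the mab_mas_cp() hunk is to model only the instructions that changed and compare the value left in r9. The Python sketch below does that under several assumptions that should be verified against the Power ISA: that `subc` sets CA to 1 when there is no borrow, that `addze 9, 25` computes r25 + CA, that `cmpwi 9, -1` sets EQ only when the low 32 bits of r9 equal -1, and that `bt 1` / `bf 2` test the GT and EQ bits of CR0. The function names are made up for illustration.

```python
MASK64 = (1 << 64) - 1


def old_sequence(r7: int, r4: int) -> int:
    """Model of the good compiler's code: sub / li / cmpld / bt 1 / mr."""
    r10 = (r7 - r4) & MASK64          # sub 10, 7, 4
    r9 = 0                            # li 9, 0
    if not (r10 > r7):                # cmpld 10, 7 ; bt 1 skips the mr when GT is set
        r9 = r10                      # mr 9, 10
    return r9


def new_sequence(r7: int, r4: int) -> int:
    """Model of the bad compiler's code: subc / addze / cmpwi / li / bf 2 / mr."""
    r25 = 0                           # li 25, 0
    ca = 1 if r7 >= r4 else 0         # subc: CA = 1 when the subtraction does not borrow
    r7 = (r7 - r4) & MASK64           # subc 7, 7, 4
    r9 = (r25 + ca) & MASK64          # addze 9, 25
    eq = (r9 & 0xFFFFFFFF) == 0xFFFFFFFF  # cmpwi 9, -1 (32-bit compare)
    r9 = 0                            # li 9, 0 (does not touch CR0)
    if eq:                            # bf 2 skips the mr when EQ is clear
        r9 = r7                       # mr 9, 7
    return r9


if __name__ == "__main__":
    for r7, r4 in [(5, 3), (3, 5), (0, 0), (255, 1)]:
        print(r7, r4, old_sequence(r7, r4), new_sequence(r7, r4))
```

If that reading of the mnemonics is right, the old sequence is a saturating unsigned subtraction, while in the new one r9 can only be 0 or 1 at the `cmpwi`, so the `mr` is never reached and the result is always 0. That would make this hunk a plausible miscompile candidate, but it is only a model and not a confirmed diagnosis.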

mas_alloc_cyclic()

@@ -82,26 +82,28 @@ f8 7e 00 18     std 3, 24(30)
 e8 7e 00 18    ld 3, 24(30)
 78 64 07 a0    clrldi  4, 3, 62
 28 24 00 02    cmpldi  4, 2
-41 82 00 40    bt  2, 0xbf8 <mas_alloc_cyclic+0x178>
+41 82 00 48    bt  2, 0xc00 <mas_alloc_cyclic+0x180>
 3b 80 00 00    li 28, 0
-48 00 00 3c    b 0xbfc <mas_alloc_cyclic+0x17c>
+48 00 00 44    b 0xc04 <mas_alloc_cyclic+0x184>
 e8 7e 00 08    ld 3, 8(30)
+38 80 00 00    li 4, 0
 f8 7b 00 00    std 3, 0(27)
-38 63 00 01    addi 3, 3, 1
-28 23 00 00    cmpldi  3, 0
+30 63 00 01    addic 3, 3, 1
+7c 84 01 94    addze 4, 4
 f8 7d 00 00    std 3, 0(29)
-40 82 00 14    bf  2, 0xbec <mas_alloc_cyclic+0x16c>
+28 04 00 01    cmplwi  4, 1
+40 82 00 14    bf  2, 0xbf4 <mas_alloc_cyclic+0x174>
 e8 7e 00 00    ld 3, 0(30)
 80 83 00 04    lwz 4, 4(3)
 60 84 08 00    ori 4, 4, 2048
 90 83 00 04    stw 4, 4(3)
 7f c3 f3 78    mr  3, 30
-48 00 00 01    bl 0xbf0 <mas_alloc_cyclic+0x170>
-48 00 00 18    b 0xc0c <mas_alloc_cyclic+0x18c>
+48 00 00 01    bl 0xbf8 <mas_alloc_cyclic+0x178>
+48 00 00 18    b 0xc14 <mas_alloc_cyclic+0x194>
 78 7c f0 82    rldicl 28, 3, 62, 2
 38 80 c0 05    li 4, -16379
 7c 23 20 40    cmpld   3, 4
-41 81 00 08    bt  1, 0xc0c <mas_alloc_cyclic+0x18c>
+41 81 00 08    bt  1, 0xc14 <mas_alloc_cyclic+0x194>
 3b 80 00 00    li 28, 0
 7f 83 07 b4    extsw 3, 28
 38 21 00 70    addi 1, 1, 112

mas_wr_spanning_store()

@@ -75,10 +75,12 @@ e8 7e 00 00     ld 3, 0(30)
 48 00 00 01    bl 0xb080 <mas_wr_spanning_store+0x110>
 60 00 00 00    nop
 e8 61 00 80    ld 3, 128(1)
-38 83 00 01    addi 4, 3, 1
+30 83 00 01    addic 4, 3, 1
+38 60 00 00    li 3, 0
+7c a3 01 94    addze 5, 3
 38 60 ff ff    li 3, -1
-28 24 00 00    cmpldi  4, 0
-41 82 00 0c    bt  2, 0xb0a4 <mas_wr_spanning_store+0x134>
+28 05 00 00    cmplwi  5, 0
+40 82 00 0c    bf  2, 0xb0ac <mas_wr_spanning_store+0x13c>
 7c 83 23 78    mr  3, 4
 f8 81 00 80    std 4, 128(1)
 38 80 ff ff    li 4, -1
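
The mas_alloc_cyclic() and mas_wr_spanning_store() hunks both replace an "add 1, then compare the result with 0" check with an "add 1 with carry, then test the carry" check. The Python sketch below models the two forms to see whether they agree. It assumes `addic` sets CA to the carry out of the 64-bit addition and that `addze` applied to a zeroed register yields just CA; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def old_wraps(x: int) -> bool:
    """addi r, x, 1 ; cmpldi r, 0 -> EQ is set when x + 1 wrapped to 0."""
    r = (x + 1) & MASK64
    return r == 0


def new_wraps(x: int) -> bool:
    """addic r, x, 1 ; li z, 0 ; addze c, z -> c holds the carry out of x + 1."""
    ca = (x + 1) >> 64                # addic: carry out of the 64-bit add
    c = 0 + ca                        # addze on a zeroed register
    # mas_alloc_cyclic() tests c == 1 (cmplwi 4, 1);
    # mas_wr_spanning_store() tests c != 0 (cmplwi 5, 0 with the branch sense flipped).
    # Both are the same condition because c is only ever 0 or 1.
    return c == 1


if __name__ == "__main__":
    for x in (0, 1, MASK64 - 1, MASK64):
        print(hex(x), old_wraps(x), new_wraps(x))
```

Under those assumptions the two forms agree for every input, since x + 1 carries out of 64 bits exactly when the truncated result is 0. So these two hunks look like equivalent rewrites of the same overflow check.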

I am not really familiar with PowerPC assembly, so I am not sure whether these are expected transformations. I am also not sure how to go about getting a smaller reproducer at this point.

Labels

[ARCH] powerpc: This bug impacts ARCH=powerpc
[BUG] llvm (main): A bug in an unreleased version of LLVM (this label is appropriate for regressions)
[FIXED][LLVM] main: This bug was only present and fixed in an unreleased version of LLVM
boot failure: This issue results in a failure to boot