Skip to content

Conversation

@RevySR
Copy link

@RevySR RevySR commented Jul 31, 2025

No description provided.

KexyBiscuit pushed a commit that referenced this pull request Jul 31, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
KexyBiscuit pushed a commit that referenced this pull request Jul 31, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
KexyBiscuit pushed a commit that referenced this pull request Jul 31, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
KexyBiscuit pushed a commit that referenced this pull request Aug 1, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
@KexyBiscuit KexyBiscuit force-pushed the aosc/v6.16 branch 2 times, most recently from 76821ec to e7e5f1f Compare August 3, 2025 21:24
KexyBiscuit pushed a commit that referenced this pull request Aug 3, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
@KexyBiscuit KexyBiscuit force-pushed the aosc/v6.16 branch 2 times, most recently from e7333bd to 772bca6 Compare August 6, 2025 11:22
KexyBiscuit pushed a commit that referenced this pull request Aug 6, 2025
As syzbot [1] reported as below:

R10: 0000000000000100 R11: 0000000000000206 R12: 00007ffe17473450
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>
---[ end trace 0000000000000000 ]---
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
Read of size 8 at addr ffff88812d962278 by task syz-executor/564

CPU: 1 PID: 564 Comm: syz-executor Tainted: G        W          6.1.129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
 __dump_stack+0x21/0x24 lib/dump_stack.c:88
 dump_stack_lvl+0xee/0x158 lib/dump_stack.c:106
 print_address_description+0x71/0x210 mm/kasan/report.c:316
 print_report+0x4a/0x60 mm/kasan/report.c:427
 kasan_report+0x122/0x150 mm/kasan/report.c:531
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351
 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
 __list_del_entry include/linux/list.h:134 [inline]
 list_del_init include/linux/list.h:206 [inline]
 f2fs_inode_synced+0xf7/0x2e0 fs/f2fs/super.c:1531
 f2fs_update_inode+0x74/0x1c40 fs/f2fs/inode.c:585
 f2fs_update_inode_page+0x137/0x170 fs/f2fs/inode.c:703
 f2fs_write_inode+0x4ec/0x770 fs/f2fs/inode.c:731
 write_inode fs/fs-writeback.c:1460 [inline]
 __writeback_single_inode+0x4a0/0xab0 fs/fs-writeback.c:1677
 writeback_single_inode+0x221/0x8b0 fs/fs-writeback.c:1733
 sync_inode_metadata+0xb6/0x110 fs/fs-writeback.c:2789
 f2fs_sync_inode_meta+0x16d/0x2a0 fs/f2fs/checkpoint.c:1159
 block_operations fs/f2fs/checkpoint.c:1269 [inline]
 f2fs_write_checkpoint+0xca3/0x2100 fs/f2fs/checkpoint.c:1658
 kill_f2fs_super+0x231/0x390 fs/f2fs/super.c:4668
 deactivate_locked_super+0x98/0x100 fs/super.c:332
 deactivate_super+0xaf/0xe0 fs/super.c:363
 cleanup_mnt+0x45f/0x4e0 fs/namespace.c:1186
 __cleanup_mnt+0x19/0x20 fs/namespace.c:1193
 task_work_run+0x1c6/0x230 kernel/task_work.c:203
 exit_task_work include/linux/task_work.h:39 [inline]
 do_exit+0x9fb/0x2410 kernel/exit.c:871
 do_group_exit+0x210/0x2d0 kernel/exit.c:1021
 __do_sys_exit_group kernel/exit.c:1032 [inline]
 __se_sys_exit_group kernel/exit.c:1030 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030
 x64_sys_call+0x7b4/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2
RIP: 0033:0x7f28b1b8e169
Code: Unable to access opcode bytes at 0x7f28b1b8e13f.
RSP: 002b:00007ffe174710a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f28b1c10879 RCX: 00007f28b1b8e169
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000002 R08: 00007ffe1746ee47 R09: 00007ffe17472360
R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffe17472360
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>

Allocated by task 569:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_alloc_info+0x25/0x30 mm/kasan/generic.c:505
 __kasan_slab_alloc+0x72/0x80 mm/kasan/common.c:328
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook+0x4f/0x2c0 mm/slab.h:737
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x104/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_lookup+0x366/0xab0 fs/f2fs/namei.c:487
 __lookup_slow+0x2a3/0x3d0 fs/namei.c:1690
 lookup_slow+0x57/0x70 fs/namei.c:1707
 walk_component+0x2e6/0x410 fs/namei.c:1998
 lookup_last fs/namei.c:2455 [inline]
 path_lookupat+0x180/0x490 fs/namei.c:2479
 filename_lookup+0x1f0/0x500 fs/namei.c:2508
 vfs_statx+0x10b/0x660 fs/stat.c:229
 vfs_fstatat fs/stat.c:267 [inline]
 vfs_lstat include/linux/fs.h:3424 [inline]
 __do_sys_newlstat fs/stat.c:423 [inline]
 __se_sys_newlstat+0xd5/0x350 fs/stat.c:417
 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417
 x64_sys_call+0x393/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

Freed by task 13:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x31/0x50 mm/kasan/generic.c:516
 ____kasan_slab_free+0x132/0x180 mm/kasan/common.c:236
 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:244
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0xc2/0x190 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0x12d/0x2a0 mm/slub.c:3683
 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1562
 i_callback+0x4c/0x70 fs/inode.c:250
 rcu_do_batch+0x503/0xb80 kernel/rcu/tree.c:2297
 rcu_core+0x5a2/0xe70 kernel/rcu/tree.c:2557
 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574
 handle_softirqs+0x178/0x500 kernel/softirq.c:578
 run_ksoftirqd+0x28/0x30 kernel/softirq.c:945
 smpboot_thread_fn+0x45a/0x8c0 kernel/smpboot.c:164
 kthread+0x270/0x310 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Last potentially related work creation:
 kasan_save_stack+0x3a/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb6/0xc0 mm/kasan/generic.c:486
 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496
 call_rcu+0xd4/0xf70 kernel/rcu/tree.c:2845
 destroy_inode fs/inode.c:316 [inline]
 evict+0x7da/0x870 fs/inode.c:720
 iput_final fs/inode.c:1834 [inline]
 iput+0x62b/0x830 fs/inode.c:1860
 do_unlinkat+0x356/0x540 fs/namei.c:4397
 __do_sys_unlink fs/namei.c:4438 [inline]
 __se_sys_unlink fs/namei.c:4436 [inline]
 __x64_sys_unlink+0x49/0x50 fs/namei.c:4436
 x64_sys_call+0x958/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

The buggy address belongs to the object at ffff88812d961f20
 which belongs to the cache f2fs_inode_cache of size 1200
The buggy address is located 856 bytes inside of
 1200-byte region [ffff88812d961f20, ffff88812d9623d0)

The buggy address belongs to the physical page:
page:ffffea0004b65800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d960
head:ffffea0004b65800 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 0000000000000000 dead000000000122 ffff88810a94c500
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Reclaimable, gfp_mask 0x1d2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_RECLAIMABLE), pid 569, tgid 568 (syz.2.16), ts 55943246141, free_ts 0
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook+0x1d0/0x1f0 mm/page_alloc.c:2532
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x2e63/0x2ef0 mm/page_alloc.c:4328
 __alloc_pages+0x235/0x4b0 mm/page_alloc.c:5605
 alloc_slab_page include/linux/gfp.h:-1 [inline]
 allocate_slab mm/slub.c:1939 [inline]
 new_slab+0xec/0x4b0 mm/slub.c:1992
 ___slab_alloc+0x6f6/0xb50 mm/slub.c:3180
 __slab_alloc+0x5e/0xa0 mm/slub.c:3279
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x13f/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_fill_super+0x3ad7/0x6bb0 fs/f2fs/super.c:4293
 mount_bdev+0x2ae/0x3e0 fs/super.c:1443
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4642
 legacy_get_tree+0xea/0x190 fs/fs_context.c:632
 vfs_get_tree+0x89/0x260 fs/super.c:1573
 do_new_mount+0x25a/0xa20 fs/namespace.c:3056
page_owner free stack trace missing

Memory state around the buggy address:
 ffff88812d962100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88812d962200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                ^
 ffff88812d962280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[1] https://syzkaller.appspot.com/x/report.txt?x=13448368580000

This bug can be reproduced w/ the reproducer [2], once we enable
CONFIG_F2FS_CHECK_FS config, the reproducer will trigger panic as below,
so the direct reason of this bug is the same as the one below patch [3]
fixed.

kernel BUG at fs/f2fs/inode.c:857!
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20
Call Trace:
 <TASK>
 evict+0x32a/0x7a0
 do_unlinkat+0x37b/0x5b0
 __x64_sys_unlink+0xad/0x100
 do_syscall_64+0x5a/0xb0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20

[2] https://syzkaller.appspot.com/x/repro.c?x=17495ccc580000
[3] https://lore.kernel.org/linux-f2fs-devel/20250702120321.1080759-1-chao@kernel.org

Tracepoints before panic:

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file1
f2fs_unlink_exit: dev = (7,0), ino = 7, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 7, pino = 3, i_mode = 0x81ed, i_size = 10, i_nlink = 0, i_blocks = 0, i_advise = 0x0
f2fs_truncate_node: dev = (7,0), ino = 7, nid = 8, block_address = 0x3c05

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file3
f2fs_unlink_exit: dev = (7,0), ino = 8, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 9000, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 0, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate_blocks_enter: dev = (7,0), ino = 8, i_size = 0, i_blocks = 24, start file offset = 0
f2fs_truncate_blocks_exit: dev = (7,0), ino = 8, ret = -2

The root cause is: in the fuzzed image, dnode #8 belongs to inode #7,
after inode #7 eviction, dnode #8 was dropped.

However there is dirent that has ino #8, so, once we unlink file3, in
f2fs_evict_inode(), both f2fs_truncate() and f2fs_update_inode_page()
will fail due to we can not load node #8, result in we missed to call
f2fs_inode_synced() to clear inode dirty status.

Let's fix this by calling f2fs_inode_synced() in error path of
f2fs_evict_inode().

PS: As I verified, the reproducer [2] can trigger this bug in v6.1.129,
but it failed in v6.16-rc4, this is because the testcase will stop due to
other corruption has been detected by f2fs:

F2FS-fs (loop0): inconsistent node block, node_type:2, nid:8, node_footer[nid:8,ino:8,ofs:0,cpver:5013063228981249506,blkaddr:15366]
F2FS-fs (loop0): f2fs_lookup: inode (ino=9) has zero i_nlink

Fixes: 0f18b46 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/x/report.txt?x=13448368580000
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
KexyBiscuit pushed a commit that referenced this pull request Aug 6, 2025
Patch series "extend hung task blocker tracking to rwsems".

Inspired by mutex blocker tracking[1], and having already extended it to
semaphores, let's now add support for reader-writer semaphores (rwsems).

The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
waiting for an rwsem, we just call hung_task_set_blocker().  The hung task
detector can then query the rwsem's owner to identify the lock holder.

Tracking works reliably for writers, as there can only be a single writer
holding the lock, and its task struct is stored in the owner field.

The main challenge lies with readers.  The owner field points to only one
of many concurrent readers, so we might lose track of the blocker if that
specific reader unlocks, even while others remain.  This is not a
significant issue, however.  In practice, long-lasting lock contention is
almost always caused by a writer.  Therefore, reliably tracking the writer
is the primary goal of this patch series ;)

With this change, the hung task detector can now show blocker task's info
like below:

[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 blocked for more than 122 seconds.
[Fri Jun 27 15:21:34 2025]       Tainted: G S                  6.16.0-rc3 #8
[Fri Jun 27 15:21:34 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 27 15:21:34 2025] task:cat             state:D stack:0     pid:28631 tgid:28631 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? policy_nodemask+0x215/0x340
[Fri Jun 27 15:21:34 2025]  ? _raw_spin_lock_irq+0x8a/0xe0
[Fri Jun 27 15:21:34 2025]  ? __pfx__raw_spin_lock_irq+0x10/0x10
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_preempt_disabled+0x15/0x30
[Fri Jun 27 15:21:34 2025]  rwsem_down_read_slowpath+0x55e/0xe10
[Fri Jun 27 15:21:34 2025]  ? __pfx_rwsem_down_read_slowpath+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx___might_resched+0x10/0x10
[Fri Jun 27 15:21:34 2025]  down_read+0xc9/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_down_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __debugfs_file_get+0x14d/0x700
[Fri Jun 27 15:21:34 2025]  ? __pfx___debugfs_file_get+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? handle_pte_fault+0x52a/0x710
[Fri Jun 27 15:21:34 2025]  ? selinux_file_permission+0x3a9/0x590
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_read+0x4a/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f3f8faefb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffdeda5ab98 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f3f8faefb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 00000000010fa000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 00000000010fa000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffdeda59fe0 R11: 0000000000000246 R12: 00000000010fa000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>
[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 <reader> blocked on an rw-semaphore likely owned by task cat:28630 <writer>
[Fri Jun 27 15:21:34 2025] task:cat             state:S stack:0     pid:28630 tgid:28630 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __mod_timer+0x304/0xa80
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_timeout+0xfb/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_schedule_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx_process_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? down_write+0xc4/0x140
[Fri Jun 27 15:21:34 2025]  msleep_interruptible+0xbe/0x150
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_write+0x54/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f8f288efb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffffb631038 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f8f288efb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 000000002a4b5000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 000000002a4b5000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffffb630460 R11: 0000000000000246 R12: 000000002a4b5000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>


This patch (of 3):

In preparation for extending blocker tracking to support rwsems, make the
rwsem_owner() and is_rwsem_reader_owned() helpers globally available for
determining if the blocker is a writer or one of the readers.

Additionally, a stale owner pointer in a reader-owned rwsem can lead to
false positives in blocker tracking when CONFIG_DETECT_HUNG_TASK_BLOCKER
is enabled.  To mitigate this, clear the owner field on the reader unlock
path, similar to what CONFIG_DEBUG_RWSEMS does.  A NULL owner is better
than a stale one for diagnostics.

Link: https://lkml.kernel.org/r/20250627072924.36567-1-lance.yang@linux.dev
Link: https://lkml.kernel.org/r/20250627072924.36567-2-lance.yang@linux.dev
Link: https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/ [1]
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Anna Schumaker <anna.schumaker@oracle.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: John Stultz <jstultz@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mingzhe Yang <mingzhe.yang@ly.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yongliang Gao <leonylgao@tencent.com>
Cc: Zi Li <zi.li@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
KexyBiscuit pushed a commit that referenced this pull request Aug 6, 2025
Inspired by mutex blocker tracking[1], and having already extended it to
semaphores, let's now add support for reader-writer semaphores (rwsems).

The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
waiting for an rwsem, we just call hung_task_set_blocker().  The hung task
detector can then query the rwsem's owner to identify the lock holder.

Tracking works reliably for writers, as there can only be a single writer
holding the lock, and its task struct is stored in the owner field.

The main challenge lies with readers.  The owner field points to only one
of many concurrent readers, so we might lose track of the blocker if that
specific reader unlocks, even while others remain.  This is not a
significant issue, however.  In practice, long-lasting lock contention is
almost always caused by a writer.  Therefore, reliably tracking the writer
is the primary goal of this patch series ;)

With this change, the hung task detector can now show blocker task's info
like below:

[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 blocked for more than 122 seconds.
[Fri Jun 27 15:21:34 2025]       Tainted: G S                  6.16.0-rc3 #8
[Fri Jun 27 15:21:34 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 27 15:21:34 2025] task:cat             state:D stack:0     pid:28631 tgid:28631 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? policy_nodemask+0x215/0x340
[Fri Jun 27 15:21:34 2025]  ? _raw_spin_lock_irq+0x8a/0xe0
[Fri Jun 27 15:21:34 2025]  ? __pfx__raw_spin_lock_irq+0x10/0x10
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_preempt_disabled+0x15/0x30
[Fri Jun 27 15:21:34 2025]  rwsem_down_read_slowpath+0x55e/0xe10
[Fri Jun 27 15:21:34 2025]  ? __pfx_rwsem_down_read_slowpath+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx___might_resched+0x10/0x10
[Fri Jun 27 15:21:34 2025]  down_read+0xc9/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_down_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __debugfs_file_get+0x14d/0x700
[Fri Jun 27 15:21:34 2025]  ? __pfx___debugfs_file_get+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? handle_pte_fault+0x52a/0x710
[Fri Jun 27 15:21:34 2025]  ? selinux_file_permission+0x3a9/0x590
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_read+0x4a/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f3f8faefb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffdeda5ab98 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f3f8faefb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 00000000010fa000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 00000000010fa000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffdeda59fe0 R11: 0000000000000246 R12: 00000000010fa000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>
[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 <reader> blocked on an rw-semaphore likely owned by task cat:28630 <writer>
[Fri Jun 27 15:21:34 2025] task:cat             state:S stack:0     pid:28630 tgid:28630 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __mod_timer+0x304/0xa80
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_timeout+0xfb/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_schedule_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx_process_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? down_write+0xc4/0x140
[Fri Jun 27 15:21:34 2025]  msleep_interruptible+0xbe/0x150
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_write+0x54/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f8f288efb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffffb631038 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f8f288efb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 000000002a4b5000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 000000002a4b5000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffffb630460 R11: 0000000000000246 R12: 000000002a4b5000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>

[1] https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/

Link: https://lkml.kernel.org/r/20250627072924.36567-3-lance.yang@linux.dev
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Anna Schumaker <anna.schumaker@oracle.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: John Stultz <jstultz@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mingzhe Yang <mingzhe.yang@ly.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yongliang Gao <leonylgao@tencent.com>
Cc: Zi Li <zi.li@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
@KexyBiscuit KexyBiscuit force-pushed the aosc/v6.16 branch 6 times, most recently from e6c3646 to 2934eba Compare August 7, 2025 12:33
@RevySR RevySR force-pushed the aosc/sg2042/v6.16.y branch from fd46b48 to b248cba Compare August 11, 2025 18:00
MingcongBai pushed a commit that referenced this pull request Aug 13, 2025
[ Upstream commit 16d8fd7 ]

In rtl8187_stop() move the call of usb_kill_anchored_urbs() before clearing
b_tx_status.queue. This change prevents callbacks from using already freed
skb due to anchor was not killed before freeing such skb.

 BUG: kernel NULL pointer dereference, address: 0000000000000080
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Not tainted 6.15.0 #8 PREEMPT(voluntary)
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
 RIP: 0010:ieee80211_tx_status_irqsafe+0x21/0xc0 [mac80211]
 Call Trace:
  <IRQ>
  rtl8187_tx_cb+0x116/0x150 [rtl8187]
  __usb_hcd_giveback_urb+0x9d/0x120
  usb_giveback_urb_bh+0xbb/0x140
  process_one_work+0x19b/0x3c0
  bh_worker+0x1a7/0x210
  tasklet_action+0x10/0x30
  handle_softirqs+0xf0/0x340
  __irq_exit_rcu+0xcd/0xf0
  common_interrupt+0x85/0xa0
  </IRQ>

Tested on RTL8187BvE device.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c1db52b ("rtl8187: Use usb anchor facilities to manage urbs")
Signed-off-by: Daniil Dulov <d.dulov@aladdin.ru>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250617135634.21760-1-d.dulov@aladdin.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
MingcongBai pushed a commit that referenced this pull request Aug 13, 2025
[ Upstream commit a509a55 ]

As syzbot [1] reported as below:

R10: 0000000000000100 R11: 0000000000000206 R12: 00007ffe17473450
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>
---[ end trace 0000000000000000 ]---
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
Read of size 8 at addr ffff88812d962278 by task syz-executor/564

CPU: 1 PID: 564 Comm: syz-executor Tainted: G        W          6.1.129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
 __dump_stack+0x21/0x24 lib/dump_stack.c:88
 dump_stack_lvl+0xee/0x158 lib/dump_stack.c:106
 print_address_description+0x71/0x210 mm/kasan/report.c:316
 print_report+0x4a/0x60 mm/kasan/report.c:427
 kasan_report+0x122/0x150 mm/kasan/report.c:531
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351
 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
 __list_del_entry include/linux/list.h:134 [inline]
 list_del_init include/linux/list.h:206 [inline]
 f2fs_inode_synced+0xf7/0x2e0 fs/f2fs/super.c:1531
 f2fs_update_inode+0x74/0x1c40 fs/f2fs/inode.c:585
 f2fs_update_inode_page+0x137/0x170 fs/f2fs/inode.c:703
 f2fs_write_inode+0x4ec/0x770 fs/f2fs/inode.c:731
 write_inode fs/fs-writeback.c:1460 [inline]
 __writeback_single_inode+0x4a0/0xab0 fs/fs-writeback.c:1677
 writeback_single_inode+0x221/0x8b0 fs/fs-writeback.c:1733
 sync_inode_metadata+0xb6/0x110 fs/fs-writeback.c:2789
 f2fs_sync_inode_meta+0x16d/0x2a0 fs/f2fs/checkpoint.c:1159
 block_operations fs/f2fs/checkpoint.c:1269 [inline]
 f2fs_write_checkpoint+0xca3/0x2100 fs/f2fs/checkpoint.c:1658
 kill_f2fs_super+0x231/0x390 fs/f2fs/super.c:4668
 deactivate_locked_super+0x98/0x100 fs/super.c:332
 deactivate_super+0xaf/0xe0 fs/super.c:363
 cleanup_mnt+0x45f/0x4e0 fs/namespace.c:1186
 __cleanup_mnt+0x19/0x20 fs/namespace.c:1193
 task_work_run+0x1c6/0x230 kernel/task_work.c:203
 exit_task_work include/linux/task_work.h:39 [inline]
 do_exit+0x9fb/0x2410 kernel/exit.c:871
 do_group_exit+0x210/0x2d0 kernel/exit.c:1021
 __do_sys_exit_group kernel/exit.c:1032 [inline]
 __se_sys_exit_group kernel/exit.c:1030 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030
 x64_sys_call+0x7b4/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2
RIP: 0033:0x7f28b1b8e169
Code: Unable to access opcode bytes at 0x7f28b1b8e13f.
RSP: 002b:00007ffe174710a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f28b1c10879 RCX: 00007f28b1b8e169
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000002 R08: 00007ffe1746ee47 R09: 00007ffe17472360
R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffe17472360
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>

Allocated by task 569:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_alloc_info+0x25/0x30 mm/kasan/generic.c:505
 __kasan_slab_alloc+0x72/0x80 mm/kasan/common.c:328
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook+0x4f/0x2c0 mm/slab.h:737
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x104/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_lookup+0x366/0xab0 fs/f2fs/namei.c:487
 __lookup_slow+0x2a3/0x3d0 fs/namei.c:1690
 lookup_slow+0x57/0x70 fs/namei.c:1707
 walk_component+0x2e6/0x410 fs/namei.c:1998
 lookup_last fs/namei.c:2455 [inline]
 path_lookupat+0x180/0x490 fs/namei.c:2479
 filename_lookup+0x1f0/0x500 fs/namei.c:2508
 vfs_statx+0x10b/0x660 fs/stat.c:229
 vfs_fstatat fs/stat.c:267 [inline]
 vfs_lstat include/linux/fs.h:3424 [inline]
 __do_sys_newlstat fs/stat.c:423 [inline]
 __se_sys_newlstat+0xd5/0x350 fs/stat.c:417
 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417
 x64_sys_call+0x393/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

Freed by task 13:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x31/0x50 mm/kasan/generic.c:516
 ____kasan_slab_free+0x132/0x180 mm/kasan/common.c:236
 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:244
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0xc2/0x190 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0x12d/0x2a0 mm/slub.c:3683
 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1562
 i_callback+0x4c/0x70 fs/inode.c:250
 rcu_do_batch+0x503/0xb80 kernel/rcu/tree.c:2297
 rcu_core+0x5a2/0xe70 kernel/rcu/tree.c:2557
 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574
 handle_softirqs+0x178/0x500 kernel/softirq.c:578
 run_ksoftirqd+0x28/0x30 kernel/softirq.c:945
 smpboot_thread_fn+0x45a/0x8c0 kernel/smpboot.c:164
 kthread+0x270/0x310 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Last potentially related work creation:
 kasan_save_stack+0x3a/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb6/0xc0 mm/kasan/generic.c:486
 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496
 call_rcu+0xd4/0xf70 kernel/rcu/tree.c:2845
 destroy_inode fs/inode.c:316 [inline]
 evict+0x7da/0x870 fs/inode.c:720
 iput_final fs/inode.c:1834 [inline]
 iput+0x62b/0x830 fs/inode.c:1860
 do_unlinkat+0x356/0x540 fs/namei.c:4397
 __do_sys_unlink fs/namei.c:4438 [inline]
 __se_sys_unlink fs/namei.c:4436 [inline]
 __x64_sys_unlink+0x49/0x50 fs/namei.c:4436
 x64_sys_call+0x958/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

The buggy address belongs to the object at ffff88812d961f20
 which belongs to the cache f2fs_inode_cache of size 1200
The buggy address is located 856 bytes inside of
 1200-byte region [ffff88812d961f20, ffff88812d9623d0)

The buggy address belongs to the physical page:
page:ffffea0004b65800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d960
head:ffffea0004b65800 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 0000000000000000 dead000000000122 ffff88810a94c500
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Reclaimable, gfp_mask 0x1d2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_RECLAIMABLE), pid 569, tgid 568 (syz.2.16), ts 55943246141, free_ts 0
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook+0x1d0/0x1f0 mm/page_alloc.c:2532
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x2e63/0x2ef0 mm/page_alloc.c:4328
 __alloc_pages+0x235/0x4b0 mm/page_alloc.c:5605
 alloc_slab_page include/linux/gfp.h:-1 [inline]
 allocate_slab mm/slub.c:1939 [inline]
 new_slab+0xec/0x4b0 mm/slub.c:1992
 ___slab_alloc+0x6f6/0xb50 mm/slub.c:3180
 __slab_alloc+0x5e/0xa0 mm/slub.c:3279
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x13f/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_fill_super+0x3ad7/0x6bb0 fs/f2fs/super.c:4293
 mount_bdev+0x2ae/0x3e0 fs/super.c:1443
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4642
 legacy_get_tree+0xea/0x190 fs/fs_context.c:632
 vfs_get_tree+0x89/0x260 fs/super.c:1573
 do_new_mount+0x25a/0xa20 fs/namespace.c:3056
page_owner free stack trace missing

Memory state around the buggy address:
 ffff88812d962100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88812d962200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                ^
 ffff88812d962280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[1] https://syzkaller.appspot.com/x/report.txt?x=13448368580000

This bug can be reproduced w/ the reproducer [2], once we enable
CONFIG_F2FS_CHECK_FS config, the reproducer will trigger panic as below,
so the direct reason of this bug is the same as the one below patch [3]
fixed.

kernel BUG at fs/f2fs/inode.c:857!
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20
Call Trace:
 <TASK>
 evict+0x32a/0x7a0
 do_unlinkat+0x37b/0x5b0
 __x64_sys_unlink+0xad/0x100
 do_syscall_64+0x5a/0xb0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20

[2] https://syzkaller.appspot.com/x/repro.c?x=17495ccc580000
[3] https://lore.kernel.org/linux-f2fs-devel/20250702120321.1080759-1-chao@kernel.org

Tracepoints before panic:

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file1
f2fs_unlink_exit: dev = (7,0), ino = 7, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 7, pino = 3, i_mode = 0x81ed, i_size = 10, i_nlink = 0, i_blocks = 0, i_advise = 0x0
f2fs_truncate_node: dev = (7,0), ino = 7, nid = 8, block_address = 0x3c05

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file3
f2fs_unlink_exit: dev = (7,0), ino = 8, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 9000, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 0, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate_blocks_enter: dev = (7,0), ino = 8, i_size = 0, i_blocks = 24, start file offset = 0
f2fs_truncate_blocks_exit: dev = (7,0), ino = 8, ret = -2

The root cause is: in the fuzzed image, dnode #8 belongs to inode #7,
after inode #7 eviction, dnode #8 was dropped.

However there is dirent that has ino #8, so, once we unlink file3, in
f2fs_evict_inode(), both f2fs_truncate() and f2fs_update_inode_page()
will fail due to we can not load node #8, result in we missed to call
f2fs_inode_synced() to clear inode dirty status.

Let's fix this by calling f2fs_inode_synced() in error path of
f2fs_evict_inode().

PS: As I verified, the reproducer [2] can trigger this bug in v6.1.129,
but it failed in v6.16-rc4, this is because the testcase will stop due to
other corruption has been detected by f2fs:

F2FS-fs (loop0): inconsistent node block, node_type:2, nid:8, node_footer[nid:8,ino:8,ofs:0,cpver:5013063228981249506,blkaddr:15366]
F2FS-fs (loop0): f2fs_lookup: inode (ino=9) has zero i_nlink

Fixes: 0f18b46 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/x/report.txt?x=13448368580000
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
MingcongBai added a commit that referenced this pull request Aug 13, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
MingcongBai pushed a commit that referenced this pull request Aug 17, 2025
[ Upstream commit 16d8fd7 ]

In rtl8187_stop() move the call of usb_kill_anchored_urbs() before clearing
b_tx_status.queue. This change prevents callbacks from using already freed
skb due to anchor was not killed before freeing such skb.

 BUG: kernel NULL pointer dereference, address: 0000000000000080
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Not tainted 6.15.0 #8 PREEMPT(voluntary)
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
 RIP: 0010:ieee80211_tx_status_irqsafe+0x21/0xc0 [mac80211]
 Call Trace:
  <IRQ>
  rtl8187_tx_cb+0x116/0x150 [rtl8187]
  __usb_hcd_giveback_urb+0x9d/0x120
  usb_giveback_urb_bh+0xbb/0x140
  process_one_work+0x19b/0x3c0
  bh_worker+0x1a7/0x210
  tasklet_action+0x10/0x30
  handle_softirqs+0xf0/0x340
  __irq_exit_rcu+0xcd/0xf0
  common_interrupt+0x85/0xa0
  </IRQ>

Tested on RTL8187BvE device.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c1db52b ("rtl8187: Use usb anchor facilities to manage urbs")
Signed-off-by: Daniil Dulov <d.dulov@aladdin.ru>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250617135634.21760-1-d.dulov@aladdin.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
MingcongBai pushed a commit that referenced this pull request Aug 17, 2025
[ Upstream commit a509a55 ]

As syzbot [1] reported as below:

R10: 0000000000000100 R11: 0000000000000206 R12: 00007ffe17473450
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>
---[ end trace 0000000000000000 ]---
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
Read of size 8 at addr ffff88812d962278 by task syz-executor/564

CPU: 1 PID: 564 Comm: syz-executor Tainted: G        W          6.1.129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
 __dump_stack+0x21/0x24 lib/dump_stack.c:88
 dump_stack_lvl+0xee/0x158 lib/dump_stack.c:106
 print_address_description+0x71/0x210 mm/kasan/report.c:316
 print_report+0x4a/0x60 mm/kasan/report.c:427
 kasan_report+0x122/0x150 mm/kasan/report.c:531
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351
 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
 __list_del_entry include/linux/list.h:134 [inline]
 list_del_init include/linux/list.h:206 [inline]
 f2fs_inode_synced+0xf7/0x2e0 fs/f2fs/super.c:1531
 f2fs_update_inode+0x74/0x1c40 fs/f2fs/inode.c:585
 f2fs_update_inode_page+0x137/0x170 fs/f2fs/inode.c:703
 f2fs_write_inode+0x4ec/0x770 fs/f2fs/inode.c:731
 write_inode fs/fs-writeback.c:1460 [inline]
 __writeback_single_inode+0x4a0/0xab0 fs/fs-writeback.c:1677
 writeback_single_inode+0x221/0x8b0 fs/fs-writeback.c:1733
 sync_inode_metadata+0xb6/0x110 fs/fs-writeback.c:2789
 f2fs_sync_inode_meta+0x16d/0x2a0 fs/f2fs/checkpoint.c:1159
 block_operations fs/f2fs/checkpoint.c:1269 [inline]
 f2fs_write_checkpoint+0xca3/0x2100 fs/f2fs/checkpoint.c:1658
 kill_f2fs_super+0x231/0x390 fs/f2fs/super.c:4668
 deactivate_locked_super+0x98/0x100 fs/super.c:332
 deactivate_super+0xaf/0xe0 fs/super.c:363
 cleanup_mnt+0x45f/0x4e0 fs/namespace.c:1186
 __cleanup_mnt+0x19/0x20 fs/namespace.c:1193
 task_work_run+0x1c6/0x230 kernel/task_work.c:203
 exit_task_work include/linux/task_work.h:39 [inline]
 do_exit+0x9fb/0x2410 kernel/exit.c:871
 do_group_exit+0x210/0x2d0 kernel/exit.c:1021
 __do_sys_exit_group kernel/exit.c:1032 [inline]
 __se_sys_exit_group kernel/exit.c:1030 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030
 x64_sys_call+0x7b4/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2
RIP: 0033:0x7f28b1b8e169
Code: Unable to access opcode bytes at 0x7f28b1b8e13f.
RSP: 002b:00007ffe174710a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f28b1c10879 RCX: 00007f28b1b8e169
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000002 R08: 00007ffe1746ee47 R09: 00007ffe17472360
R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffe17472360
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>

Allocated by task 569:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_alloc_info+0x25/0x30 mm/kasan/generic.c:505
 __kasan_slab_alloc+0x72/0x80 mm/kasan/common.c:328
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook+0x4f/0x2c0 mm/slab.h:737
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x104/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_lookup+0x366/0xab0 fs/f2fs/namei.c:487
 __lookup_slow+0x2a3/0x3d0 fs/namei.c:1690
 lookup_slow+0x57/0x70 fs/namei.c:1707
 walk_component+0x2e6/0x410 fs/namei.c:1998
 lookup_last fs/namei.c:2455 [inline]
 path_lookupat+0x180/0x490 fs/namei.c:2479
 filename_lookup+0x1f0/0x500 fs/namei.c:2508
 vfs_statx+0x10b/0x660 fs/stat.c:229
 vfs_fstatat fs/stat.c:267 [inline]
 vfs_lstat include/linux/fs.h:3424 [inline]
 __do_sys_newlstat fs/stat.c:423 [inline]
 __se_sys_newlstat+0xd5/0x350 fs/stat.c:417
 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417
 x64_sys_call+0x393/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

Freed by task 13:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x31/0x50 mm/kasan/generic.c:516
 ____kasan_slab_free+0x132/0x180 mm/kasan/common.c:236
 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:244
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0xc2/0x190 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0x12d/0x2a0 mm/slub.c:3683
 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1562
 i_callback+0x4c/0x70 fs/inode.c:250
 rcu_do_batch+0x503/0xb80 kernel/rcu/tree.c:2297
 rcu_core+0x5a2/0xe70 kernel/rcu/tree.c:2557
 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574
 handle_softirqs+0x178/0x500 kernel/softirq.c:578
 run_ksoftirqd+0x28/0x30 kernel/softirq.c:945
 smpboot_thread_fn+0x45a/0x8c0 kernel/smpboot.c:164
 kthread+0x270/0x310 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Last potentially related work creation:
 kasan_save_stack+0x3a/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb6/0xc0 mm/kasan/generic.c:486
 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496
 call_rcu+0xd4/0xf70 kernel/rcu/tree.c:2845
 destroy_inode fs/inode.c:316 [inline]
 evict+0x7da/0x870 fs/inode.c:720
 iput_final fs/inode.c:1834 [inline]
 iput+0x62b/0x830 fs/inode.c:1860
 do_unlinkat+0x356/0x540 fs/namei.c:4397
 __do_sys_unlink fs/namei.c:4438 [inline]
 __se_sys_unlink fs/namei.c:4436 [inline]
 __x64_sys_unlink+0x49/0x50 fs/namei.c:4436
 x64_sys_call+0x958/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

The buggy address belongs to the object at ffff88812d961f20
 which belongs to the cache f2fs_inode_cache of size 1200
The buggy address is located 856 bytes inside of
 1200-byte region [ffff88812d961f20, ffff88812d9623d0)

The buggy address belongs to the physical page:
page:ffffea0004b65800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d960
head:ffffea0004b65800 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 0000000000000000 dead000000000122 ffff88810a94c500
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Reclaimable, gfp_mask 0x1d2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_RECLAIMABLE), pid 569, tgid 568 (syz.2.16), ts 55943246141, free_ts 0
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook+0x1d0/0x1f0 mm/page_alloc.c:2532
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x2e63/0x2ef0 mm/page_alloc.c:4328
 __alloc_pages+0x235/0x4b0 mm/page_alloc.c:5605
 alloc_slab_page include/linux/gfp.h:-1 [inline]
 allocate_slab mm/slub.c:1939 [inline]
 new_slab+0xec/0x4b0 mm/slub.c:1992
 ___slab_alloc+0x6f6/0xb50 mm/slub.c:3180
 __slab_alloc+0x5e/0xa0 mm/slub.c:3279
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x13f/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_fill_super+0x3ad7/0x6bb0 fs/f2fs/super.c:4293
 mount_bdev+0x2ae/0x3e0 fs/super.c:1443
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4642
 legacy_get_tree+0xea/0x190 fs/fs_context.c:632
 vfs_get_tree+0x89/0x260 fs/super.c:1573
 do_new_mount+0x25a/0xa20 fs/namespace.c:3056
page_owner free stack trace missing

Memory state around the buggy address:
 ffff88812d962100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88812d962200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                ^
 ffff88812d962280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[1] https://syzkaller.appspot.com/x/report.txt?x=13448368580000

This bug can be reproduced w/ the reproducer [2], once we enable
CONFIG_F2FS_CHECK_FS config, the reproducer will trigger panic as below,
so the direct reason of this bug is the same as the one below patch [3]
fixed.

kernel BUG at fs/f2fs/inode.c:857!
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20
Call Trace:
 <TASK>
 evict+0x32a/0x7a0
 do_unlinkat+0x37b/0x5b0
 __x64_sys_unlink+0xad/0x100
 do_syscall_64+0x5a/0xb0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20

[2] https://syzkaller.appspot.com/x/repro.c?x=17495ccc580000
[3] https://lore.kernel.org/linux-f2fs-devel/20250702120321.1080759-1-chao@kernel.org

Tracepoints before panic:

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file1
f2fs_unlink_exit: dev = (7,0), ino = 7, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 7, pino = 3, i_mode = 0x81ed, i_size = 10, i_nlink = 0, i_blocks = 0, i_advise = 0x0
f2fs_truncate_node: dev = (7,0), ino = 7, nid = 8, block_address = 0x3c05

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file3
f2fs_unlink_exit: dev = (7,0), ino = 8, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 9000, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 0, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate_blocks_enter: dev = (7,0), ino = 8, i_size = 0, i_blocks = 24, start file offset = 0
f2fs_truncate_blocks_exit: dev = (7,0), ino = 8, ret = -2

The root cause is: in the fuzzed image, dnode #8 belongs to inode #7,
after inode #7 eviction, dnode #8 was dropped.

However there is dirent that has ino #8, so, once we unlink file3, in
f2fs_evict_inode(), both f2fs_truncate() and f2fs_update_inode_page()
will fail due to we can not load node #8, result in we missed to call
f2fs_inode_synced() to clear inode dirty status.

Let's fix this by calling f2fs_inode_synced() in error path of
f2fs_evict_inode().

PS: As I verified, the reproducer [2] can trigger this bug in v6.1.129,
but it failed in v6.16-rc4, this is because the testcase will stop due to
other corruption has been detected by f2fs:

F2FS-fs (loop0): inconsistent node block, node_type:2, nid:8, node_footer[nid:8,ino:8,ofs:0,cpver:5013063228981249506,blkaddr:15366]
F2FS-fs (loop0): f2fs_lookup: inode (ino=9) has zero i_nlink

Fixes: 0f18b46 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/x/report.txt?x=13448368580000
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
MingcongBai added a commit that referenced this pull request Aug 17, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
The sg2042 SoCs support xtheadvector [1] so it can be included in the
devicetree. Also include vlenb for the cpu. And set vlenb=16 [2].

This can be tested by passing the "mitigations=off" kernel parameter.

Link: https://lore.kernel.org/linux-riscv/20241113-xtheadvector-v11-4-236c22791ef9@rivosinc.com/ [1]
Link: https://lore.kernel.org/linux-riscv/aCO44SAoS2kIP61r@ghost/ [2]

Signed-off-by: Han Gao <rabenda.cn@gmail.com>
Reviewed-by: Inochi Amaoto <inochiama@gmail.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/r/915bef0530dee6c8bc0ae473837a4bd6786fa4fb.1751698574.git.rabenda.cn@gmail.com
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Signed-off-by: Chen Wang <wangchen20@iscas.ac.cn>
(cherry picked from commit a5fb905)
Signed-off-by: Han Gao <rabenda.cn@gmail.com>
MingcongBai added a commit that referenced this pull request Oct 3, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 3, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 12, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 12, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 14, 2025
It appears that the xe_res_cursor also assumes 4KiB alignment.

Current implementation uses `PAGE_SIZE' as an assumed alignment reference,
but 4KiB kernel page sizes is by no means a guarantee. On 16KiB-paged
kernels, this causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 14, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 14, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 14, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 14, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 15, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 17, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai pushed a commit that referenced this pull request Oct 23, 2025
[ Upstream commit 48918ca ]

The test starts a workload and then opens events. If the events fail
to open, for example because of perf_event_paranoid, the gopipe of the
workload is leaked and the file descriptor leak check fails when the
test exits. To avoid this cancel the workload when opening the events
fails.

Before:
```
$ perf test -vv 7
  7: PERF_RECORD_* events & perf_sample fields:
 --- start ---
test child forked, pid 1189568
Using CPUID GenuineIntel-6-B7-1
 ------------------------------------------------------------
perf_event_attr:
  type                    	   0 (PERF_TYPE_HARDWARE)
  config                  	   0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
  disabled                	   1
 ------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -13
 ------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
  disabled                         1
  exclude_kernel                   1
 ------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
 ------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
  disabled                         1
 ------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -13
 ------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
  disabled                         1
  exclude_kernel                   1
 ------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
Attempt to add: software/cpu-clock/
..after resolving event: software/config=0/
cpu-clock -> software/cpu-clock/
 ------------------------------------------------------------
perf_event_attr:
  type                             1 (PERF_TYPE_SOFTWARE)
  size                             136
  config                           0x9 (PERF_COUNT_SW_DUMMY)
  sample_type                      IP|TID|TIME|CPU
  read_format                      ID|LOST
  disabled                         1
  inherit                          1
  mmap                             1
  comm                             1
  enable_on_exec                   1
  task                             1
  sample_id_all                    1
  mmap2                            1
  comm_exec                        1
  ksymbol                          1
  bpf_event                        1
  { wakeup_events, wakeup_watermark } 1
 ------------------------------------------------------------
sys_perf_event_open: pid 1189569  cpu 0  group_fd -1  flags 0x8
sys_perf_event_open failed, error -13
perf_evlist__open: Permission denied
 ---- end(-2) ----
Leak of file descriptor 6 that opened: 'pipe:[14200347]'
 ---- unexpected signal (6) ----
iFailed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
    #0 0x565358f6666e in child_test_sig_handler builtin-test.c:311
    #1 0x7f29ce849df0 in __restore_rt libc_sigaction.c:0
    #2 0x7f29ce89e95c in __pthread_kill_implementation pthread_kill.c:44
    #3 0x7f29ce849cc2 in raise raise.c:27
    #4 0x7f29ce8324ac in abort abort.c:81
    #5 0x565358f662d4 in check_leaks builtin-test.c:226
    #6 0x565358f6682e in run_test_child builtin-test.c:344
    #7 0x565358ef7121 in start_command run-command.c:128
    #8 0x565358f67273 in start_test builtin-test.c:545
    #9 0x565358f6771d in __cmd_test builtin-test.c:647
    #10 0x565358f682bd in cmd_test builtin-test.c:849
    #11 0x565358ee5ded in run_builtin perf.c:349
    #12 0x565358ee6085 in handle_internal_command perf.c:401
    #13 0x565358ee61de in run_argv perf.c:448
    #14 0x565358ee6527 in main perf.c:555
    #15 0x7f29ce833ca8 in __libc_start_call_main libc_start_call_main.h:74
    #16 0x7f29ce833d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
    torvalds#17 0x565358e391c1 in _start perf[851c1]
  7: PERF_RECORD_* events & perf_sample fields                       : FAILED!
```

After:
```
$ perf test 7
  7: PERF_RECORD_* events & perf_sample fields                       : Skip (permissions)
```

Fixes: 16d00fe ("perf tests: Move test__PERF_RECORD into separate object")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
MingcongBai pushed a commit that referenced this pull request Oct 23, 2025
commit 0570327 upstream.

Before disabling SR-IOV via config space accesses to the parent PF,
sriov_disable() first removes the PCI devices representing the VFs.

Since commit 9d16947 ("PCI: Add global pci_lock_rescan_remove()")
such removal operations are serialized against concurrent remove and
rescan using the pci_rescan_remove_lock. No such locking was ever added
in sriov_disable() however. In particular when commit 18f9e9d
("PCI/IOV: Factor out sriov_add_vfs()") factored out the PCI device
removal into sriov_del_vfs() there was still no locking around the
pci_iov_remove_virtfn() calls.

On s390 the lack of serialization in sriov_disable() may cause double
remove and list corruption with the below (amended) trace being observed:

  PSW:  0704c00180000000 0000000c914e4b38 (klist_put+56)
  GPRS: 000003800313fb48 0000000000000000 0000000100000001 0000000000000001
	00000000f9b520a8 0000000000000000 0000000000002fbd 00000000f4cc9480
	0000000000000001 0000000000000000 0000000000000000 0000000180692828
	00000000818e8000 000003800313fe2c 000003800313fb20 000003800313fad8
  #0 [3800313fb20] device_del at c9158ad5c
  #1 [3800313fb88] pci_remove_bus_device at c915105ba
  #2 [3800313fbd0] pci_iov_remove_virtfn at c9152f198
  #3 [3800313fc28] zpci_iov_remove_virtfn at c90fb67c0
  #4 [3800313fc60] zpci_bus_remove_device at c90fb6104
  #5 [3800313fca0] __zpci_event_availability at c90fb3dca
  #6 [3800313fd08] chsc_process_sei_nt0 at c918fe4a2
  #7 [3800313fd60] crw_collect_info at c91905822
  #8 [3800313fe10] kthread at c90feb390
  #9 [3800313fe68] __ret_from_fork at c90f6aa64
  #10 [3800313fe98] ret_from_fork at c9194f3f2.

This is because in addition to sriov_disable() removing the VFs, the
platform also generates hot-unplug events for the VFs. This being the
reverse operation to the hotplug events generated by sriov_enable() and
handled via pdev->no_vf_scan. And while the event processing takes
pci_rescan_remove_lock and checks whether the struct pci_dev still exists,
the lack of synchronization makes this checking racy.

Other races may also be possible of course though given that this lack of
locking persisted so long observable races seem very rare. Even on s390 the
list corruption was only observed with certain devices since the platform
events are only triggered by config accesses after the removal, so as long
as the removal finished synchronously they would not race. Either way the
locking is missing so fix this by adding it to the sriov_del_vfs() helper.

Just like PCI rescan-remove, locking is also missing in sriov_add_vfs()
including for the error case where pci_stop_and_remove_bus_device() is
called without the PCI rescan-remove lock being held. Even in the non-error
case, adding new PCI devices and buses should be serialized via the PCI
rescan-remove lock. Add the necessary locking.

Fixes: 18f9e9d ("PCI/IOV: Factor out sriov_add_vfs()")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Julian Ruess <julianr@linux.ibm.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250826-pci_fix_sriov_disable-v1-1-2d0bc938f2a3@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
MingcongBai added a commit that referenced this pull request Oct 23, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 23, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 23, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 25, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 27, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Oct 28, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
xry111 pushed a commit that referenced this pull request Nov 3, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Nov 8, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Nov 11, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Nov 26, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Nov 26, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
AirFortressIlikara pushed a commit that referenced this pull request Nov 30, 2025
[WHAT]
IGT kms_cursor_legacy's long-nonblocking-modeset-vs-cursor-atomic
fails with NULL pointer dereference. This can be reproduced with
both an eDP panel and a DP monitors connected.

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 13 UID: 0 PID: 2960 Comm: kms_cursor_lega Not tainted
6.16.0-99-custom #8 PREEMPT(voluntary)
 Hardware name: AMD ........
 RIP: 0010:dc_stream_get_scanoutpos+0x34/0x130 [amdgpu]
 Code: 57 4d 89 c7 41 56 49 89 ce 41 55 49 89 d5 41 54 49
 89 fc 53 48 83 ec 18 48 8b 87 a0 64 00 00 48 89 75 d0 48 c7 c6 e0 41 30
 c2 <48> 8b 38 48 8b 9f 68 06 00 00 e8 8d d7 fd ff 31 c0 48 81 c3 e0 02
 RSP: 0018:ffffd0f3c2bd7608 EFLAGS: 00010292
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffd0f3c2bd7668
 RDX: ffffd0f3c2bd7664 RSI: ffffffffc23041e0 RDI: ffff8b32494b8000
 RBP: ffffd0f3c2bd7648 R08: ffffd0f3c2bd766c R09: ffffd0f3c2bd7760
 R10: ffffd0f3c2bd7820 R11: 0000000000000000 R12: ffff8b32494b8000
 R13: ffffd0f3c2bd7664 R14: ffffd0f3c2bd7668 R15: ffffd0f3c2bd766c
 FS:  000071f631b68700(0000) GS:ffff8b399f114000(0000)
knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 00000001b8105000 CR4: 0000000000f50ef0
 PKRU: 55555554
 Call Trace:
 <TASK>
 dm_crtc_get_scanoutpos+0xd7/0x180 [amdgpu]
 amdgpu_display_get_crtc_scanoutpos+0x86/0x1c0 [amdgpu]
 ? __pfx_amdgpu_crtc_get_scanout_position+0x10/0x10[amdgpu]
 amdgpu_crtc_get_scanout_position+0x27/0x50 [amdgpu]
 drm_crtc_vblank_helper_get_vblank_timestamp_internal+0xf7/0x400
 drm_crtc_vblank_helper_get_vblank_timestamp+0x1c/0x30
 drm_crtc_get_last_vbltimestamp+0x55/0x90
 drm_crtc_next_vblank_start+0x45/0xa0
 drm_atomic_helper_wait_for_fences+0x81/0x1f0
 ...

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 621e55f)
Cc: stable@vger.kernel.org
AirFortressIlikara pushed a commit that referenced this pull request Nov 30, 2025
[ Upstream commit 48918ca ]

The test starts a workload and then opens events. If the events fail
to open, for example because of perf_event_paranoid, the gopipe of the
workload is leaked and the file descriptor leak check fails when the
test exits. To avoid this cancel the workload when opening the events
fails.

Before:
```
$ perf test -vv 7
  7: PERF_RECORD_* events & perf_sample fields:
 --- start ---
test child forked, pid 1189568
Using CPUID GenuineIntel-6-B7-1
 ------------------------------------------------------------
perf_event_attr:
  type                    	   0 (PERF_TYPE_HARDWARE)
  config                  	   0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
  disabled                	   1
 ------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -13
 ------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
  disabled                         1
  exclude_kernel                   1
 ------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
 ------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
  disabled                         1
 ------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -13
 ------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
  disabled                         1
  exclude_kernel                   1
 ------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
Attempt to add: software/cpu-clock/
..after resolving event: software/config=0/
cpu-clock -> software/cpu-clock/
 ------------------------------------------------------------
perf_event_attr:
  type                             1 (PERF_TYPE_SOFTWARE)
  size                             136
  config                           0x9 (PERF_COUNT_SW_DUMMY)
  sample_type                      IP|TID|TIME|CPU
  read_format                      ID|LOST
  disabled                         1
  inherit                          1
  mmap                             1
  comm                             1
  enable_on_exec                   1
  task                             1
  sample_id_all                    1
  mmap2                            1
  comm_exec                        1
  ksymbol                          1
  bpf_event                        1
  { wakeup_events, wakeup_watermark } 1
 ------------------------------------------------------------
sys_perf_event_open: pid 1189569  cpu 0  group_fd -1  flags 0x8
sys_perf_event_open failed, error -13
perf_evlist__open: Permission denied
 ---- end(-2) ----
Leak of file descriptor 6 that opened: 'pipe:[14200347]'
 ---- unexpected signal (6) ----
iFailed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
    #0 0x565358f6666e in child_test_sig_handler builtin-test.c:311
    #1 0x7f29ce849df0 in __restore_rt libc_sigaction.c:0
    #2 0x7f29ce89e95c in __pthread_kill_implementation pthread_kill.c:44
    #3 0x7f29ce849cc2 in raise raise.c:27
    #4 0x7f29ce8324ac in abort abort.c:81
    #5 0x565358f662d4 in check_leaks builtin-test.c:226
    #6 0x565358f6682e in run_test_child builtin-test.c:344
    #7 0x565358ef7121 in start_command run-command.c:128
    #8 0x565358f67273 in start_test builtin-test.c:545
    #9 0x565358f6771d in __cmd_test builtin-test.c:647
    #10 0x565358f682bd in cmd_test builtin-test.c:849
    #11 0x565358ee5ded in run_builtin perf.c:349
    #12 0x565358ee6085 in handle_internal_command perf.c:401
    #13 0x565358ee61de in run_argv perf.c:448
    #14 0x565358ee6527 in main perf.c:555
    #15 0x7f29ce833ca8 in __libc_start_call_main libc_start_call_main.h:74
    #16 0x7f29ce833d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
    torvalds#17 0x565358e391c1 in _start perf[851c1]
  7: PERF_RECORD_* events & perf_sample fields                       : FAILED!
```

After:
```
$ perf test 7
  7: PERF_RECORD_* events & perf_sample fields                       : Skip (permissions)
```

Fixes: 16d00fe ("perf tests: Move test__PERF_RECORD into separate object")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
AirFortressIlikara pushed a commit that referenced this pull request Nov 30, 2025
commit 0570327 upstream.

Before disabling SR-IOV via config space accesses to the parent PF,
sriov_disable() first removes the PCI devices representing the VFs.

Since commit 9d16947 ("PCI: Add global pci_lock_rescan_remove()")
such removal operations are serialized against concurrent remove and
rescan using the pci_rescan_remove_lock. No such locking was ever added
in sriov_disable() however. In particular when commit 18f9e9d
("PCI/IOV: Factor out sriov_add_vfs()") factored out the PCI device
removal into sriov_del_vfs() there was still no locking around the
pci_iov_remove_virtfn() calls.

On s390 the lack of serialization in sriov_disable() may cause double
remove and list corruption with the below (amended) trace being observed:

  PSW:  0704c00180000000 0000000c914e4b38 (klist_put+56)
  GPRS: 000003800313fb48 0000000000000000 0000000100000001 0000000000000001
	00000000f9b520a8 0000000000000000 0000000000002fbd 00000000f4cc9480
	0000000000000001 0000000000000000 0000000000000000 0000000180692828
	00000000818e8000 000003800313fe2c 000003800313fb20 000003800313fad8
  #0 [3800313fb20] device_del at c9158ad5c
  #1 [3800313fb88] pci_remove_bus_device at c915105ba
  #2 [3800313fbd0] pci_iov_remove_virtfn at c9152f198
  #3 [3800313fc28] zpci_iov_remove_virtfn at c90fb67c0
  #4 [3800313fc60] zpci_bus_remove_device at c90fb6104
  #5 [3800313fca0] __zpci_event_availability at c90fb3dca
  #6 [3800313fd08] chsc_process_sei_nt0 at c918fe4a2
  #7 [3800313fd60] crw_collect_info at c91905822
  #8 [3800313fe10] kthread at c90feb390
  #9 [3800313fe68] __ret_from_fork at c90f6aa64
  #10 [3800313fe98] ret_from_fork at c9194f3f2.

This is because in addition to sriov_disable() removing the VFs, the
platform also generates hot-unplug events for the VFs. This being the
reverse operation to the hotplug events generated by sriov_enable() and
handled via pdev->no_vf_scan. And while the event processing takes
pci_rescan_remove_lock and checks whether the struct pci_dev still exists,
the lack of synchronization makes this checking racy.

Other races may also be possible of course though given that this lack of
locking persisted so long observable races seem very rare. Even on s390 the
list corruption was only observed with certain devices since the platform
events are only triggered by config accesses after the removal, so as long
as the removal finished synchronously they would not race. Either way the
locking is missing so fix this by adding it to the sriov_del_vfs() helper.

Just like PCI rescan-remove, locking is also missing in sriov_add_vfs()
including for the error case where pci_stop_and_remove_bus_device() is
called without the PCI rescan-remove lock being held. Even in the non-error
case, adding new PCI devices and buses should be serialized via the PCI
rescan-remove lock. Add the necessary locking.

Fixes: 18f9e9d ("PCI/IOV: Factor out sriov_add_vfs()")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Julian Ruess <julianr@linux.ibm.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250826-pci_fix_sriov_disable-v1-1-2d0bc938f2a3@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
MingcongBai added a commit that referenced this pull request Dec 1, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Dec 2, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
MingcongBai added a commit that referenced this pull request Dec 3, 2025
It appears that the xe_res_cursor also assumes 4K alignment.

Current code uses `PAGE_SIZE' as an assumed alignment reference but 4K
kernel page sizes is by no means a guarantee. On 16K-paged kernels, this
causes driver failures during boot up:

[   23.242757] ------------[ cut here ]------------
[   23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_cursor.h:182 emit_pte+0x394/0x3b0 [xe]
[   23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrtr(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep(E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E) soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) d
 rm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[   23.257034]  drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo_bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[   23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.381640] Tainted: [E]=UNSIGNED_MODULE
[   23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 sp 900000011fe3f7e0
[   23.407632] a0 0000000000000001 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[   23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 a7 900000010c947b00
[   23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 900000012e456230
[   23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 t7 0000000000004000
[   23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 s0 0000000000000047
[   23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 s4 ffffffffffffc000
[   23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 s8 900000011fe3f8e0
[   23.465851]    ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe]
[   23.471761]   ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe]
[   23.477557]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   23.483732]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   23.488068]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
[   23.492832]  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[   23.497594] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[   23.503133]  PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[   23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G            E      6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8
[   23.509168] Tainted: [E]=UNSIGNED_MODULE
[   23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[   23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 900000011fe3c000
[   23.509176]         900000011fe3f440 0000000000000000 900000011fe3f448 9000000001c31c70
[   23.509181]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509185]         0000000000000000 5cc06cee8ef0edee 0000000000000000 0000000000000000
[   23.509190]         0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   23.509193]         0000000000000000 0000000000000000 00000000066b4000 9000000100024420
[   23.509197]         9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[   23.509202]         0000000000000004 0000000000000000 0000000000000000 0000000000000000
[   23.509206]         900000011fe3f8e0 9000000001c31c70 9000000000244174 00007fffac097534
[   23.509211]         00000000000000b0 0000000000000004 0000000000000003 0000000000071c1d
[   23.509216]         ...
[   23.509218] Call Trace:
[   23.509220] [<9000000000244174>] show_stack+0x3c/0x16c
[   23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[   23.509230] [<9000000000288208>] __warn+0x8c/0x174
[   23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c
[   23.509238] [<90000000017f66e8>] do_bp+0x280/0x344
[   23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0
[   23.509247] [<ffff80000251efc0>] emit_pte+0x394/0x3b0 [xe]
[   23.509295] [<ffff800002520d38>] xe_migrate_clear+0x2d8/0xa54 [xe]
[   23.509341] [<ffff8000024e6c38>] xe_bo_move+0x324/0x930 [xe]
[   23.509387] [<ffff800002209468>] ttm_bo_handle_move_mem+0xd0/0x194 [ttm]
[   23.509392] [<ffff800002209ebc>] ttm_bo_validate+0xd4/0x1cc [ttm]
[   23.509396] [<ffff80000220a138>] ttm_bo_init_reserved+0x184/0x1dc [ttm]
[   23.509399] [<ffff8000024e7840>] ___xe_bo_create_locked+0x1e8/0x3d4 [xe]
[   23.509445] [<ffff8000024e7cf8>] __xe_bo_create_locked+0x2cc/0x390 [xe]
[   23.509489] [<ffff8000024e7e98>] xe_bo_create_user+0x34/0xe4 [xe]
[   23.509533] [<ffff8000024e875c>] xe_gem_create_ioctl+0x154/0x4d8 [xe]
[   23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c
[   23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4
[   23.509585] [<ffff8000024ea778>] xe_drm_ioctl+0x64/0xac [xe]
[   23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98
[   23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140
[   23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158
[   23.509640]
[   23.509644] ---[ end trace 0000000000000000 ]---

Revise calls to `xe_res_dma()' and `xe_res_cursor()' to use
`XE_PTE_MASK' (12) and `SZ_4K' to fix this potentially confused use of
`PAGE_SIZE' in relevant code.

Cc: stable@vger.kernel.org
Fixes: e89b384 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants