
[AMDGPU][NFC] Replace gfx940 and gfx941 with gfx942 in llvm/test #125711


Merged: 4 commits, Feb 13, 2025

Conversation

ritter-x2a
Member

@ritter-x2a ritter-x2a commented Feb 4, 2025

[AMDGPU][NFC] Replace gfx940 and gfx941 with gfx942 in llvm/test

gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base.

This PR uses gfx942 instead of gfx940 and gfx941 in the test RUN-lines (unless there is already a RUN-line for gfx942).

The only notable difference in the test output is that gfx942 does not force the use of sc0 and sc1 on stores while gfx940 and gfx941 do (cf. https://reviews.llvm.org/D149986).
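As a rough illustration of the RUN-line change described above, the following is a minimal sketch and not the script used for this PR. The helper name `retarget_run_lines` and the regular expressions are hypothetical. It assumes RUN lines start with `;`, `#`, or `//` followed by `RUN:`, that the processor is selected via `-mcpu=`, and that a rewritten line is dropped when an identical gfx942 RUN line already exists in the file.

```python
import re

# Hypothetical helper, not part of the PR: retarget lit RUN lines from
# gfx940/gfx941 to gfx942 and skip lines that would become duplicates.
_RUN = re.compile(r"^\s*(;|#|//)\s*RUN:")
_OLD_CPU = re.compile(r"-mcpu=gfx94[01]\b")
_OLD_PREFIX = re.compile(r"\bGFX94[01]\b")


def retarget_run_lines(lines):
    """Return a copy of `lines` with gfx940/gfx941 RUN lines retargeted."""
    # RUN lines that are already present and need no rewrite.
    seen = {l for l in lines if _RUN.match(l) and not _OLD_CPU.search(l)}
    out = []
    for line in lines:
        if _RUN.match(line) and _OLD_CPU.search(line):
            # Rewrite both the -mcpu value and the FileCheck prefix.
            line = _OLD_CPU.sub("-mcpu=gfx942", line)
            line = _OLD_PREFIX.sub("GFX942", line)
            if line in seen:
                continue  # an equivalent gfx942 RUN line already exists
            seen.add(line)
        out.append(line)
    return out


if __name__ == "__main__":
    example = [
        "; RUN: llc -mcpu=gfx940 < %s | FileCheck -check-prefix=GFX940 %s",
        "; RUN: llc -mcpu=gfx941 < %s | FileCheck -check-prefix=GFX941 %s",
        "; RUN: llc -mcpu=gfx90a < %s | FileCheck -check-prefix=GFX90A %s",
    ]
    for l in retarget_run_lines(example):
        print(l)
```

The sketch only touches RUN lines. The check lines in the test bodies are autogenerated (the test header in the diff names `utils/update_llc_test_checks.py`), so they would be regenerated rather than edited by a text substitution like this one.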

For SWDEV-512631

@llvmbot
Member

llvmbot commented Feb 4, 2025

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-llvm-binary-utilities

@llvm/pr-subscribers-backend-amdgpu

Author: Fabian Ritter (ritter-x2a)

Changes


Patch is 31.19 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125711.diff

276 Files Affected:

  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll (+224-224)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmin.ll (+224-224)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f32-no-rtn.ll (+116-116)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f32-rtn.ll (+124-124)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f64.ll (+310-310)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.v2f16-no-rtn.ll (+116-116)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.v2f16-rtn.ll (+124-124)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/flat-atomic-fadd.f32.ll (+43-43)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/flat-atomic-fadd.f64.ll (+32-32)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/flat-atomic-fadd.v2f16.ll (+45-45)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch-init.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll (+736-736)
  • (removed) llvm/test/CodeGen/AMDGPU/GlobalISel/fp-atomics-gfx940.ll (-132)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/fp-atomics-gfx942.ll (+132)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/fp64-atomics-gfx90a.ll (+629-629)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f32-no-rtn.ll (+82-82)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f32-rtn.ll (+102-102)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f64.ll (+64-64)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.v2f16-no-rtn.ll (+23-23)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.v2f16-rtn.ll (+25-25)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-atomicrmw.ll (+1-1)
  • (renamed) llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.mfma.gfx942.mir (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir (+314-314)
  • (modified) llvm/test/CodeGen/AMDGPU/amdhsa-kernarg-preload-num-sgprs.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll (+27-27)
  • (modified) llvm/test/CodeGen/AMDGPU/back-off-barrier-subtarget-feature.ll (+20-20)
  • (modified) llvm/test/CodeGen/AMDGPU/bf16-conversions.ll (+242-242)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f32-no-rtn.ll (+142-142)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f32-rtn.ll (+150-150)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f64.ll (+360-360)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.v2f16-no-rtn.ll (+142-142)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.v2f16-rtn.ll (+150-150)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-fat-pointer-atomicrmw-fadd.ll (+914-914)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-fat-pointer-atomicrmw-fmax.ll (+839-839)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-fat-pointer-atomicrmw-fmin.ll (+839-839)
  • (modified) llvm/test/CodeGen/AMDGPU/build_vector.ll (+51-51)
  • (modified) llvm/test/CodeGen/AMDGPU/copy_phys_vgpr64.mir (+103-103)
  • (modified) llvm/test/CodeGen/AMDGPU/directive-amdgcn-target.ll (-12)
  • (modified) llvm/test/CodeGen/AMDGPU/dpp64_combine.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/dpp64_combine.mir (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/elf-header-flags-mach.ll (-4)
  • (modified) llvm/test/CodeGen/AMDGPU/elf-header-flags-sramecc.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-add-i32.mir (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-co-u32.mir (+135-135)
  • (modified) llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-u32.mir (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-atomic-fadd.f32.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-atomic-fadd.f64.ll (+122-122)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fadd.ll (+1394-1394)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll (+2044-2044)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll (+2044-2044)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fsub.ll (+1904-1904)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll (+394-394)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-scratch.ll (+438-438)
  • (modified) llvm/test/CodeGen/AMDGPU/fmaximum3.ll (+1194-1194)
  • (modified) llvm/test/CodeGen/AMDGPU/fminimum3.ll (+1194-1194)
  • (modified) llvm/test/CodeGen/AMDGPU/fold-agpr-phis.mir (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/fold-zero-high-bits-clear-kill-flags.mir (+1-1)
  • (renamed) llvm/test/CodeGen/AMDGPU/fp-atomics-gfx942.ll (+62-62)
  • (modified) llvm/test/CodeGen/AMDGPU/fp64-atomics-gfx90a.ll (+583-583)
  • (renamed) llvm/test/CodeGen/AMDGPU/gfx942-hazards.mir (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomic-fadd.f32-no-rtn.ll (+74-74)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomic-fadd.f32-rtn.ll (+95-95)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomic-fadd.f64.ll (+66-66)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomic-fadd.v2f16-no-rtn.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomic-fadd.v2f16-rtn.ll (+52-52)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll (+1314-1314)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll (+1747-1747)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll (+1747-1747)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fsub.ll (+1691-1691)
  • (modified) llvm/test/CodeGen/AMDGPU/idemponent-atomics.ll (+97-97)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2bf16.ll (+318-318)
  • (modified) llvm/test/CodeGen/AMDGPU/lds-dma-hazards.mir (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/lds-dma-waitcnt.mir (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/lds-limit-diagnostics.ll (-2)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cvt.fp8.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fdot2.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.global.load.lds.gfx950.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.global.load.lds.ll (+37-37)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.gfx90a.ll (+13-13)
  • (renamed) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.gfx942.ll (+54-54)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.ll (+52-52)
  • (renamed) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.xf32.gfx942.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.permlane16.swap.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.permlane32.swap.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.ptr.buffer.atomic.fadd_rtn_errors.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.raw.ptr.buffer.atomic.fadd_nortn.ll (+31-31)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.raw.ptr.buffer.atomic.fadd_rtn.ll (+31-31)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.raw.ptr.buffer.load.lds.gfx950.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.ptr.buffer.atomic.fadd_nortn.ll (+97-97)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.ptr.buffer.atomic.fadd_rtn.ll (+97-97)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.ptr.buffer.atomic.fmax.f64.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.ptr.buffer.atomic.fmin.f64.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.ptr.buffer.load.lds.gfx950.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.udot2.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.udot4.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.udot8.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/local-atomicrmw-fadd.ll (+650-650)
  • (modified) llvm/test/CodeGen/AMDGPU/local-atomicrmw-fmax.ll (+689-689)
  • (modified) llvm/test/CodeGen/AMDGPU/local-atomicrmw-fmin.ll (+689-689)
  • (modified) llvm/test/CodeGen/AMDGPU/local-atomicrmw-fsub.ll (+796-796)
  • (modified) llvm/test/CodeGen/AMDGPU/local-stack-alloc-add-references.gfx8.mir (+134-134)
  • (modified) llvm/test/CodeGen/AMDGPU/local-stack-alloc-add-references.gfx9.mir (+70-70)
  • (modified) llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-nontemporal-metadata.ll (+119-119)
  • (modified) llvm/test/CodeGen/AMDGPU/lshl-add-u64.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/madak.ll (+144-144)
  • (renamed) llvm/test/CodeGen/AMDGPU/mai-hazards-gfx942.mir (+68-68)
  • (modified) llvm/test/CodeGen/AMDGPU/materialize-frame-index-sgpr.gfx10.ll (+235-235)
  • (modified) llvm/test/CodeGen/AMDGPU/materialize-frame-index-sgpr.ll (+264-264)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-fence-mmra-global.ll (+241-241)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-fence-mmra-local.ll (+158-158)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-fence.ll (+341-341)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-agent.ll (+2932-2932)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-nontemporal.ll (+149-149)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-singlethread.ll (+2478-2478)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-system.ll (+2932-2932)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-wavefront.ll (+2445-2445)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-workgroup.ll (+2574-2574)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-global-agent.ll (+2787-2787)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-global-nontemporal.ll (+124-124)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-global-singlethread.ll (+2390-2390)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-global-system.ll (+2647-2647)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-global-wavefront.ll (+2390-2390)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-global-workgroup.ll (+2577-2577)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll (+2348-2348)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-local-nontemporal.ll (+135-135)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-local-singlethread.ll (+2238-2238)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-local-system.ll (+2348-2348)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-local-wavefront.ll (+2238-2238)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-local-workgroup.ll (+2348-2348)
  • (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-private-nontemporal.ll (+124-124)
  • (modified) llvm/test/CodeGen/AMDGPU/mfma-cd-select.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/mfma-loop.ll (+19-19)
  • (modified) llvm/test/CodeGen/AMDGPU/mfma-no-register-aliasing.ll (+1-1)
  • (renamed) llvm/test/CodeGen/AMDGPU/mfma-vgpr-cd-select-gfx942.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/mfma-vgpr-cd-select.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/neighboring-mfma-padding.mir (+233-233)
  • (modified) llvm/test/CodeGen/AMDGPU/packed-fp32.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/peephole-fold-imm.mir (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/preload-implicit-kernargs-IR-lowering.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/preload-implicit-kernargs.ll (+333-333)
  • (modified) llvm/test/CodeGen/AMDGPU/preload-kernargs-IR-lowering.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/preload-kernargs.ll (+564-564)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2bf16.v2bf16.ll (+500-500)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2bf16.v3bf16.ll (+1103-1103)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2bf16.v4bf16.ll (+1937-1937)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2bf16.v8bf16.ll (+7427-7427)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2f16.v2f16.ll (+500-500)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2f16.v3f16.ll (+1103-1103)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2f16.v4f16.ll (+1937-1937)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2f16.v8f16.ll (+7427-7427)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2f32.v2f32.ll (+454-454)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2f32.v3f32.ll (+1012-1012)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2f32.v4f32.ll (+1788-1788)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2f32.v8f32.ll (+6783-6783)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i16.v2i16.ll (+493-493)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i16.v3i16.ll (+1092-1092)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i16.v4i16.ll (+1899-1899)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i16.v8i16.ll (+7255-7255)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i32.v2i32.ll (+454-454)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i32.v3i32.ll (+1012-1012)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i32.v4i32.ll (+1788-1788)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i32.v8i32.ll (+6783-6783)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i64.v2i64.ll (+486-486)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i64.v3i64.ll (+1076-1076)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i64.v4i64.ll (+1775-1775)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2i64.v8i64.ll (+7513-7513)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2p0.v2p0.ll (+486-486)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2p0.v3p0.ll (+1076-1076)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2p0.v4p0.ll (+1775-1775)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2p3.v2p3.ll (+454-454)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2p3.v3p3.ll (+1012-1012)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2p3.v4p3.ll (+1788-1788)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v2p3.v8p3.ll (+6783-6783)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3bf16.v2bf16.ll (+1016-1016)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3bf16.v3bf16.ll (+2252-2252)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3bf16.v4bf16.ll (+4112-4112)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3f16.v2f16.ll (+1016-1016)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3f16.v3f16.ll (+2252-2252)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3f16.v4f16.ll (+4112-4112)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3f32.v2f32.ll (+990-990)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3f32.v3f32.ll (+2112-2112)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3f32.v4f32.ll (+3777-3777)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3i16.v2i16.ll (+987-987)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3i16.v3i16.ll (+2212-2212)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3i16.v4i16.ll (+4011-4011)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3i32.v2i32.ll (+990-990)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3i32.v3i32.ll (+2112-2112)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3i32.v4i32.ll (+3777-3777)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3i64.v2i64.ll (+1092-1092)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3i64.v3i64.ll (+2289-2289)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3i64.v4i64.ll (+4119-4119)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3p0.v2p0.ll (+1092-1092)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3p0.v3p0.ll (+2289-2289)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3p0.v4p0.ll (+4119-4119)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3p3.v2p3.ll (+990-990)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3p3.v3p3.ll (+2112-2112)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v3p3.v4p3.ll (+3777-3777)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4bf16.v2bf16.ll (+1757-1757)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4bf16.v3bf16.ll (+3745-3745)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4bf16.v4bf16.ll (+6649-6649)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4f16.v2f16.ll (+1757-1757)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4f16.v3f16.ll (+3745-3745)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4f16.v4f16.ll (+6649-6649)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4f32.v2f32.ll (+1279-1279)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4f32.v3f32.ll (+3531-3531)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4f32.v4f32.ll (+5984-5984)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4i16.v2i16.ll (+1621-1621)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4i16.v3i16.ll (+3640-3640)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4i16.v4i16.ll (+6339-6339)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4i32.v2i32.ll (+1281-1281)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4i32.v3i32.ll (+3531-3531)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4i32.v4i32.ll (+5984-5984)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4i64.v2i64.ll (+1526-1526)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4i64.v3i64.ll (+4091-4091)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4i64.v4i64.ll (+6967-6967)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4p0.v2p0.ll (+1526-1526)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4p0.v3p0.ll (+4091-4091)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4p0.v4p0.ll (+6967-6967)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4p3.v2p3.ll (+1281-1281)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4p3.v3p3.ll (+3531-3531)
  • (modified) llvm/test/CodeGen/AMDGPU/shufflevector.v4p3.v4p3.ll (+5984-5984)
  • (modified) llvm/test/CodeGen/AMDGPU/smfmac_no_agprs.ll (+24-24)
  • (modified) llvm/test/CodeGen/AMDGPU/uniform-select.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/unsupported-image-sample.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/v_mov_b64_expansion.mir (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll (+778-970)
  • (modified) llvm/test/CodeGen/AMDGPU/verifier-sdwa-cvt.mir (+2-2)
  • (modified) llvm/test/MC/AMDGPU/amdhsa-kd-kernarg-preload.s (+2-2)
  • (modified) llvm/test/MC/AMDGPU/extrasgprs_mcexpr.s (+6-6)
  • (renamed) llvm/test/MC/AMDGPU/flat-scratch-gfx942.s (+354-354)
  • (removed) llvm/test/MC/AMDGPU/gfx940_err.s (-127)
  • (renamed) llvm/test/MC/AMDGPU/gfx942_asm_features.s (+313-313)
  • (added) llvm/test/MC/AMDGPU/gfx942_err.s (+127)
  • (renamed) llvm/test/MC/AMDGPU/gfx942_err_pos.s (+1-1)
  • (renamed) llvm/test/MC/AMDGPU/gfx942_unsupported.s (+1-1)
  • (modified) llvm/test/MC/AMDGPU/gfx950_asm_features.s (+1-1)
  • (modified) llvm/test/MC/AMDGPU/gfx950_asm_read_tr.s (+11-11)
  • (modified) llvm/test/MC/AMDGPU/gfx950_asm_vop1.s (+33-33)
  • (modified) llvm/test/MC/AMDGPU/gfx950_asm_vop3.s (+25-25)
  • (renamed) llvm/test/MC/AMDGPU/mai-err-gfx942.s (+16-16)
  • (renamed) llvm/test/MC/AMDGPU/mai-gfx942.s (+192-192)
  • (modified) llvm/test/MC/AMDGPU/mai-gfx950.s (+1-1)
  • (renamed) llvm/test/MC/AMDGPU/mimg-err-gfx942.s (+27-27)
  • (modified) llvm/test/MC/AMDGPU/mubuf-gfx950.s (+1-1)
  • (modified) llvm/test/MC/AMDGPU/writelane_m0.s (+1-1)
  • (modified) llvm/test/MC/AMDGPU/xdl-insts-gfx908.s (+1-1)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx908-xdl-insts.txt (+1-1)
  • (renamed) llvm/test/MC/Disassembler/AMDGPU/gfx942_features.txt (+183-183)
  • (renamed) llvm/test/MC/Disassembler/AMDGPU/gfx942_flat.txt (+353-353)
  • (renamed) llvm/test/MC/Disassembler/AMDGPU/gfx942_mai.txt (+200-200)
  • (modified) llvm/test/MachineVerifier/AMDGPU/writelane_m0.mir (+1-1)
  • (modified) llvm/test/Object/AMDGPU/elf-header-flags-mach.yaml (-14)
  • (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-f32-agent.ll (+354-354)
  • (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-f32-system.ll (+274-274)
  • (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-f64-agent.ll (+58-58)
  • (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-f64-system.ll (+46-46)
  • (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i16.ll (+2-2)
  • (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i32-agent.ll (+3-3)
  • (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i32-system.ll (+3-3)
  • (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i64-agent.ll (+3-3)
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll
index 424388a30e99b41..d1a303b41deefe5 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
 ; RUN: llc -global-isel -mtriple=amdgcn-amd-amdpal -mcpu=gfx1200 < %s | FileCheck -check-prefix=GFX12 %s
-; RUN: llc -global-isel -mtriple=amdgcn-amd-amdpal -mcpu=gfx940 < %s | FileCheck -check-prefix=GFX940 %s
+; RUN: llc -global-isel -mtriple=amdgcn-amd-amdpal -mcpu=gfx942 < %s | FileCheck -check-prefix=GFX942 %s
 ; RUN: llc -global-isel -mtriple=amdgcn-amd-amdpal -mcpu=gfx1100 < %s | FileCheck -check-prefix=GFX11 %s
 ; RUN: llc -global-isel -mtriple=amdgcn-amd-amdpal -mcpu=gfx1010 < %s | FileCheck -check-prefix=GFX10 %s
 ; RUN: llc -global-isel -mtriple=amdgcn-amd-amdpal -mcpu=gfx90a < %s | FileCheck -check-prefix=GFX90A %s
@@ -24,12 +24,12 @@ define float @local_atomic_fmax_ret_f32(ptr addrspace(3) %ptr, float %val) {
 ; GFX12-NEXT:    global_inv scope:SCOPE_SE
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: local_atomic_fmax_ret_f32:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    ds_max_rtn_f32 v0, v0, v1
-; GFX940-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: local_atomic_fmax_ret_f32:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    ds_max_rtn_f32 v0, v0, v1
+; GFX942-NEXT:    s_waitcnt lgkmcnt(0)
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: local_atomic_fmax_ret_f32:
 ; GFX11:       ; %bb.0:
@@ -96,12 +96,12 @@ define void @local_atomic_fmax_noret_f32(ptr addrspace(3) %ptr, float %val) {
 ; GFX12-NEXT:    global_inv scope:SCOPE_SE
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: local_atomic_fmax_noret_f32:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    ds_max_f32 v0, v1
-; GFX940-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: local_atomic_fmax_noret_f32:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    ds_max_f32 v0, v1
+; GFX942-NEXT:    s_waitcnt lgkmcnt(0)
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: local_atomic_fmax_noret_f32:
 ; GFX11:       ; %bb.0:
@@ -168,14 +168,14 @@ define double @local_atomic_fmax_ret_f64(ptr addrspace(3) %ptr, double %val) {
 ; GFX12-NEXT:    global_inv scope:SCOPE_SE
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: local_atomic_fmax_ret_f64:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    v_mov_b32_e32 v4, v1
-; GFX940-NEXT:    v_mov_b32_e32 v5, v2
-; GFX940-NEXT:    ds_max_rtn_f64 v[0:1], v0, v[4:5]
-; GFX940-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: local_atomic_fmax_ret_f64:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    v_mov_b32_e32 v4, v1
+; GFX942-NEXT:    v_mov_b32_e32 v5, v2
+; GFX942-NEXT:    ds_max_rtn_f64 v[0:1], v0, v[4:5]
+; GFX942-NEXT:    s_waitcnt lgkmcnt(0)
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: local_atomic_fmax_ret_f64:
 ; GFX11:       ; %bb.0:
@@ -244,14 +244,14 @@ define void @local_atomic_fmax_noret_f64(ptr addrspace(3) %ptr, double %val) {
 ; GFX12-NEXT:    global_inv scope:SCOPE_SE
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: local_atomic_fmax_noret_f64:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    v_mov_b32_e32 v4, v1
-; GFX940-NEXT:    v_mov_b32_e32 v5, v2
-; GFX940-NEXT:    ds_max_f64 v0, v[4:5]
-; GFX940-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: local_atomic_fmax_noret_f64:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    v_mov_b32_e32 v4, v1
+; GFX942-NEXT:    v_mov_b32_e32 v5, v2
+; GFX942-NEXT:    ds_max_f64 v0, v[4:5]
+; GFX942-NEXT:    s_waitcnt lgkmcnt(0)
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: local_atomic_fmax_noret_f64:
 ; GFX11:       ; %bb.0:
@@ -320,30 +320,30 @@ define float @global_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory(pt
 ; GFX12-NEXT:    global_inv scope:SCOPE_DEV
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: global_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    global_load_dword v3, v[0:1], off
-; GFX940-NEXT:    s_mov_b64 s[0:1], 0
-; GFX940-NEXT:    v_max_f32_e32 v2, v2, v2
-; GFX940-NEXT:  .LBB4_1: ; %atomicrmw.start
-; GFX940-NEXT:    ; =>This Inner Loop Header: Depth=1
-; GFX940-NEXT:    s_waitcnt vmcnt(0)
-; GFX940-NEXT:    v_mov_b32_e32 v5, v3
-; GFX940-NEXT:    v_max_f32_e32 v3, v5, v5
-; GFX940-NEXT:    v_max_f32_e32 v4, v3, v2
-; GFX940-NEXT:    buffer_wbl2 sc1
-; GFX940-NEXT:    global_atomic_cmpswap v3, v[0:1], v[4:5], off sc0
-; GFX940-NEXT:    s_waitcnt vmcnt(0)
-; GFX940-NEXT:    buffer_inv sc1
-; GFX940-NEXT:    v_cmp_eq_u32_e32 vcc, v3, v5
-; GFX940-NEXT:    s_or_b64 s[0:1], vcc, s[0:1]
-; GFX940-NEXT:    s_andn2_b64 exec, exec, s[0:1]
-; GFX940-NEXT:    s_cbranch_execnz .LBB4_1
-; GFX940-NEXT:  ; %bb.2: ; %atomicrmw.end
-; GFX940-NEXT:    s_or_b64 exec, exec, s[0:1]
-; GFX940-NEXT:    v_mov_b32_e32 v0, v3
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: global_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    global_load_dword v3, v[0:1], off
+; GFX942-NEXT:    s_mov_b64 s[0:1], 0
+; GFX942-NEXT:    v_max_f32_e32 v2, v2, v2
+; GFX942-NEXT:  .LBB4_1: ; %atomicrmw.start
+; GFX942-NEXT:    ; =>This Inner Loop Header: Depth=1
+; GFX942-NEXT:    s_waitcnt vmcnt(0)
+; GFX942-NEXT:    v_mov_b32_e32 v5, v3
+; GFX942-NEXT:    v_max_f32_e32 v3, v5, v5
+; GFX942-NEXT:    v_max_f32_e32 v4, v3, v2
+; GFX942-NEXT:    buffer_wbl2 sc1
+; GFX942-NEXT:    global_atomic_cmpswap v3, v[0:1], v[4:5], off sc0
+; GFX942-NEXT:    s_waitcnt vmcnt(0)
+; GFX942-NEXT:    buffer_inv sc1
+; GFX942-NEXT:    v_cmp_eq_u32_e32 vcc, v3, v5
+; GFX942-NEXT:    s_or_b64 s[0:1], vcc, s[0:1]
+; GFX942-NEXT:    s_andn2_b64 exec, exec, s[0:1]
+; GFX942-NEXT:    s_cbranch_execnz .LBB4_1
+; GFX942-NEXT:  ; %bb.2: ; %atomicrmw.end
+; GFX942-NEXT:    s_or_b64 exec, exec, s[0:1]
+; GFX942-NEXT:    v_mov_b32_e32 v0, v3
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: global_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
 ; GFX11:       ; %bb.0:
@@ -466,29 +466,29 @@ define void @global_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory(p
 ; GFX12-NEXT:    global_inv scope:SCOPE_DEV
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: global_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    global_load_dword v3, v[0:1], off
-; GFX940-NEXT:    s_mov_b64 s[0:1], 0
-; GFX940-NEXT:    v_max_f32_e32 v4, v2, v2
-; GFX940-NEXT:  .LBB5_1: ; %atomicrmw.start
-; GFX940-NEXT:    ; =>This Inner Loop Header: Depth=1
-; GFX940-NEXT:    s_waitcnt vmcnt(0)
-; GFX940-NEXT:    v_max_f32_e32 v2, v3, v3
-; GFX940-NEXT:    v_max_f32_e32 v2, v2, v4
-; GFX940-NEXT:    buffer_wbl2 sc1
-; GFX940-NEXT:    global_atomic_cmpswap v2, v[0:1], v[2:3], off sc0
-; GFX940-NEXT:    s_waitcnt vmcnt(0)
-; GFX940-NEXT:    buffer_inv sc1
-; GFX940-NEXT:    v_cmp_eq_u32_e32 vcc, v2, v3
-; GFX940-NEXT:    s_or_b64 s[0:1], vcc, s[0:1]
-; GFX940-NEXT:    v_mov_b32_e32 v3, v2
-; GFX940-NEXT:    s_andn2_b64 exec, exec, s[0:1]
-; GFX940-NEXT:    s_cbranch_execnz .LBB5_1
-; GFX940-NEXT:  ; %bb.2: ; %atomicrmw.end
-; GFX940-NEXT:    s_or_b64 exec, exec, s[0:1]
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: global_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    global_load_dword v3, v[0:1], off
+; GFX942-NEXT:    s_mov_b64 s[0:1], 0
+; GFX942-NEXT:    v_max_f32_e32 v4, v2, v2
+; GFX942-NEXT:  .LBB5_1: ; %atomicrmw.start
+; GFX942-NEXT:    ; =>This Inner Loop Header: Depth=1
+; GFX942-NEXT:    s_waitcnt vmcnt(0)
+; GFX942-NEXT:    v_max_f32_e32 v2, v3, v3
+; GFX942-NEXT:    v_max_f32_e32 v2, v2, v4
+; GFX942-NEXT:    buffer_wbl2 sc1
+; GFX942-NEXT:    global_atomic_cmpswap v2, v[0:1], v[2:3], off sc0
+; GFX942-NEXT:    s_waitcnt vmcnt(0)
+; GFX942-NEXT:    buffer_inv sc1
+; GFX942-NEXT:    v_cmp_eq_u32_e32 vcc, v2, v3
+; GFX942-NEXT:    s_or_b64 s[0:1], vcc, s[0:1]
+; GFX942-NEXT:    v_mov_b32_e32 v3, v2
+; GFX942-NEXT:    s_andn2_b64 exec, exec, s[0:1]
+; GFX942-NEXT:    s_cbranch_execnz .LBB5_1
+; GFX942-NEXT:  ; %bb.2: ; %atomicrmw.end
+; GFX942-NEXT:    s_or_b64 exec, exec, s[0:1]
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: global_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
 ; GFX11:       ; %bb.0:
@@ -626,14 +626,14 @@ define double @global_agent_atomic_fmax_ret_f64__amdgpu_no_fine_grained_memory(p
 ; GFX12-NEXT:    v_dual_mov_b32 v0, v4 :: v_dual_mov_b32 v1, v5
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: global_agent_atomic_fmax_ret_f64__amdgpu_no_fine_grained_memory:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    buffer_wbl2 sc1
-; GFX940-NEXT:    global_atomic_max_f64 v[0:1], v[0:1], v[2:3], off sc0
-; GFX940-NEXT:    s_waitcnt vmcnt(0)
-; GFX940-NEXT:    buffer_inv sc1
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: global_agent_atomic_fmax_ret_f64__amdgpu_no_fine_grained_memory:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    buffer_wbl2 sc1
+; GFX942-NEXT:    global_atomic_max_f64 v[0:1], v[0:1], v[2:3], off sc0
+; GFX942-NEXT:    s_waitcnt vmcnt(0)
+; GFX942-NEXT:    buffer_inv sc1
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: global_agent_atomic_fmax_ret_f64__amdgpu_no_fine_grained_memory:
 ; GFX11:       ; %bb.0:
@@ -781,14 +781,14 @@ define void @global_agent_atomic_fmax_noret_f64__amdgpu_no_fine_grained_memory(p
 ; GFX12-NEXT:    s_or_b32 exec_lo, exec_lo, s0
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: global_agent_atomic_fmax_noret_f64__amdgpu_no_fine_grained_memory:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    buffer_wbl2 sc1
-; GFX940-NEXT:    global_atomic_max_f64 v[0:1], v[2:3], off
-; GFX940-NEXT:    s_waitcnt vmcnt(0)
-; GFX940-NEXT:    buffer_inv sc1
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: global_agent_atomic_fmax_noret_f64__amdgpu_no_fine_grained_memory:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    buffer_wbl2 sc1
+; GFX942-NEXT:    global_atomic_max_f64 v[0:1], v[2:3], off
+; GFX942-NEXT:    s_waitcnt vmcnt(0)
+; GFX942-NEXT:    buffer_inv sc1
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: global_agent_atomic_fmax_noret_f64__amdgpu_no_fine_grained_memory:
 ; GFX11:       ; %bb.0:
@@ -911,30 +911,30 @@ define float @flat_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory(ptr
 ; GFX12-NEXT:    global_inv scope:SCOPE_DEV
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: flat_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    flat_load_dword v3, v[0:1]
-; GFX940-NEXT:    s_mov_b64 s[0:1], 0
-; GFX940-NEXT:    v_max_f32_e32 v2, v2, v2
-; GFX940-NEXT:  .LBB8_1: ; %atomicrmw.start
-; GFX940-NEXT:    ; =>This Inner Loop Header: Depth=1
-; GFX940-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    v_mov_b32_e32 v5, v3
-; GFX940-NEXT:    v_max_f32_e32 v3, v5, v5
-; GFX940-NEXT:    v_max_f32_e32 v4, v3, v2
-; GFX940-NEXT:    buffer_wbl2 sc1
-; GFX940-NEXT:    flat_atomic_cmpswap v3, v[0:1], v[4:5] sc0
-; GFX940-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    buffer_inv sc1
-; GFX940-NEXT:    v_cmp_eq_u32_e32 vcc, v3, v5
-; GFX940-NEXT:    s_or_b64 s[0:1], vcc, s[0:1]
-; GFX940-NEXT:    s_andn2_b64 exec, exec, s[0:1]
-; GFX940-NEXT:    s_cbranch_execnz .LBB8_1
-; GFX940-NEXT:  ; %bb.2: ; %atomicrmw.end
-; GFX940-NEXT:    s_or_b64 exec, exec, s[0:1]
-; GFX940-NEXT:    v_mov_b32_e32 v0, v3
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: flat_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    flat_load_dword v3, v[0:1]
+; GFX942-NEXT:    s_mov_b64 s[0:1], 0
+; GFX942-NEXT:    v_max_f32_e32 v2, v2, v2
+; GFX942-NEXT:  .LBB8_1: ; %atomicrmw.start
+; GFX942-NEXT:    ; =>This Inner Loop Header: Depth=1
+; GFX942-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    v_mov_b32_e32 v5, v3
+; GFX942-NEXT:    v_max_f32_e32 v3, v5, v5
+; GFX942-NEXT:    v_max_f32_e32 v4, v3, v2
+; GFX942-NEXT:    buffer_wbl2 sc1
+; GFX942-NEXT:    flat_atomic_cmpswap v3, v[0:1], v[4:5] sc0
+; GFX942-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    buffer_inv sc1
+; GFX942-NEXT:    v_cmp_eq_u32_e32 vcc, v3, v5
+; GFX942-NEXT:    s_or_b64 s[0:1], vcc, s[0:1]
+; GFX942-NEXT:    s_andn2_b64 exec, exec, s[0:1]
+; GFX942-NEXT:    s_cbranch_execnz .LBB8_1
+; GFX942-NEXT:  ; %bb.2: ; %atomicrmw.end
+; GFX942-NEXT:    s_or_b64 exec, exec, s[0:1]
+; GFX942-NEXT:    v_mov_b32_e32 v0, v3
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: flat_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
 ; GFX11:       ; %bb.0:
@@ -1053,29 +1053,29 @@ define void @flat_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory(ptr
 ; GFX12-NEXT:    global_inv scope:SCOPE_DEV
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: flat_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    flat_load_dword v3, v[0:1]
-; GFX940-NEXT:    s_mov_b64 s[0:1], 0
-; GFX940-NEXT:    v_max_f32_e32 v4, v2, v2
-; GFX940-NEXT:  .LBB9_1: ; %atomicrmw.start
-; GFX940-NEXT:    ; =>This Inner Loop Header: Depth=1
-; GFX940-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    v_max_f32_e32 v2, v3, v3
-; GFX940-NEXT:    v_max_f32_e32 v2, v2, v4
-; GFX940-NEXT:    buffer_wbl2 sc1
-; GFX940-NEXT:    flat_atomic_cmpswap v2, v[0:1], v[2:3] sc0
-; GFX940-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    buffer_inv sc1
-; GFX940-NEXT:    v_cmp_eq_u32_e32 vcc, v2, v3
-; GFX940-NEXT:    s_or_b64 s[0:1], vcc, s[0:1]
-; GFX940-NEXT:    v_mov_b32_e32 v3, v2
-; GFX940-NEXT:    s_andn2_b64 exec, exec, s[0:1]
-; GFX940-NEXT:    s_cbranch_execnz .LBB9_1
-; GFX940-NEXT:  ; %bb.2: ; %atomicrmw.end
-; GFX940-NEXT:    s_or_b64 exec, exec, s[0:1]
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: flat_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    flat_load_dword v3, v[0:1]
+; GFX942-NEXT:    s_mov_b64 s[0:1], 0
+; GFX942-NEXT:    v_max_f32_e32 v4, v2, v2
+; GFX942-NEXT:  .LBB9_1: ; %atomicrmw.start
+; GFX942-NEXT:    ; =>This Inner Loop Header: Depth=1
+; GFX942-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    v_max_f32_e32 v2, v3, v3
+; GFX942-NEXT:    v_max_f32_e32 v2, v2, v4
+; GFX942-NEXT:    buffer_wbl2 sc1
+; GFX942-NEXT:    flat_atomic_cmpswap v2, v[0:1], v[2:3] sc0
+; GFX942-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    buffer_inv sc1
+; GFX942-NEXT:    v_cmp_eq_u32_e32 vcc, v2, v3
+; GFX942-NEXT:    s_or_b64 s[0:1], vcc, s[0:1]
+; GFX942-NEXT:    v_mov_b32_e32 v3, v2
+; GFX942-NEXT:    s_andn2_b64 exec, exec, s[0:1]
+; GFX942-NEXT:    s_cbranch_execnz .LBB9_1
+; GFX942-NEXT:  ; %bb.2: ; %atomicrmw.end
+; GFX942-NEXT:    s_or_b64 exec, exec, s[0:1]
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: flat_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
 ; GFX11:       ; %bb.0:
@@ -1212,14 +1212,14 @@ define double @flat_agent_atomic_fmax_ret_f64__amdgpu_no_fine_grained_memory(ptr
 ; GFX12-NEXT:    v_dual_mov_b32 v0, v4 :: v_dual_mov_b32 v1, v5
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: flat_agent_atomic_fmax_ret_f64__amdgpu_no_fine_grained_memory:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    buffer_wbl2 sc1
-; GFX940-NEXT:    flat_atomic_max_f64 v[0:1], v[0:1], v[2:3] sc0
-; GFX940-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    buffer_inv sc1
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: flat_agent_atomic_fmax_ret_f64__amdgpu_no_fine_grained_memory:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    buffer_wbl2 sc1
+; GFX942-NEXT:    flat_atomic_max_f64 v[0:1], v[0:1], v[2:3] sc0
+; GFX942-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    buffer_inv sc1
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: flat_agent_atomic_fmax_ret_f64__amdgpu_no_fine_grained_memory:
 ; GFX11:       ; %bb.0:
@@ -1365,14 +1365,14 @@ define void @flat_agent_atomic_fmax_noret_f64__amdgpu_no_fine_grained_memory(ptr
 ; GFX12-NEXT:    s_or_b32 exec_lo, exec_lo, s0
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: flat_agent_atomic_fmax_noret_f64__amdgpu_no_fine_grained_memory:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    buffer_wbl2 sc1
-; GFX940-NEXT:    flat_atomic_max_f64 v[0:1], v[2:3]
-; GFX940-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    buffer_inv sc1
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: flat_agent_atomic_fmax_noret_f64__amdgpu_no_fine_grained_memory:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    buffer_wbl2 sc1
+; GFX942-NEXT:    flat_atomic_max_f64 v[0:1], v[2:3]
+; GFX942-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    buffer_inv sc1
+; GFX942-NEXT:    s_setpc_b64 s[30:31]
 ;
 ; GFX11-LABEL: flat_agent_atomic_fmax_noret_f64__amdgpu_no_fine_grained_memory:
 ; GFX11:       ; %bb.0:
@@ -1497,32 +1497,32 @@ define float @buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_m
 ; GFX12-NEXT:    global_inv scope:SCOPE_DEV
 ; GFX12-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX940-LABEL: buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    v_mov_b32_e32 v2, s16
-; GFX940-NEXT:    v_mov_b32_e32 v1, v0
-; GFX940-NEXT:    buffer_load_dword v0, v2, s[0:3], 0 offen
-; GFX940-NEXT:    s_mov_b64 s[4:5], 0
-; GFX940-NEXT:    v_max_f32_e32 v3, v1, v1
-; GFX940-NEXT:  .LBB12_1: ; %atomicrmw.start
-; GFX940-NEXT:    ; =>This Inner Loop Header: Depth=1
-; GFX940-NEXT:    s_waitcnt vmcnt(0)
-; GFX940-NEXT:    v_mov_b32_e32 v5, v0
-; GFX940-NEXT:    v_max_f32_e32 v0, v5, v5
-; GFX940-NEXT:    v_max_f32_e32 v4, v0, v3
-; GFX940-NEXT:    v_mov_b64_e32 v[0:1], v[4:5]
-; GFX940-NEXT:    buffer_wbl2 sc1
-; GFX940-NEXT:    buffer_atomic_cmpswap v[0:1], v2, s[0:3], 0 offen sc0
-; GFX940-NEXT:    s_waitcnt vmcnt(0)
-; GFX940-NEXT:    buffer_inv sc1
-; GFX940-NEXT:    v_cmp_eq_u32_e32 vcc, v0, v5
-; GFX940-NEXT:    s_or_b64 s[4:5], vcc, s[4:5]
-; GFX940-NEXT:    s_andn2_b64 exec, exec, s[4:5]
-; GFX940-NEXT:    s_cbranch_execnz .LBB12_1
-; GFX940-NEXT:  ; %bb.2: ; %atomicrmw.end
-; GFX940-NEXT:    s_or_b64 exec, exec, s[4:5]
-; GFX940-NEXT:    s_setpc_b64 s[30:31]
+; GFX942-LABEL: buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
+; GFX942:       ; %bb.0:
+; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:    v_mov_b32_e32 v2, s16
+; GFX942-NEXT:    v_mov_b32_e32 v1, v0
+; GFX942-NEXT:    buffer_load_dword v0, v2, s[0:3], 0 offen
+; GFX942-NEXT:    s_mov_b64 s[4:5], 0
+; GFX942-NEXT:    v_max_f32_e32 v3, v1, v1
+; GFX942-NEXT:  .LBB12_1: ; %...
[truncated]

@ritter-x2a ritter-x2a marked this pull request as ready for review February 4, 2025 16:42
Contributor

@shiltian shiltian left a comment

I didn't check every single file because it took GitHub forever to load them. It looks fine after looking into 10+ files so I'll stamp green on it.

Collaborator

@jh7370 jh7370 left a comment

I've got no stake here, but it seems logical to me for dumping tools to still be able to identify obsolete flavours of AMDGPU, so that if somebody needs to inspect one of these objects, they'd be able to see that it is the obsolete version.

Specifically, I'm thinking the changes in the Object/AMDGPU/elf-header-flags-mach.yaml and tools/llvm-readobj/ELF/AMDGPU/elf-headers.test tests should be dropped. I'd say the same for the llvm-objdump test, but I expect that's less viable, since it requires llc support to generate the input (could it be switched to yaml2obj instead?).

@ritter-x2a
Member Author

> I've got no stake here, but it seems logical to me for dumping tools to still be able to identify obsolete flavours of AMDGPU, so that if somebody needs to inspect one of these objects, they'd be able to see that it is the obsolete version.
>
> Specifically, I'm thinking the changes in the Object/AMDGPU/elf-header-flags-mach.yaml and tools/llvm-readobj/ELF/AMDGPU/elf-headers.test tests should be dropped. I'd say the same for the llvm-objdump test, but I expect that's less viable, since it requires llc support to generate the input (could it be switched to yaml2obj instead?).

Makes sense to me, I'll bring it up for an internal discussion. Thanks for pointing that out, @jh7370!

@ritter-x2a
Member Author

Per further discussion, we found it best to also drop these targets from the elf tooling.

@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/rm-gfx940-gfx941-llvm-test branch from ba30315 to d50aa30 Compare February 11, 2025 09:26
gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR uses gfx942 instead of gfx940 and gfx941 in the test RUN-lines
(unless there is already a RUN-line for gfx942).
Mainly remove sc0 sc1 flags from memory writes since gfx942 does not
force them, in contrast to gfx940 and gfx941.
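
The RUN-line substitution described in the commit message above can be sketched roughly as follows. This is only an illustration in Python, not the script actually used for this PR: the function name `retarget_run_lines` and the regular expressions are hypothetical, and dropping the old RUN line when a gfx942 RUN line already exists is an assumption about what "unless there is already a RUN-line for gfx942" means in practice. Continuation lines and check-prefix names are not handled.

```python
import re

# Hypothetical helper, not part of LLVM: retarget gfx940/gfx941 RUN lines to gfx942.
RUN_LINE = re.compile(r"^\s*(;|#|//)\s*RUN:")   # lit RUN directive in a comment
OLD_MCPU = re.compile(r"-mcpu=gfx94[01]\b")      # the two targets being removed
NEW_MCPU = "-mcpu=gfx942"


def retarget_run_lines(text: str) -> str:
    """Return the test file text with gfx940/gfx941 RUN lines retargeted."""
    lines = text.splitlines(keepends=True)
    # Does the test already have a RUN line for gfx942?
    has_gfx942 = any(RUN_LINE.match(l) and NEW_MCPU in l for l in lines)

    out = []
    for line in lines:
        if RUN_LINE.match(line) and OLD_MCPU.search(line):
            if has_gfx942:
                # Assumption: an existing gfx942 RUN line makes the old one redundant.
                continue
            line = OLD_MCPU.sub(NEW_MCPU, line)
        # Check lines (e.g. "; GFX940-NEXT: ...") are left alone on purpose.
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    sample = (
        "; RUN: llc -mtriple=amdgcn -mcpu=gfx940 < %s | FileCheck -check-prefix=GFX940 %s\n"
        "; GFX940: s_endpgm\n"
    )
    print(retarget_run_lines(sample), end="")
```

After such a rewrite the check prefixes and expected output would still need to be refreshed, for example by regenerating autogenerated tests with llvm/utils/update_llc_test_checks.py. That is where differences like the dropped sc0 sc1 flags on stores would show up.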
@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/rm-gfx940-gfx941-llvm-test branch from d50aa30 to 9288ce5 Compare February 13, 2025 09:31
Member Author

ritter-x2a commented Feb 13, 2025

Merge activity

  • Feb 13, 9:15 AM EST: A user started a stack merge that includes this pull request via Graphite.
  • Feb 13, 9:17 AM EST: A user merged this pull request with Graphite.

@ritter-x2a ritter-x2a merged commit a33a84e into main Feb 13, 2025
8 checks passed
@ritter-x2a ritter-x2a deleted the users/ritter-x2a/rm-gfx940-gfx941-llvm-test branch February 13, 2025 14:17
joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Feb 14, 2025
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025