[AMDGPU] Wrong code depending on placement of a multiplication #61422

@jdoerfert

As shown below, I have created two versions that differ only in the placement of %i86.
Both positions are equally valid.
Note that the %i8Xb values are unused and will not make it into the DAG.

[Image: Screenshot 2023-03-14 at 4 13 56 PM]

The version with the sunk %i86 later runs into a trap that it should not hit.
See {running,trapping}_close.ll as well as the .s and .out files for details.
The backend is run at O1. I also left comments in the code about other versions that do or do not trap.
In particular, three versions are described close to the trap that may expose the same or a different issue.

The executable should be able to reproduce this on a gfx90a with a dynamic build of LLVM/OpenMP offloading.

LIBOMPTARGET_JIT_OPT_LEVEL=1 LIBOMPTARGET_JIT_REPLACEMENT_MODULE=trapping_close.ll LIBOMPTARGET_JIT_SKIP_OPT=1 ./check_spo_batched_reduction -n 1 -s 1 -w 2
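
To compare the two variants back to back, a small wrapper along these lines may help. It is only a sketch: it assumes that running_close.ll and trapping_close.ll from the attached archive and the check_spo_batched_reduction binary are in the current directory, and that running_close.ll is passed through the same environment variable as trapping_close.ll. The run_variant helper and the .log file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: run the reproducer once per replacement module and report the exit status.
# Assumption: both .ll files and the binary are in the current directory.

run_variant() {
  # $1: replacement module handed to the JIT (same settings as the command above)
  LIBOMPTARGET_JIT_OPT_LEVEL=1 \
  LIBOMPTARGET_JIT_REPLACEMENT_MODULE="$1" \
  LIBOMPTARGET_JIT_SKIP_OPT=1 \
  ./check_spo_batched_reduction -n 1 -s 1 -w 2
}

for module in running_close.ll trapping_close.ll; do
  # Keep the output of each run in its own log file (hypothetical names).
  if run_variant "$module" > "${module%.ll}.log" 2>&1; then
    echo "$module: exited normally"
  else
    echo "$module: failed with status $?"
  fi
done
```

If the issue reproduces, the trapping_close.ll run would be expected to fail while the running_close.ll run completes; the logs can then be compared against the .out files in the archive.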

Found while reducing #60937

repro_close.tar.gz
