[AMDGPU] Wrong code depending on placement of a multiplication


As shown below, I have created two versions that differ only in the placement of `%i86`.
Both positions are equally valid.
Note that the `%i8Xb` values are unused and will not make it into the DAG.

<img width="1842" alt="Screenshot 2023-03-14 at 4 13 56 PM" src="https://user-images.githubusercontent.com/6248191/225163629-b729f350-111f-42dd-9162-b25f88d18e87.png">

The version with the sunk `%i86` later hits a trap that it should never reach.
See `{running,trapping}_close.ll` as well as the .s and .out files for details.
The backend is run at O1. I also left comments in the code about other variants that do or do not trap.
In particular, three variants are described close to the trap; they may expose the same issue or a different one.

The executable should reproduce this on a gfx90a with a dynamic build of LLVM/OpenMP offloading:

`LIBOMPTARGET_JIT_OPT_LEVEL=1 LIBOMPTARGET_JIT_REPLACEMENT_MODULE=trapping_close.ll LIBOMPTARGET_JIT_SKIP_OPT=1 ./check_spo_batched_reduction -n 1 -s 1 -w 2`

Found while reducing #60937

[repro_close.tar.gz](https://github.com/llvm/llvm-project/files/10974657/repro_close.tar.gz)


[AMDGPU] Wrong code depending on placement of a multiplication #61422