[SYCL][CUDA] Fix generating permute bytes from register pair when the initial values are undefined. #12068

mmoadeli · 2023-12-04T20:07:09Z

When generating the permute bytes for the prmt instruction, an undefined initial value (-1) initialises the int32 that holds the mask to all 1s (0xFFFFFFFF). Because the selectors for the following bytes are then combined into the mask with an OR operation, they can no longer change any bits. As a result, the final mask stays at the constant -1, regardless of the actual mask values that follow the initial one.
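The failure mode described above can be illustrated with a small standalone sketch. This is not the LLVM code touched by this PR: the function names `packSelectorsBuggy` and `packSelectorsFixed`, the four-entry selector array, and the choice to clamp each selector to one byte are all assumptions made for illustration only.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model: each entry is a byte selector, or -1 for "undefined".
// The four selectors are packed into one 32-bit mask, one byte per entry.

// Buggy variant: the -1 is widened to 0xFFFFFFFF before it is OR-ed in,
// so an undefined first entry sets every bit and later ORs change nothing.
std::uint32_t packSelectorsBuggy(const std::array<int, 4>& sel) {
  std::uint32_t mask = 0;
  for (int i = 0; i < 4; ++i) {
    std::uint32_t wide = static_cast<std::uint32_t>(sel[i]);  // -1 -> 0xFFFFFFFF
    mask |= wide << (8 * i);
  }
  return mask;
}

// One possible fix (an assumption, not necessarily the upstream change):
// keep only the low byte of each selector so it cannot spill into its neighbours.
std::uint32_t packSelectorsFixed(const std::array<int, 4>& sel) {
  std::uint32_t mask = 0;
  for (int i = 0; i < 4; ++i) {
    std::uint32_t byte = static_cast<std::uint32_t>(sel[i]) & 0xFFu;
    mask |= byte << (8 * i);
  }
  return mask;
}
```

With the selectors `{-1, 1, 2, 3}`, the buggy variant returns `0xFFFFFFFF`, while the clamped variant returns `0x030201FF`, so only the undefined byte is all 1s and the remaining selectors survive.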


AlexeySachkov

Should we push this directly to the upstream llvm/llvm-project?

mmoadeli · 2023-12-05T09:09:04Z

> Should we push this directly to the upstream llvm/llvm-project?

I can push to llvm/llvm-project. It usually takes ages for them to act, though.

bader · 2023-12-05T15:18:24Z

> > Should we push this directly to the upstream llvm/llvm-project?
>
> I can push to llvm/llvm-project. It usually takes ages for them to act, though.

This is the way.

#11840 is resolved by upstreaming the fix in #12068 to llvm/llvm-project#74437.

npmiller · 2024-03-06T10:13:19Z

Closing this as it was submitted to llvm-project here:

[CodeGen] Fix generating permute bytes from register pair when the initial values are undefined llvm/llvm-project#74437

Fix generating permute bytes from register pair when the initial values are undefined(-1).

98c0050

mmoadeli requested review from a team as code owners December 4, 2023 20:07

mmoadeli requested review from bso-intel and AlexeySachkov and removed request for bso-intel December 4, 2023 20:07

mmoadeli temporarily deployed to WindowsCILock December 4, 2023 20:09 — with GitHub Actions Inactive

mmoadeli mentioned this pull request Dec 4, 2023

Vector conversion does not work correctly on CUDA #11840

Closed

mmoadeli linked an issue Dec 4, 2023 that may be closed by this pull request

Vector conversion does not work correctly on CUDA #11840

Closed

mmoadeli temporarily deployed to WindowsCILock December 4, 2023 20:47 — with GitHub Actions Inactive

AlexeySachkov reviewed Dec 5, 2023


jsji pushed a commit that referenced this pull request Jan 19, 2024

Remove XFAIL as fixed by aa23e49

69c77fe

#11840 is resolved by upstreaming the fix in #12068 to llvm/llvm-project#74437.

npmiller closed this Mar 6, 2024
