AmdNvidiaJIT/kernel_and_bundle.cpp failing on HIP AMD #14989

Closed
sarnex opened this issue Aug 7, 2024 · 5 comments
Labels
bug (Something isn't working), confirmed, hip (Issues related to execution on HIP backend)

Comments

@sarnex
Contributor

sarnex commented Aug 7, 2024

Describe the bug

FAIL: SYCL :: AmdNvidiaJIT/kernel_and_bundle.cpp (75 of 2129)
******************** TEST 'SYCL :: AmdNvidiaJIT/kernel_and_bundle.cpp' FAILED ********************
Exit Code: 1
Command Output (stdout):
--
# RUN: at line 3
/__w/llvm/llvm/toolchain/bin//clang++  -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa  /__w/llvm/llvm/llvm/sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp -fsycl-embed-ir -o /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.out
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa /__w/llvm/llvm/llvm/sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp -fsycl-embed-ir -o /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.out
# note: command had no output on stdout or stderr
# RUN: at line 4
env SYCL_JIT_AMDGCN_PTX_KERNELS=1 env SYCL_JIT_COMPILER_DEBUG="sycl-spec-const-materializer" env ONEAPI_DEVICE_SELECTOR=hip:gpu  /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.out &> /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt ; /__w/llvm/llvm/toolchain/bin/FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp --input-file /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt
# executed command: env SYCL_JIT_AMDGCN_PTX_KERNELS=1 env SYCL_JIT_COMPILER_DEBUG=sycl-spec-const-materializer env ONEAPI_DEVICE_SELECTOR=hip:gpu /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.out
# .---redirected output from '/__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt'
# | <HIP>[ERROR]: 
# | UR HIP ERROR:
# | 	Value:           209
# | 	Name:            hipErrorNoBinaryForGpu
# | 	Description:     no kernel image is available for execution on the device
# | 	Function:        buildProgram
# | 	Source Location: /__w/llvm/llvm/build/_deps/unified-runtime-src/source/adapters/hip/program.cpp:235
# | 
# | terminate called after throwing an instance of 'sycl::_V1::exception'
# |   what():  The program was built for 1 devices
# | Build program log for 'AMD Radeon RX 6700 XT':
# | ��v�
# `-----------------------------
# note: command had no output on stdout or stderr
# error: command failed with exit status: -6
# executed command: /__w/llvm/llvm/toolchain/bin/FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp --input-file /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt
# .---command stderr------------
# | /__w/llvm/llvm/llvm/sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp:28:11: error: CHECK: expected string not found in input
# | // CHECK: Working on function:
# |           ^
# | /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt:1:1: note: scanning from here
# | <HIP>[ERROR]: 
# | ^
# | /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt:9:24: note: possible intended match here
# | terminate called after throwing an instance of 'sycl::_V1::exception'
# |                        ^
# | 
# | Input file: /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt
# | Check file: /__w/llvm/llvm/llvm/sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: <HIP>[ERROR]:  
# | check:28'0     X~~~~~~~~~~~~~~ error: no match found
# |             2: UR HIP ERROR: 
# | check:28'0     ~~~~~~~~~~~~~~
# |             3:  Value: 209 
# | check:28'0     ~~~~~~~~~~~~
# |             4:  Name: hipErrorNoBinaryForGpu 
# | check:28'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |             5:  Description: no kernel image is available for execution on the device 
# | check:28'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |             6:  Function: buildProgram 
# | check:28'0     ~~~~~~~~~~~~~~~~~~~~~~~~
# |             7:  Source Location: /__w/llvm/llvm/build/_deps/unified-runtime-src/source/adapters/hip/program.cpp:235 
# | check:28'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |             8:  
# | check:28'0     ~
# |             9: terminate called after throwing an instance of 'sycl::_V1::exception' 
# | check:28'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | check:28'1                            ?                                               possible intended match
# |            10:  what(): The program was built for 1 devices 
# | check:28'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            11: Build program log for 'AMD Radeon RX 6700 XT': 
# | check:28'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            12: ��v� 
# | check:28'0     ~~~~~
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1
--
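
For triage, the fields of the `UR HIP ERROR` block in the log above can be extracted mechanically. The following is a minimal Python sketch, not part of the test suite or the Unified Runtime. The helper name `parse_ur_hip_error` is hypothetical, and the sketch assumes the block always consists of `Key: value` lines in the layout shown here. The sample text is copied from the log above.

```python
import re

# Sample copied from the failing run's redirected output.
LOG = """\
<HIP>[ERROR]:
UR HIP ERROR:
    Value:           209
    Name:            hipErrorNoBinaryForGpu
    Description:     no kernel image is available for execution on the device
    Function:        buildProgram
    Source Location: /__w/llvm/llvm/build/_deps/unified-runtime-src/source/adapters/hip/program.cpp:235

terminate called after throwing an instance of 'sycl::_V1::exception'
"""

# Assumption: only these five keys appear, one per line, as in the log above.
FIELD = re.compile(
    r"^\s*(Value|Name|Description|Function|Source Location):\s*(.*?)\s*$"
)


def parse_ur_hip_error(text):
    """Return the key/value fields that follow a 'UR HIP ERROR:' header."""
    fields = {}
    in_block = False
    for line in text.splitlines():
        if line.strip() == "UR HIP ERROR:":
            in_block = True
            continue
        if not in_block:
            continue
        match = FIELD.match(line)
        if match is None:
            break  # first non-field line ends the block
        fields[match.group(1)] = match.group(2)
    return fields


error = parse_ur_hip_error(LOG)
print(error["Name"], "in", error["Function"])
```

Here the extracted name is `hipErrorNoBinaryForGpu`, reported from `buildProgram`, which matches the "no kernel image is available" description in the log.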

To reproduce

No response

Environment

No response

Additional context

No response

sarnex added the bug and hip labels Aug 7, 2024
sarnex added a commit that referenced this issue Aug 7, 2024
It's failing, see [here](#14989)

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
@sarnex
Contributor Author

sarnex commented Aug 7, 2024

@jchlanda FYI

@sarnex
Contributor Author

sarnex commented Aug 8, 2024

Note: It's passing in the nightly somehow

XPASS: SYCL :: AmdNvidiaJIT/kernel_and_bundle.cpp (76 of 2129)
******************** TEST 'SYCL :: AmdNvidiaJIT/kernel_and_bundle.cpp' FAILED ********************
Exit Code: 0

Command Output (stdout):
--
# RUN: at line 6
/__w/llvm/llvm/toolchain/bin//clang++  -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa  /__w/llvm/llvm/llvm/sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp -fsycl-embed-ir -o /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.out
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa /__w/llvm/llvm/llvm/sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp -fsycl-embed-ir -o /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.out
# note: command had no output on stdout or stderr
# RUN: at line 7
env SYCL_JIT_AMDGCN_PTX_KERNELS=1 env SYCL_JIT_COMPILER_DEBUG="sycl-spec-const-materializer" env ONEAPI_DEVICE_SELECTOR=hip:gpu  /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.out &> /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt ; /__w/llvm/llvm/toolchain/bin/FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp --input-file /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt
# executed command: env SYCL_JIT_AMDGCN_PTX_KERNELS=1 env SYCL_JIT_COMPILER_DEBUG=sycl-spec-const-materializer env ONEAPI_DEVICE_SELECTOR=hip:gpu /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.out
# .---redirected output from '/__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt'
# | Working on function:
# | ==================
# | _ZTSN4sycl3_V16detail19__pf_kernel_wrapperIZ15runKernelBundleNS0_5queueERSt6vectorIiSaIiEES7_E10WoofBundleEE
# | 
# | Replaced: 2 loads from spec const buffer.
# | Load to global variable mappings:
# | 	Load:
# |   %load = load i32, ptr %bc, align 4
# | 	Global Variable:
# | @SpecConsBlob__ZTSN4sycl3_V16detail19__pf_kernel_wrapperIZ15runKernelBundleNS0_5queueERSt6vectorIiSaIiEES7_E10WoofBundleEE_0 = weak_odr addrspace(4) constant i32 11
# | 
# | 	Load:
# |   %load3 = load [2 x i32], ptr %bc2, align 4
# | 	Global Variable:
# | @SpecConsBlob__ZTSN4sycl3_V16detail19__pf_kernel_wrapperIZ15runKernelBundleNS0_5queueERSt6vectorIiSaIiEES7_E10WoofBundleEE_1 = weak_odr addrspace(4) constant [2 x i32] [i32 13, i32 17]
# | 
# | 
# | 
# | Working on function:
# | ==================
# | _ZTSZ15runKernelBundleN4sycl3_V15queueERSt6vectorIiSaIiEES5_E10WoofBundle
# | 
# | Replaced: 2 loads from spec const buffer.
# | Load to global variable mappings:
# | 	Load:
# |   %load = load i32, ptr %bc, align 4
# | 	Global Variable:
# | @SpecConsBlob__ZTSZ15runKernelBundleN4sycl3_V15queueERSt6ve
# | ...
# `---data was truncated--------
# note: command had no output on stdout or stderr
# error: command failed with exit status: -6
# executed command: /__w/llvm/llvm/toolchain/bin/FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp --input-file /__w/llvm/llvm/build-e2e/AmdNvidiaJIT/Output/kernel_and_bundle.cpp.tmp.txt
# note: command had no output on stdout or stderr

--
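
One possible reading of the nightly log above: the RUN line joins the test binary and FileCheck with `;`, so the binary's exit status of -6 is reported but only FileCheck's status decides the overall result, giving `Exit Code: 0`. The -6 itself follows the convention of Python's `subprocess` module (which lit, being written in Python, builds on), where a negative return code means the child was killed by that signal number; 6 is SIGABRT on Linux. The sketch below models this with hypothetical stand-in commands. It is not lit's implementation, and because the nightly output is truncated it does not show why the binary aborted.

```python
import signal
import subprocess
import sys


def run_sequence(first, second):
    """Model `first ; second`: run both, let the last status decide."""
    quiet = {"stdout": subprocess.DEVNULL, "stderr": subprocess.DEVNULL}
    first_status = subprocess.run(first, **quiet).returncode
    second_status = subprocess.run(second, **quiet).returncode
    # With `;` sequencing the first command's failure is not propagated.
    overall = second_status
    return first_status, second_status, overall


# Hypothetical stand-ins: a child that aborts (like the test binary)
# followed by a child that succeeds (like FileCheck finding its matches).
aborting = [sys.executable, "-c", "import os; os.abort()"]
succeeding = [sys.executable, "-c", "pass"]

first_status, second_status, overall = run_sequence(aborting, succeeding)
print(first_status, second_status, overall)
```

On a POSIX system the first status equals `-signal.SIGABRT` while the overall status is 0. For a test expected to fail, a zero overall status would be reported as XPASS.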

sarnex added a commit that referenced this issue Aug 8, 2024
It [passes](https://github.com/intel/llvm/actions/runs/10295214614/job/28494750636) in the nightly somehow but fails in postcommit.

#14989

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
@DuncanMcBain

#14801 indicated that this issue was fixed. Can we close it?

sarnex closed this as completed Sep 11, 2024
@DuncanMcBain

Thank you @sarnex !

@sarnex
Contributor Author

sarnex commented Sep 11, 2024

sure :)
