[CUDA] Test 'Assert/assert_in_multiple_tus.cpp' CI Failure #8832

Open
andylshort opened this issue Mar 28, 2023 · 0 comments
Labels
bug Something isn't working cuda CUDA back-end

Describe the bug
Post-commit CI CUDA E2E test Assert/assert_in_multiple_tus.cpp fails:

FAIL: SYCL :: Assert/assert_in_multiple_tus.cpp (20 of 1270)
******************** TEST 'SYCL :: Assert/assert_in_multiple_tus.cpp' FAILED ********************
Script:
--
: 'RUN: at line 6';    /__w/llvm/llvm/toolchain/bin/clang++   -DSYCL_FALLBACK_ASSERT=1 -fsycl -fsycl-targets=nvptx64-nvidia-cuda -I /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/Inputs /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/Inputs/kernels_in_file2.cpp -o /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.out
: 'RUN: at line 7';   true /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.out &> /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.cpu.txt || true
: 'RUN: at line 8';   true FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp --input-file /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.cpu.txt
: 'RUN: at line 9';    env ONEAPI_DEVICE_SELECTOR=ext_oneapi_cuda:gpu SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1  /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.out &> /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.gpu.txt || true
: 'RUN: at line 10';    env ONEAPI_DEVICE_SELECTOR=ext_oneapi_cuda:gpu SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1  FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp --input-file /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.gpu.txt
: 'RUN: at line 12';   true /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.out &> /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.acc.txt
: 'RUN: at line 13';   true FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp --check-prefix=CHECK-ACC --input-file /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.acc.txt
--
Exit Code: 1

Command Output (stdout):
--
$ ":" "RUN: at line 6"
note: command had no output on stdout or stderr
$ "/__w/llvm/llvm/toolchain/bin/clang++" "-DSYCL_FALLBACK_ASSERT=1" "-fsycl" "-fsycl-targets=nvptx64-nvidia-cuda" "-I" "/__w/llvm/llvm/llvm/sycl/test-e2e/Assert/Inputs" "/__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp" "/__w/llvm/llvm/llvm/sycl/test-e2e/Assert/Inputs/kernels_in_file2.cpp" "-o" "/__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.out"
# command stderr:
clang++: warning: CUDA version 11.7 is only partially supported [-Wunknown-cuda-version]

$ ":" "RUN: at line 7"
note: command had no output on stdout or stderr
$ "true" "/__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.out"
note: command had no output on stdout or stderr
$ ":" "RUN: at line 8"
note: command had no output on stdout or stderr
$ "true" "FileCheck" "/__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp" "--input-file" "/__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.cpu.txt"
note: command had no output on stdout or stderr
$ ":" "RUN: at line 9"
note: command had no output on stdout or stderr
$ "env" "ONEAPI_DEVICE_SELECTOR=ext_oneapi_cuda:gpu" "SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1" "/__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.out"
# redirected output from '/__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.gpu.txt':

PI CUDA ERROR:
	Value:           710
	Name:            CUDA_ERROR_ASSERT
	Description:     device-side assert triggered
	Function:        build_program
	Source Location: /__w/llvm/llvm/src/sycl/plugins/cuda/pi_cuda.cpp:776


PI CUDA ERROR:
	Value:           400
	Name:            CUDA_ERROR_INVALID_HANDLE
	Description:     invalid resource handle
	Function:        cuda_piProgramRelease
	Source Location: /__w/llvm/llvm/src/sycl/plugins/cuda/pi_cuda.cpp:3600

terminate called after throwing an instance of 'sycl::_V1::compile_program_error'
  what():  The program was built for 1 devices
Build program log for 'NVIDIA A10G':
 -999 (Unknown PI error)

note: command had no output on stdout or stderr
error: command failed with exit status: -6
$ "true"
note: command had no output on stdout or stderr
$ ":" "RUN: at line 10"
note: command had no output on stdout or stderr
$ "env" "ONEAPI_DEVICE_SELECTOR=ext_oneapi_cuda:gpu" "SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1" "FileCheck" "/__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp" "--input-file" "/__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.gpu.txt"
# command stderr:
/__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp:17:11: error: CHECK: expected string not found in input
// CHECK: {{.*}}kernels_in_file2.cpp:15: int calculus(int): {{global id: \[5|block: \[1}},0,0], {{local id|thread}}: [1,0,0]
          ^
/__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.gpu.txt:1:1: note: scanning from here

^

Input file: /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.gpu.txt
Check file: /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1:  
check:17     X error: no match found
          2: PI CUDA ERROR: 
check:17     ~~~~~~~~~~~~~~~
          3:  Value: 710 
check:17     ~~~~~~~~~~~~
          4:  Name: CUDA_ERROR_ASSERT 
check:17     ~~~~~~~~~~~~~~~~~~~~~~~~~
          5:  Description: device-side assert triggered 
check:17     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          6:  Function: build_program 
check:17     ~~~~~~~~~~~~~~~~~~~~~~~~~
          .
          .
          .
>>>>>>

error: command failed with exit status: 1

To Reproduce
Run the pre-merge checks on any current PR. The test fails in one of my PRs; the run history and log files are available here: https://github.com/intel/llvm/actions/runs/4542034482/jobs/8005455354?pr=8825

Environment (please complete the following information):

  • OS: Linux
  • Target device and vendor: AWS Node
  • DPC++ version: Latest

Additional context
N/A.

@andylshort andylshort added the bug Something isn't working label Mar 28, 2023
@maarquitos14 maarquitos14 added the cuda CUDA back-end label Apr 4, 2023
npmiller referenced this issue Jan 31, 2024
A few tests in the driver area require amdgpu or nvptx targets to be
built in order to properly run. Add these requirements to the tests.
aelovikov-intel referenced this issue Feb 1, 2024
All GitHub Actions workflows added by the intel/llvm project are expected to
use the following naming notation:

1. Name starts with `sycl` prefix.
2. Use dash `-` to separate words (instead of underscore `_`).

This patch fixes the naming of workflows that do not follow this
notation.
aelovikov-intel added a commit to aelovikov-intel/llvm that referenced this issue Feb 1, 2024
steffenlarsen pushed a commit that referenced this issue Feb 2, 2024