[GHA] Uplift Linux GPU RT version to 23.09.25812.14 #9196

Merged
v-klochkov merged 6 commits into sycl from ci/update_gpu_driver-linux-23.09.25812.14 on Jun 15, 2023

Conversation

bb-sycl
Contributor

@bb-sycl bb-sycl commented Apr 25, 2023

Scheduled drivers uplift

@bb-sycl bb-sycl requested a review from a team as a code owner April 25, 2023 03:07
@bb-sycl bb-sycl temporarily deployed to aws April 25, 2023 05:38 — with GitHub Actions Inactive
@bb-sycl bb-sycl temporarily deployed to aws April 25, 2023 07:53 — with GitHub Actions Inactive
@cperkinsintel
Contributor

On L0 and OpenCL GPU we have a test unexpectedly passing. That's fine.

On both L0 and OpenCL GPU, the Printf/char.cpp test is failing. Does anyone know anything about that?

And the L0 queue priority/profiling tests are failing. Is this known, or does it need to be addressed?

********************
Failed Tests (3):
  SYCL :: Plugin/level_zero_queue_priority.cpp
  SYCL :: Plugin/level_zero_queue_profiling.cpp
  SYCL :: Printf/char.cpp

********************
Unexpectedly Passed Tests (1):
  SYCL :: ESIMD/thread_id_test.cpp
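
For triage, a summary like the one above can be grouped by category with a small script. The following is a minimal sketch in Python, assuming the summary keeps the shape shown here (a `<Category> (N):` header followed by `SYCL :: <path>` lines). `parse_lit_summary` is a hypothetical helper written for this illustration; it is not part of lit or of the intel/llvm CI.

```python
import re
from collections import defaultdict

# Header lines look like "Failed Tests (3):"; the count is not used here.
HEADER = re.compile(r"^(?P<category>[A-Za-z ]+?) \((?P<count>\d+)\):$")
# Test lines look like "  SYCL :: Plugin/level_zero_queue_priority.cpp".
TEST = re.compile(r"^\s*(?P<suite>\S+) :: (?P<path>\S+)$")


def parse_lit_summary(text):
    """Group test paths by the category header that precedes them."""
    results = defaultdict(list)
    category = None
    for line in text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            category = header.group("category")
            continue
        test = TEST.match(line)
        if test and category is not None:
            results[category].append(test.group("path"))
    return dict(results)


# The summary quoted in the comment above, used as sample input.
SUMMARY = """\
********************
Failed Tests (3):
  SYCL :: Plugin/level_zero_queue_priority.cpp
  SYCL :: Plugin/level_zero_queue_profiling.cpp
  SYCL :: Printf/char.cpp

********************
Unexpectedly Passed Tests (1):
  SYCL :: ESIMD/thread_id_test.cpp
"""

summary = parse_lit_summary(SUMMARY)
```

Separator lines made of asterisks match neither pattern and are skipped, so the result maps each category name to the list of test paths under it.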

@smaslov-intel
Contributor

And the L0 queue priority/profiling tests are failing. Is this known, or does it need to be addressed?

I am trying to address them in #9166

Check file: /__w/llvm/llvm/llvm/sycl/test-e2e/Plugin/level_zero_queue_priority.cpp
Input was:
<<<<<<
1: ZE ---> zeInit(0)
check:20'0 X~~~~~~~~~~~~~~~~~ error: no match found
2: zeInit: Level Zero initialization failure

@bader
Contributor

bader commented May 16, 2023

And the L0 queue priority/profiling tests are failing. Is this known, or does it need to be addressed?

I am trying to address them in #9166

Check file: /__w/llvm/llvm/llvm/sycl/test-e2e/Plugin/level_zero_queue_priority.cpp
Input was:
<<<<<<
1: ZE ---> zeInit(0)
check:20'0 X~~~~~~~~~~~~~~~~~ error: no match found
2: zeInit: Level Zero initialization failure

@smaslov-intel, do you have any updates on this? I see that #9166 is merged.

@smaslov-intel
Contributor

@bader: yes, it is merged, and I expected the following GPU driver update to be successful. However, I see this in yesterday's attempt: https://github.com/intel/llvm/actions/runs/4987433999/jobs/8929192727

Any idea?

@bader
Contributor

bader commented May 17, 2023

@bader: yes, it is merged, and I expected the following GPU driver update to be successful. However, I see this in yesterday's attempt: https://github.com/intel/llvm/actions/runs/4987433999/jobs/8929192727

Any idea?

The job fails to push a branch with the driver version update because the branch already exists. Please merge the fix into the https://github.com/intel/llvm/tree/ci/update_gpu_driver-linux-23.09.25812.14 branch.
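
The failure mode described above (pushing a new branch that already exists on the remote) can be detected before the push. The sketch below is only an illustration under stated assumptions: it is not how the intel/llvm workflow is implemented, and `remote_heads`, `plan_push`, and the returned action names are hypothetical. It assumes the input text is the output of `git ls-remote --heads <remote>`, where each line is a commit hash, whitespace, and a `refs/heads/<branch>` ref.

```python
def remote_heads(ls_remote_output):
    """Extract branch names from `git ls-remote --heads <remote>` output.

    Each non-empty line is expected to be "<sha> refs/heads/<branch>",
    with the two fields separated by whitespace.
    """
    prefix = "refs/heads/"
    heads = set()
    for line in ls_remote_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].startswith(prefix):
            heads.add(parts[1][len(prefix):])
    return heads


def plan_push(branch, ls_remote_output):
    """Pick a (hypothetical) action name depending on whether the branch exists."""
    if branch in remote_heads(ls_remote_output):
        # The branch is already on the remote: add commits to it
        # instead of trying to create it again.
        return "update-existing-branch"
    return "create-branch"


# Made-up sample output; the hashes are placeholders, not real commits.
SAMPLE = (
    "1111111111111111111111111111111111111111\trefs/heads/sycl\n"
    "2222222222222222222222222222222222222222\t"
    "refs/heads/ci/update_gpu_driver-linux-23.09.25812.14\n"
)

existing = plan_push("ci/update_gpu_driver-linux-23.09.25812.14", SAMPLE)
fresh = plan_push("ci/some-new-branch", SAMPLE)
```

In this thread the situation was resolved manually by merging the fix into the existing branch; the sketch only shows how the "already exists" case could be told apart from a fresh uplift.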

@bader bader temporarily deployed to aws May 17, 2023 02:35 — with GitHub Actions Inactive
@bader bader temporarily deployed to aws May 17, 2023 03:07 — with GitHub Actions Inactive
@bader
Contributor

bader commented May 17, 2023


Failed Tests (1):
SYCL :: Printf/char.cpp


Unexpectedly Passed Tests (1):
SYCL :: ESIMD/thread_id_test.cpp

@smaslov-intel
Contributor

Failed Tests (1):
SYCL :: Printf/char.cpp

I can reproduce the failure locally; it is a segfault in the L0 RT. Following up with @jandres742.

Unexpectedly Passed Tests (1):
SYCL :: ESIMD/thread_id_test.cpp

@v-klochkov : can you see if this is a reasonable new pass?

@v-klochkov
Contributor

Unexpectedly Passed Tests (1):
SYCL :: ESIMD/thread_id_test.cpp

@v-klochkov : can you see if this is a reasonable new pass

Yes, we were waiting for the GPU driver update to enable that test.

@smaslov-intel
Contributor

Unexpectedly Passed Tests (1):
SYCL :: ESIMD/thread_id_test.cpp

@v-klochkov : can you see if this is a reasonable new pass

Yes, we were waiting for the GPU driver update to enable that test.

Great, thanks for confirming.

@bader
Contributor

bader commented May 30, 2023

Failed Tests (1):
SYCL :: Printf/char.cpp

I can reproduce the failure locally; it is a segfault in the L0 RT. Following up with @jandres742.

@smaslov-intel, do you have any updates on this issue? In case it is helpful: this issue appeared with the 22.49.* driver (see #8156 (comment)).

@bader bader requested a review from a team as a code owner June 14, 2023 19:29
@bader bader requested a review from KseniyaTikhomirova June 14, 2023 19:29
@bader bader temporarily deployed to aws June 14, 2023 20:16 — with GitHub Actions Inactive
@bader bader temporarily deployed to aws June 14, 2023 20:49 — with GitHub Actions Inactive
@bader bader temporarily deployed to aws June 14, 2023 21:55 — with GitHub Actions Inactive
@bader bader temporarily deployed to aws June 14, 2023 22:35 — with GitHub Actions Inactive
@jandres742
Contributor

Failed Tests (1):
SYCL :: Printf/char.cpp

I can reproduce the failure locally; it is a segfault in the L0 RT. Following up with @jandres742.

@smaslov-intel, do you have any updates on this issue? In case it is helpful: this issue appeared with the 22.49.* driver (see #8156 (comment)).

This has been fixed in IGC. We need to wait for the fix to be promoted to the compute-runtime GPU repo before we can use it.

@bader bader temporarily deployed to aws June 14, 2023 23:42 — with GitHub Actions Inactive
@bader
Contributor

bader commented Jun 14, 2023

Failed Tests (1):
SYCL :: Printf/char.cpp

I can reproduce the failure locally; it is a segfault in the L0 RT. Following up with @jandres742.

@smaslov-intel, do you have any updates on this issue? In case it is helpful: this issue appeared with the 22.49.* driver (see #8156 (comment)).

This has been fixed in IGC. We need to wait for the fix to be promoted to the compute-runtime GPU repo before we can use it.

I'd like to update the driver ASAP, as the old version is gating some important changes in the DPC++ compiler, such as switching to opaque pointers. The risk of updating to the 23.09.* version is that we disable the Printf/char.cpp test and might miss a regression while waiting for the fixed version. I talked to @smaslov-intel and we agreed to take that risk.
@jandres742, does it sound okay to you?

@jandres742
Contributor

Failed Tests (1):
SYCL :: Printf/char.cpp

I can reproduce the failure locally; it is a segfault in the L0 RT. Following up with @jandres742.

@smaslov-intel, do you have any updates on this issue? In case it is helpful: this issue appeared with the 22.49.* driver (see #8156 (comment)).

This has been fixed in IGC. We need to wait for the fix to be promoted to the compute-runtime GPU repo before we can use it.

I'd like to update the driver ASAP, as the old version is gating some important changes in the DPC++ compiler, such as switching to opaque pointers. The risk of updating to the 23.09.* version is that we disable the Printf/char.cpp test and might miss a regression while waiting for the fixed version. I talked to @smaslov-intel and we agreed to take that risk. @jandres742, does it sound okay to you?

Thanks @bader. +1 on my side.

@bader bader temporarily deployed to aws June 15, 2023 00:24 — with GitHub Actions Inactive
@bader bader temporarily deployed to aws June 15, 2023 03:06 — with GitHub Actions Inactive
@bader bader temporarily deployed to aws June 15, 2023 03:58 — with GitHub Actions Inactive
@bader
Contributor

bader commented Jun 15, 2023

@intel/llvm-reviewers-runtime, @KseniyaTikhomirova, ping.

@v-klochkov v-klochkov merged commit 52a0fc8 into sycl Jun 15, 2023
@v-klochkov v-klochkov deleted the ci/update_gpu_driver-linux-23.09.25812.14 branch June 15, 2023 15:36
@smaslov-intel
Contributor

Failed Tests (1):
SYCL :: Printf/char.cpp

I can reproduce the failure locally; it is a segfault in the L0 RT. Following up with @jandres742.

@smaslov-intel, do you have any updates on this issue? In case it is helpful: this issue appeared with the 22.49.* driver (see #8156 (comment)).

This has been fixed in IGC. We need to wait for the fix to be promoted to the compute-runtime GPU repo before we can use it.

I'd like to update the driver ASAP, as the old version is gating some important changes in the DPC++ compiler, such as switching to opaque pointers. The risk of updating to the 23.09.* version is that we disable the Printf/char.cpp test and might miss a regression while waiting for the fixed version. I talked to @smaslov-intel and we agreed to take that risk. @jandres742, does it sound okay to you?

Thanks @bader. +1 on my side.

@bader : did you create an issue to re-enable the test when the fixed driver is available?

8 participants