[SYCL] Switch to using blocking USM free for OpenCL GPU #4928
Whenever a kernel is enqueued on a GPU, the GPU driver records the state of all USM pointers that the kernel might use indirectly. Because of this, these pointers cannot be freed until the kernel has finished executing. This change addresses the problem for OpenCL by using a blocking version of free; Level Zero already handles it by deferring USM release.
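To illustrate the ordering that a blocking free enforces, here is a minimal host-only sketch. It is a model, not the code in this PR: `MockQueue`, `blocking_free`, and `run_blocking_free_demo` are hypothetical names, and the "kernel" is just a callable that runs when the queue is drained, which stands in for asynchronous execution on a device.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device queue: submit() only records the
// kernel, and the kernel runs later, when wait() drains the queue.
class MockQueue {
public:
  void submit(std::function<void()> kernel) {
    pending_.push_back(std::move(kernel));
  }

  // Runs every recorded kernel, then forgets them.
  void wait() {
    for (auto &kernel : pending_)
      kernel();
    pending_.clear();
  }

private:
  std::vector<std::function<void()>> pending_;
};

// Hypothetical blocking free: drain the queue first, release second, so
// no pending kernel can touch the allocation after it is gone.
void blocking_free(MockQueue &queue, int *ptr) {
  queue.wait();
  delete[] ptr;
}

int run_blocking_free_demo() {
  MockQueue queue;
  int *data = new int[4]{1, 2, 3, 4};
  int sum = 0;

  // The "kernel" reads the allocation through a captured pointer.
  queue.submit([data, &sum] {
    for (int i = 0; i < 4; ++i)
      sum += data[i];
  });
  assert(sum == 0); // submit() returned before the kernel ran

  // Calling delete[] here instead would free memory that the pending
  // kernel still reads. The blocking variant waits for it first.
  blocking_free(queue, data);
  return sum;
}
```

In this model `run_blocking_free_demo()` returns the sum of the four elements, because the kernel is guaranteed to have run before the memory is released. The cost is that the caller stalls until the queue is empty.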
/summary:run
Test failures are caused by an OpenCL CPU runtime bug, so I'm changing this pull request to draft for now.
Since it'll be a while until the OpenCL CPU runtime is uplifted in CI, I've limited the changes in this PR to OpenCL GPU. @smaslov-intel @againull please take a look.
@smaslov-intel Applied what we discussed yesterday. Please take a look.
/verify with intel/llvm-test-suite#561
LGTM
/verify with intel/llvm-test-suite#561
* upstream/sycl: (725 commits)
  [SYCL] Translate ZE_RESULT_ERROR_INVALID_ARGUMENT error code from L0 RT (intel#5122)
  [SYCL][L0][Plugin] Call ZeCommandQueueCreate on demand (intel#5109)
  [SYCL] Switch to using blocking USM free for OpenCL GPU (intel#4928)
  [CI] Disable pack and upload steps (intel#5119)
  [SYCL] Disable submission of AssertInfoCopier for FPGA (intel#4780)
  [SYCL][SPIRV] Implement islessgreater with FOrdNotEqual instead (intel#5076)
  [SYCL] Fix typo in the name of the host-visible pool (intel#5073)
  [SYCL] Only call shutdown when DLL is being unloaded, not when process is terminating (intel#4983)
  [SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX (intel#5095)
  [SYCL] Translate out-of-memory error codes from L0 RT (intel#5107)
  [SYCL] Fix a few warnings during build scripts configuration (intel#5082)
  [SYCL] Fix amdgpu openmp test (intel#5103)
  [SYCL] [FPGA] Create experimental headers for FPGA latency control (intel#5066)
  [SYCL][CUDA] Don't enqueue an event wait on same CUDA stream (intel#5099)
  Remove PR disable template (intel#5102)
  [BuildBot]Uplift CPU/FPGAEMU RT version (intel#5078)
  [SYCL] Fix the test to not depend on a specific line. (intel#5092)
  [CI] Provide libclc targets to build and test (intel#5091)
  Fix build of `check-llvm-spirv` target after 8f8001a
  Force opt to use new pass manager in pr52289 test after c34d157
  ...
* upstream/sycl:
  [CI] Add container users to video group (intel#5101)
  [CI] More typo fixes in Nightly build (intel#5088)
  Revert "[CI] Disable pack and upload steps (intel#5119)" (intel#5125)
  [SYCL] Translate ZE_RESULT_ERROR_INVALID_ARGUMENT error code from L0 RT (intel#5122)
  [SYCL][L0][Plugin] Call ZeCommandQueueCreate on demand (intel#5109)
  [SYCL] Switch to using blocking USM free for OpenCL GPU (intel#4928)
  [CI] Disable pack and upload steps (intel#5119)
  [SYCL] Disable submission of AssertInfoCopier for FPGA (intel#4780)
  [SYCL][SPIRV] Implement islessgreater with FOrdNotEqual instead (intel#5076)
  [SYCL] Fix typo in the name of the host-visible pool (intel#5073)
  [SYCL] Only call shutdown when DLL is being unloaded, not when process is terminating (intel#4983)
  [SYCL][CUDA][PI] Fix infinite loop when parallel_for range exceeds INT_MAX (intel#5095)
  [SYCL] Translate out-of-memory error codes from L0 RT (intel#5107)
  [SYCL] Fix a few warnings during build scripts configuration (intel#5082)
  [SYCL] Fix amdgpu openmp test (intel#5103)
  [SYCL] [FPGA] Create experimental headers for FPGA latency control (intel#5066)
  [SYCL][CUDA] Don't enqueue an event wait on same CUDA stream (intel#5099)
  Remove PR disable template (intel#5102)
  [BuildBot]Uplift CPU/FPGAEMU RT version (intel#5078)
Whenever a kernel is enqueued on a GPU, the GPU driver records the state
of all USM pointers that might be used in an indirect fashion. Because
of this, these pointers cannot be freed until the execution of the kernel
is finished.
This change addresses this problem for OpenCL by using a blocking version
of free, while Level Zero already handles this by deferring USM release.
The change is temporarily limited to OpenCL GPU until a bug in the OpenCL
CPU runtime is resolved.
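
For contrast, the deferred-release strategy that the message attributes to Level Zero can be modeled the same way. This is only a sketch under assumptions: the source does not show how the Level Zero plugin implements deferral, and `DeferringQueue`, `deferred_free`, and `run_deferred_free_demo` are hypothetical names. The free call returns immediately and the allocation is released once the pending work has completed.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical queue model: kernels run when wait() drains the queue,
// and frees requested while work is pending are parked until then.
class DeferringQueue {
public:
  void submit(std::function<void()> kernel) {
    pending_.push_back(std::move(kernel));
  }

  // Never blocks. Releases right away only if no kernel is pending.
  void deferred_free(int *ptr) {
    if (pending_.empty()) {
      delete[] ptr;
      ++released_;
    } else {
      deferred_.push_back(ptr);
    }
  }

  // Runs pending kernels, then performs the parked releases.
  void wait() {
    for (auto &kernel : pending_)
      kernel();
    pending_.clear();
    for (int *ptr : deferred_) {
      delete[] ptr;
      ++released_;
    }
    deferred_.clear();
  }

  int released() const { return released_; }

  ~DeferringQueue() { wait(); }

private:
  std::vector<std::function<void()>> pending_;
  std::vector<int *> deferred_;
  int released_ = 0;
};

struct DeferredFreeResult {
  int sum;
  int released_before_wait;
  int released_after_wait;
};

DeferredFreeResult run_deferred_free_demo() {
  DeferringQueue queue;
  int *data = new int[3]{5, 6, 7};
  int sum = 0;

  queue.submit([data, &sum] {
    for (int i = 0; i < 3; ++i)
      sum += data[i];
  });

  queue.deferred_free(data); // returns at once, memory still alive
  assert(sum == 0);          // the kernel has not run yet
  int before = queue.released();

  queue.wait(); // kernel runs, then the parked pointer is released
  return {sum, before, queue.released()};
}
```

The trade-off in this model is that the caller never stalls, but the runtime has to track parked allocations and memory stays in use longer than the program asked for.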