-
Notifications
You must be signed in to change notification settings - Fork 771
[SYCL][CUDA] Fix context scope in kernel launch #4606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The `guessLocalWorkSize` function uses the CUDA API so it needs an active context, and there was no active `ScopedContext` when it was called which may cause issue. This fixes intel#2777
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Beware that #4401 also implements this by putting the ScopedContext
inside guessLocalWorkSize
. I do not have a preference to whether or not it should be at API top-level or inside guessLocalWorkSize
, but having it both places is redundant. Ping @JackAKirk for awareness.
Looks fine to me. You might want to wrap the ScopedContext into a try/catch block in case it throws PI_INVALID_CONTEXT, that can then be returned as pi_result in cuda_piEnqueueKernelLaunch; since this appears to be the standard practice in the file. |
Good catch, I think we should try to keep the
Good point, I'll extend the |
55f2f4f
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
* upstream/sycl: (2344 commits) [ESIMD] Rename slm_load4/slm_store4 to slm_load_rgba/slm_store_rgba (intel#4158) [SYCL] Avoid re-computing group_range in nd_item::get_group_range() (intel#4621) [clang-offload-extract] Ignore zero padding in .tgting section (intel#4622) [Driver][SYCL] Fix -fsycl-help output when redirected (intel#4619) [Driver][SYCL][FPGA] Do not unbundle aoco as an archive for hardware (intel#4477) [Driver][SYCL] Fix offload-bundler and offload-deps triples (intel#4616) [SYCL] Fix bit_cast for half type (intel#4603) [SYCL] Fix a typo in accessor::get_range method (intel#4556) [SYCL] Detach allocas from their dependencies regardless of linked alloca presence (intel#4573) [SYCL][L0] Make sure that we only query/sync host-visible events from the host. (intel#4613) Fix tests with wrong alias metadata [Driver][SYCL] Fixup arguments to llc call for PIC and code-model (intel#4614) [SYCL][L0] Add ownership control for LeveL-Zero kernel_bundle interop. (intel#4576) [SYCL][Driver] Expose llvm-foreach --jobs functionality through a driver option (intel#4543) [SYCL] Prevent stream buffer leak on constructor exception (intel#4594) [ESIMD] Replace mask_type_t with simd_mask to represent Gen predicates. (intel#4230) Fix for a bunch of fixed point integer SPIR-V instructions (intel#1213) Amend SingleElementVectorINTEL decoration use cases according to spec update (intel#1192) Enforce UserSemantic decoration if no FPGA decorations found [SYCL][CUDA] Fix context scope in kernel launch (intel#4606) ...
The
guessLocalWorkSize
function uses the CUDA API so it needs anactive context, and there was no active
ScopedContext
when it wascalled which may cause issue.
This fixes #2777