-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL][ESIMD][EMU] ESIMD_CPU Kernel launch and ESIMD_EMU backend loading #4020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][ESIMD][EMU] ESIMD_CPU Kernel launch and ESIMD_EMU backend loading #4020
Conversation
* This PR is for enabling kernel launching for ESIMD_CPU * Also contains emulated intrinsics for memory operations * esimd_cpu backend is loaded in SYCL runtime * Base PR : intel#4011
- Replacing deprecated ATOMIC_* with atomic_op - Relocating atomic_add implementation for flat_atomic1 from slm_atomic1
- _pi_buffer and _pi_image are defined in 'cmrt_if_defs.hpp' which is used for both PI and kernel compilation - Dependency is resolved by adding another device interface call (sycl_get_cm_buffer_params, sycl_get_cm_image_params)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PI-related changes LGTM
I'll revert some changes (relocation + update) in following files and apply them in PR#4011 due to changes in Device interface.
|
- These changes are already applied in PR#4011
- esimd_emu_functions_v1.h / esimdcpu_device_interface.hpp - Reverted file changes are to be relocated in PR#4011
Revert is complete. |
- Address of mutex is used for fetching mutex - Interface change is from PR#4011
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PI related changes LGTM
Jenkins and buildbot jobs are failing due to rebase failure. I'll make single commit using these change sets and merge it to the |
- __esimd_surf_read/write
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except some "non-functional" comments".
(haven't reviewed logic of __esimd* builtins)
sycl/include/sycl/ext/intel/experimental/esimd/detail/memory_intrin.hpp
Outdated
Show resolved
Hide resolved
sycl/include/sycl/ext/intel/experimental/esimd/detail/memory_intrin.hpp
Outdated
Show resolved
Hide resolved
sycl/include/sycl/ext/intel/experimental/esimd/detail/memory_intrin.hpp
Outdated
Show resolved
Hide resolved
- Kernel build failure fix from fully-working branch - Changing Header file inclusion for ESIMD_CPU
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@romanovvlad , Thanks for approval. I'll resolve conflicts and resolve build failures. |
Buildbot job failed because of missing implementation of PI_API. PR containing PI_API implementation is under review (#4011). I'll try to merge this change set after the PR is merged into the SYCL branch. |
/summary:run |
…rnel_launch_memory_intrinsic
/summary:run |
@dongkyunahn-intel, can you please resolve the CI failures in Jenkins/Summary? |
I've been trying to resolve the failure, but is not successful yet. The failure seems to be segmentation fault (memory access violation) under windows environment and occurs only for one precision - |
It is found that the segmentation fault is because of violation against 16-byte memory access alignment required by XMM access. Currently, only warning message is printed out during compilation for cases where the requirement cannot be satisfied - llvm/sycl/include/CL/sycl/types.hpp Line 545 in 6761e73
This issue has to be addressed in a separate PR. |
PR for memory access alignment is posted : #4953 |
- Allowing access 'MKernel' from sycl::handler class - +Reverting unnecessary std:move for Kernel function
8ee0bd8
…rnel_launch_memory_intrinsic
/summary:run |
@romanovvlad , @kbobrovs , @s-kanaev , please review and approve this PR. Cause of failures from 'test_handler' is identified, and corresponding ticket was created & shared internally. |
Second review/approval request : @romanovvlad , @kbobrovs , @s-kanaev. |
/summary:run |
…rnel_launch_memory_intrinsic
/summary:run |
@kbobrovs, @v-klochkov , @smaslov-intel Could you please review and approve if the patch is OK for you? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PI related changes LGTM
@kbobrovs , @v-klochkov , Please review and approve if this patch looks okay to you. |
@@ -91,7 +91,7 @@ device_filter::device_filter(const std::string &FilterString) { | |||
std::string Message = | |||
std::string("Invalid device filter: ") + FilterString + | |||
"\nPossible backend values are " | |||
"{host,opencl,level_zero,cuda,hip,*}.\n" | |||
"{host,opencl,level_zero,cuda,hip,esimd_emulator*}.\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Late comment: This line would be better as below:
"{host,opencl,level_zero,cuda,hip,esimd_emulator,*}.\n"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will apply this change as piggyback in another PR later.
intel#4020 introduced some kind of wrapper for SYCL kernels on host to be able to launch them on ESIMD emulator backend. Even though we had dropped that feature since then, we didn't remove corresponding `handler.hpp` modifications. This PR removes them. The main effect expected from this PR is compilation time improvement: those kernel wrappers are specialized by kernel name, meaning that there will be plenty of extra useless functions emitted during host compilation pass for every kernel in a program. This patch also uncovered a missing case in `HostKernel::InstantiateKernelOnHost` which we couldn't ever encounter because we transformed a host kernel to always accept `nd_item`, thus always skipping problematic `item` code path. This PR is not expected to introduce any functional changes, but since I'm not very familiar with the SYCL RT, I'm not entirely sure of that.
#4020 introduced some kind of wrapper for SYCL kernels on host to be able to launch them on ESIMD emulator backend. Even though we had dropped that feature since then, we didn't remove corresponding `handler.hpp` modifications. This PR removes them. The main effect expected from this PR is compilation time improvement: those kernel wrappers are specialized by kernel name, meaning that there will be plenty of extra useless functions emitted during host compilation pass for every kernel in a program. This patch also uncovered a missing case in `HostKernel::InstantiateKernelOnHost` which we couldn't ever encounter because we transformed a host kernel to always accept `nd_item`, thus always skipping problematic `item` code path. This PR is not expected to introduce any functional changes, but since I'm not very familiar with the SYCL RT, I'm not entirely sure of that.
This PR is for enabling kernel launching for ESIMD_CPU
Also contains emulated intrinsics for memory operations
esimd_cpu backend is loaded in SYCL runtime
Base PR : [SYCL][ESIMD][EMU] pi_esimd_cpu bringing up with CM library #4011