-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL][ESIMD][EMU] Handle intrinsic operations promoted to 4-byte element type #5727
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][ESIMD][EMU] Handle intrinsic operations promoted to 4-byte element type #5727
Conversation
…lement type - scatter_impl() and gather_impl() in memory.hpp promote argument/return vector to 4-byte elements - ESIMD_EMULATOR backend handles this promoted vectors
This PR replaces #5702 |
sycl/include/sycl/ext/intel/experimental/esimd/detail/memory_intrin.hpp
Outdated
Show resolved
Hide resolved
sycl/include/sycl/ext/intel/experimental/esimd/detail/memory_intrin.hpp
Outdated
Show resolved
Hide resolved
sycl/include/sycl/ext/intel/experimental/esimd/detail/memory_intrin.hpp
Outdated
Show resolved
Hide resolved
…ntrin.hpp Co-authored-by: kbobrovs <Konstantin.S.Bobrovsky@intel.com>
…ntrin.hpp Co-authored-by: kbobrovs <Konstantin.S.Bobrovsky@intel.com>
sycl/include/sycl/ext/intel/experimental/esimd/detail/memory_intrin.hpp
Outdated
Show resolved
Hide resolved
@@ -646,6 +657,14 @@ __esimd_gather_masked_scaled2(SurfIndAliasTy surf_ind, uint32_t global_offset, | |||
{ | |||
static_assert(Scale == 0); | |||
|
|||
// ActualTy - Reverse of 'PromoT' in memory.hpp | |||
using ActualTy = std::conditional_t< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
sycl/include/sycl/ext/intel/experimental/esimd/detail/memory_intrin.hpp
Outdated
Show resolved
Hide resolved
sycl/include/sycl/ext/intel/experimental/esimd/detail/memory_intrin.hpp
Outdated
Show resolved
Hide resolved
- clang-format fix - Replacing ActualTy with OrigTyAsInt - 'gather' implementation
…romot_memory_intrinsic
@alexbatashev, take a look at https://github.com/intel/llvm/actions/runs/1966876941.
Do you have any clues what might cause this error? I don't see any changes in workflow files, so it's strange that GitHub Actions is not able to parse them for this particular PR. I'll restart GHA pre-commit to get complete validation results. |
…romot_memory_intrinsic
Looks like there was an attempt to restart a single job. That feature was released today, and seems to be broken. |
…romot_memory_intrinsic
This reverts commit 5913d06.
- Intermediate vectors are added to contain type-adjusted values bridging between 'RestoredTy' (restored with TySizeLog2) and 'Ty'
@@ -449,8 +469,9 @@ __esimd_scatter_scaled(__ESIMD_DNS::simd_mask_storage_t<N> pred, | |||
|
|||
for (int idx = 0; idx < N; idx++) { | |||
if (pred[idx]) { | |||
Ty *addr = reinterpret_cast<Ty *>(elem_offsets[idx] + writeBase); | |||
*addr = vals[idx]; | |||
RestoredTy *addr = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For later: please check with CM library devs that acquiring a mutex is really needed when accessing through a cm buffer. That seems really strange requirement. It is also strange that no mutex is needed when accessing SLM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will check with CM Lib devs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue ticket : #5863
return __ESIMD_DNS::bitcast<Ty, RestoredTy, N>(retv); | ||
} else { | ||
__ESIMD_DNS::vector_type_t<Ty, N> Adjusted = 0; | ||
// TODO : Built-in function?? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see convert_vector above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
/verify with intel/llvm-test-suite#928 |
1 similar comment
/verify with intel/llvm-test-suite#928 |
@vladimirlaz , would you help me run CI for this PR with following intel/llvm-test-suite PR? XFAIL markings need to be removed in order to prevent 'Unexpected passes' caused by this PR. |
@dongkyunahn-intel, I see clang FE crashes on these sometimes: Could you please file an internal ticket on FE? Do you know if this is flaky failure? |
It looks like flaky failures. My another PR (#5606) does not show those failures. |
These failures are gone after retrying failing tests. Now there are four unexpected passes (to be resolved with intel/llvm-test-suite#928) and one timeout (BitonicSortkv2). |
/verify |
CI fails due to unexpected pass from opencl. There is a pending PR for fixing this in intel/llvm - intel/llvm-test-suite#944. |
The only failure is unrelated to the changes: |
scatter_impl()
andgather_impl()
in memory.hpp promote argument/return vector to 4-byte elements