Skip to content

[ESIMD] Add lsc_slm_block_load() with merging semantics #8552

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

v-klochkov
Copy link
Contributor

@v-klochkov v-klochkov commented Mar 7, 2023

The new lsc_slm_block_load() has an additional operand 'old_values' that contains the values returned from the function if the predicate passed to it is 0.

The corresponding LIT test: intel/llvm-test-suite#1637

The new lsc_slm_block_load() has an additional operand 'old_values'
that contains the values returned from the function if the predicate
passed to it is 0.

Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
@v-klochkov v-klochkov requested a review from a team as a code owner March 7, 2023 04:08
@v-klochkov v-klochkov temporarily deployed to aws March 7, 2023 04:31 — with GitHub Actions Inactive
__ESIMD_NS::simd<uint32_t, N> offsets = offset;
return __esimd_lsc_load_slm<T, cache_hint::none, cache_hint::none,
_AddressScale, _ImmOffset, _DS, _VS, _Transposed,
N>(pred.data(), offsets.data());
AddressScale, ImmOffset, FDS, VS, _Transposed, N>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I'm missing something but shouldn't _Transposed be Transposed? I don't see _Transposed defined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Fixed now.
I re-started a LIT test before uploading for review, but forgot to rebuild the compiler after this last-minute change (removal of those underscores).

/// @tparam NElts is the number of elements to load per address.
/// @tparam DS is the data size.
/// @param offset is the zero-based offset for SLM buffer in bytes.
/// @param pred is the predicate; if it contains 0, then the actual load
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"if it contains 0, then the actual load is not performed and the returned value is undefined" is this accurate? I thought it would be old_values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another good catch. It a copy-paste error. Fixed now. Thank you.

…t being tested before 'git push'

Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
@v-klochkov v-klochkov temporarily deployed to aws March 7, 2023 18:10 — with GitHub Actions Inactive
@v-klochkov
Copy link
Contributor Author

These timeouts on Linux are unrelated to this PR:
Timed Out Tests (3):
SYCL :: DeviceLib/assert-aot.cpp
SYCL :: DeviceLib/assert.cpp
SYCL :: KernelFusion/pointer_arg_function.cpp

@v-klochkov
Copy link
Contributor Author

This ESIMD PR is guaranteed to be unrelated to any potential issues in CUDA CI.
CUDA CI has not even started in 10 hours. I am merging the fix now.

@v-klochkov v-klochkov merged commit 6c75e0e into intel:sycl Mar 8, 2023
@v-klochkov v-klochkov deleted the esimd_merge_intrinsics_p3_lsc_slm_block_load branch March 8, 2023 04:24
@v-klochkov v-klochkov temporarily deployed to aws March 9, 2023 12:24 — with GitHub Actions Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants