This repository was archived by the owner on Mar 28, 2023. It is now read-only.

[SYCL] Check that fp16 aspect is supported before using half #1487

Merged 2 commits on Jan 5, 2023

12 changes: 6 additions & 6 deletions SYCL/Regression/half_operators.cpp
@@ -1,4 +1,4 @@
-// RUN: %clangxx -fsycl -fsycl-targets=%sycl_triple -fsycl-unnamed-lambda %s -o %t.out
+// RUN: %clangxx -fsycl -fsycl-targets=%sycl_triple -fsycl-device-code-split=per_kernel -fsycl-unnamed-lambda %s -o %t.out
 // RUN: %GPU_RUN_PLACEHOLDER %t.out
 // REQUIRES: gpu
 #include <iostream>
@@ -26,9 +26,7 @@ template <typename T> bool are_bitwise_equal(T lhs, T rhs) {
   return result;
 }

-template <typename T> bool test() {
-  sycl::queue queue{};
-
+template <typename T> bool test(sycl::queue &queue) {
   constexpr int NumElems{32};
   bool pass{true};

@@ -69,8 +67,10 @@ template <typename T> bool test() {
 }

 int main(int argc, char **argv) {
+  sycl::queue queue{};
   bool passed = true;
-  passed &= test<float>();
-  passed &= test<sycl::half>();
+  passed &= test<float>(queue);
+  if (queue.get_device().has(sycl::aspect::fp16))
+    passed &= test<sycl::half>(queue);

I believe the test compilation options need -fsycl-device-code-split=per_kernel to handle the aspects properly at runtime.

Didn't know that, thanks!

per_kernel device code split is no longer needed since intel/llvm#7302: the compiler can now automatically split kernels into separate modules when optional features are used.

   return passed ? 0 : 1;
 }
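
The gating pattern in this change can be sketched without a SYCL toolchain. The block below is only an illustration, not the test itself: `Aspect`, `MockDevice`, `MockQueue`, `Half`, `needs_fp16` and `run_tests` are hypothetical stand-ins (for `sycl::aspect`, `sycl::device`, `sycl::queue`, `sycl::half` and the body of `main`), and the per-type `test` is reduced to a placeholder that records that it ran.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-ins for sycl::aspect, sycl::device, sycl::queue and
// sycl::half, so the sketch builds with a plain C++ compiler.
enum class Aspect { fp16, fp64 };

struct MockDevice {
  std::set<Aspect> supported;
  bool has(Aspect a) const { return supported.count(a) != 0; }
};

struct MockQueue {
  MockDevice device;
  const MockDevice &get_device() const { return device; }
};

struct Half {};

// True only for element types that rely on the optional fp16 feature.
template <typename T> constexpr bool needs_fp16() { return false; }
template <> constexpr bool needs_fp16<Half>() { return true; }

// Placeholder for the real per-type test: it only records that it ran.
template <typename T> bool test(MockQueue &queue, int &runs) {
  // Running a half test on a device without fp16 is the case being avoided.
  assert(!needs_fp16<T>() || queue.get_device().has(Aspect::fp16));
  ++runs;
  return true;
}

// Mirrors the updated main(): float always runs, half only when supported.
// Returns the number of tests that ran, or -1 if any of them failed.
int run_tests(MockQueue &queue) {
  int runs = 0;
  bool passed = true;
  passed &= test<float>(queue, runs);
  if (queue.get_device().has(Aspect::fp16))
    passed &= test<Half>(queue, runs);
  return passed ? runs : -1;
}
```

With a mock queue whose device reports `Aspect::fp16`, `run_tests` executes both placeholders. Without it, only the `float` one runs, which is the behaviour the real test gets from `queue.get_device().has(sycl::aspect::fp16)`. Sharing one queue between both calls also means the aspect is checked on the same device the tests run on.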