Skip to content

[SYCL] Throw an exception for unsupported aspect #6989

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 11, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
49 changes: 10 additions & 39 deletions sycl/include/sycl/aspects.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,47 +12,18 @@
namespace sycl {
__SYCL_INLINE_VER_NAMESPACE(_V1) {

#define __SYCL_ASPECT(ASPECT, ID) ASPECT = ID,
#define __SYCL_ASPECT_DEPRECATED(ASPECT, ID, MESSAGE) \
ASPECT __SYCL2020_DEPRECATED(MESSAGE) = ID,
#define __SYCL_ASPECT_DEPRECATED_ALIAS(ASPECT, ID, MESSAGE) \
__SYCL_ASPECT_DEPRECATED(ASPECT, ID, MESSAGE)
enum class __SYCL_TYPE(aspect) aspect {
host = 0,
cpu = 1,
gpu = 2,
accelerator = 3,
custom = 4,
fp16 = 5,
fp64 = 6,
int64_base_atomics __SYCL2020_DEPRECATED("use atomic64 instead") = 7,
int64_extended_atomics __SYCL2020_DEPRECATED("use atomic64 instead") = 8,
image = 9,
online_compiler = 10,
online_linker = 11,
queue_profiling = 12,
usm_device_allocations = 13,
usm_host_allocations = 14,
usm_shared_allocations = 15,
usm_restricted_shared_allocations = 16,
usm_system_allocations = 17,
usm_system_allocator __SYCL2020_DEPRECATED(
"use usm_system_allocations instead") = usm_system_allocations,
ext_intel_pci_address = 18,
ext_intel_gpu_eu_count = 19,
ext_intel_gpu_eu_simd_width = 20,
ext_intel_gpu_slices = 21,
ext_intel_gpu_subslices_per_slice = 22,
ext_intel_gpu_eu_count_per_subslice = 23,
ext_intel_max_mem_bandwidth = 24,
ext_intel_mem_channel = 25,
usm_atomic_host_allocations = 26,
usm_atomic_shared_allocations = 27,
atomic64 = 28,
ext_intel_device_info_uuid = 29,
ext_oneapi_srgb = 30,
ext_oneapi_native_assert = 31,
host_debuggable = 32,
ext_intel_gpu_hw_threads_per_eu = 33,
ext_oneapi_cuda_async_barrier = 34,
ext_oneapi_bfloat16 = 35,
ext_intel_free_memory = 36,
#include <sycl/info/aspects.def>
#include <sycl/info/aspects_deprecated.def>
};
#undef __SYCL_ASPECT_DEPRECATED_ALIAS
#undef __SYCL_ASPECT_DEPRECATED
#undef __SYCL_ASPECT

} // __SYCL_INLINE_VER_NAMESPACE(_V1)
} // namespace sycl
3 changes: 3 additions & 0 deletions sycl/include/sycl/detail/pi.h
Original file line number Diff line number Diff line change
Expand Up @@ -752,6 +752,9 @@ static const uint8_t PI_DEVICE_BINARY_OFFLOAD_KIND_SYCL = 4;
#define __SYCL_PI_PROPERTY_SET_SYCL_EXPORTED_SYMBOLS "SYCL/exported symbols"
/// PropertySetRegistry::SYCL_DEVICE_GLOBALS defined in PropertySetIO.h
#define __SYCL_PI_PROPERTY_SET_SYCL_DEVICE_GLOBALS "SYCL/device globals"
/// PropertySetRegistry::SYCL_DEVICE_REQUIREMENTS defined in PropertySetIO.h
#define __SYCL_PI_PROPERTY_SET_SYCL_DEVICE_REQUIREMENTS \
"SYCL/device requirements"

/// Program metadata tags recognized by the PI backends. For kernels the tag
/// must appear after the kernel name.
Expand Down
35 changes: 35 additions & 0 deletions sycl/include/sycl/info/aspects.def
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
__SYCL_ASPECT(host, 0)
__SYCL_ASPECT(cpu, 1)
__SYCL_ASPECT(gpu, 2)
__SYCL_ASPECT(accelerator, 3)
__SYCL_ASPECT(custom, 4)
__SYCL_ASPECT(fp16, 5)
__SYCL_ASPECT(fp64, 6)
__SYCL_ASPECT(image, 9)
__SYCL_ASPECT(online_compiler, 10)
__SYCL_ASPECT(online_linker, 11)
__SYCL_ASPECT(queue_profiling, 12)
__SYCL_ASPECT(usm_device_allocations, 13)
__SYCL_ASPECT(usm_host_allocations, 14)
__SYCL_ASPECT(usm_shared_allocations, 15)
__SYCL_ASPECT(usm_restricted_shared_allocations, 16)
__SYCL_ASPECT(usm_system_allocations, 17)
__SYCL_ASPECT(ext_intel_pci_address, 18)
__SYCL_ASPECT(ext_intel_gpu_eu_count, 19)
__SYCL_ASPECT(ext_intel_gpu_eu_simd_width, 20)
__SYCL_ASPECT(ext_intel_gpu_slices, 21)
__SYCL_ASPECT(ext_intel_gpu_subslices_per_slice, 22)
__SYCL_ASPECT(ext_intel_gpu_eu_count_per_subslice, 23)
__SYCL_ASPECT(ext_intel_max_mem_bandwidth, 24)
__SYCL_ASPECT(ext_intel_mem_channel, 25)
__SYCL_ASPECT(usm_atomic_host_allocations, 26)
__SYCL_ASPECT(usm_atomic_shared_allocations, 27)
__SYCL_ASPECT(atomic64, 28)
__SYCL_ASPECT(ext_intel_device_info_uuid, 29)
__SYCL_ASPECT(ext_oneapi_srgb, 30)
__SYCL_ASPECT(ext_oneapi_native_assert, 31)
__SYCL_ASPECT(host_debuggable, 32)
__SYCL_ASPECT(ext_intel_gpu_hw_threads_per_eu, 33)
__SYCL_ASPECT(ext_oneapi_cuda_async_barrier, 34)
__SYCL_ASPECT(ext_oneapi_bfloat16, 35)
__SYCL_ASPECT(ext_intel_free_memory, 36)
4 changes: 4 additions & 0 deletions sycl/include/sycl/info/aspects_deprecated.def
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
__SYCL_ASPECT_DEPRECATED(int64_base_atomics, 7, "use atomic64 instead")
__SYCL_ASPECT_DEPRECATED(int64_extended_atomics, 8, "use atomic64 instead")
// Special macro for aspects that don't have own token
__SYCL_ASPECT_DEPRECATED_ALIAS(usm_system_allocator, usm_system_allocations, "use usm_system_allocations instead")
Comment on lines +1 to +4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd personally avoid having two files, but I don't have strong argumentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steffenlarsen , I'm not sure if you're "thumbing up" my "I'd avoid" or "don't have strong argumentation".

I'm approving the PR to unblock @KornevNikita but if you'd prefer to have a single file as well we can ask him to address that :) Both pre-/post-commit would work for me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both, really. Having a single file would be a little nicer, but I don't think it's worth sweating over. 😄

1 change: 1 addition & 0 deletions sycl/include/sycl/stl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@
#include <memory>
#include <mutex>
#include <string>
#include <string_view>
#include <vector>

namespace sycl {
Expand Down
1 change: 1 addition & 0 deletions sycl/source/detail/device_binary_image.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -175,6 +175,7 @@ void RTDeviceBinaryImage::init(pi_device_binary Bin) {
ProgramMetadata.init(Bin, __SYCL_PI_PROPERTY_SET_PROGRAM_METADATA);
ExportedSymbols.init(Bin, __SYCL_PI_PROPERTY_SET_SYCL_EXPORTED_SYMBOLS);
DeviceGlobals.init(Bin, __SYCL_PI_PROPERTY_SET_SYCL_DEVICE_GLOBALS);
DeviceRequirements.init(Bin, __SYCL_PI_PROPERTY_SET_SYCL_DEVICE_REQUIREMENTS);
}

DynRTDeviceBinaryImage::DynRTDeviceBinaryImage(
Expand Down
4 changes: 4 additions & 0 deletions sycl/source/detail/device_binary_image.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -190,6 +190,9 @@ class RTDeviceBinaryImage {
const PropertyRange &getProgramMetadata() const { return ProgramMetadata; }
const PropertyRange &getExportedSymbols() const { return ExportedSymbols; }
const PropertyRange &getDeviceGlobals() const { return DeviceGlobals; }
const PropertyRange &getDeviceRequirements() const {
return DeviceRequirements;
}

protected:
void init(pi_device_binary Bin);
Expand All @@ -207,6 +210,7 @@ class RTDeviceBinaryImage {
RTDeviceBinaryImage::PropertyRange ProgramMetadata;
RTDeviceBinaryImage::PropertyRange ExportedSymbols;
RTDeviceBinaryImage::PropertyRange DeviceGlobals;
RTDeviceBinaryImage::PropertyRange DeviceRequirements;
};

// Dynamically allocated device binary image, which de-allocates its binary
Expand Down
56 changes: 50 additions & 6 deletions sycl/source/detail/program_manager/program_manager.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
#include <detail/program_impl.hpp>
#include <detail/program_manager/program_manager.hpp>
#include <detail/spec_constant_impl.hpp>
#include <sycl/aspects.hpp>
#include <sycl/backend_types.hpp>
#include <sycl/context.hpp>
#include <sycl/detail/common.hpp>
Expand Down Expand Up @@ -543,14 +544,57 @@ RT::PiProgram ProgramManager::getBuiltPIProgram(

DeviceImplPtr Dev =
(MustBuildOnSubdevice == PI_TRUE) ? DeviceImpl : RootDevImpl;
auto BuildF = [this, &M, &KSId, &ContextImpl, &Dev, Prg, &CompileOpts,
&LinkOpts, &JITCompilationIsRequired, SpecConsts] {
auto Context = createSyclObjFromImpl<context>(ContextImpl);
auto Device = createSyclObjFromImpl<device>(Dev);
auto Context = createSyclObjFromImpl<context>(ContextImpl);
auto Device = createSyclObjFromImpl<device>(Dev);
const RTDeviceBinaryImage &Img =
getDeviceImage(M, KSId, Context, Device, JITCompilationIsRequired);

// Check that device supports all aspects used by the kernel
const RTDeviceBinaryImage::PropertyRange &ARange =
Img.getDeviceRequirements();

#define __SYCL_ASPECT(ASPECT, ID) \
case aspect::ASPECT: \
return #ASPECT;
#define __SYCL_ASPECT_DEPRECATED(ASPECT, ID, MESSAGE) __SYCL_ASPECT(ASPECT, ID)
// We don't need "case aspect::usm_allocator" here because it will duplicate
// "case aspect::usm_system_allocations", therefore leave this macro empty
#define __SYCL_ASPECT_DEPRECATED_ALIAS(ASPECT, ID, MESSAGE)
auto getAspectNameStr = [](aspect AspectNum) -> std::string {
switch (AspectNum) {
#include <sycl/info/aspects.def>
#include <sycl/info/aspects_deprecated.def>
default:
throw sycl::exception(
errc::kernel_not_supported,
"Unknown aspect " + std::to_string(static_cast<unsigned>(AspectNum)));
}
};
#undef __SYCL_ASPECT_DEPRECATED_ALIAS
#undef __SYCL_ASPECT_DEPRECATED
#undef __SYCL_ASPECT

const RTDeviceBinaryImage &Img =
getDeviceImage(M, KSId, Context, Device, JITCompilationIsRequired);
for (RTDeviceBinaryImage::PropertyRange::ConstIterator It : ARange) {
using namespace std::literals;
if ((*It)->Name != "aspects"sv)
continue;
ByteArray Aspects = DeviceBinaryProperty(*It).asByteArray();
// 8 because we need to skip 64-bits of size of the byte array
auto *AIt = reinterpret_cast<const std::uint32_t *>(&Aspects[8]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this illegal type-punning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the same case in https://github.com/intel/llvm/blob/sycl/sycl/source/detail/program_manager/program_manager.cpp#L1241. Don't really get what you mean, could you please explain?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do this because originally these aspects were uint32_t. Maybe @steffenlarsen can add some clarity?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I don't like how the interface is defined (i.e. we shouldn't be doing this magic here), but that would be outside this PR's scope.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an old patch trying to amend the UB: #5537. It is a little stale now though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated #5537 and added changes for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've uploaded an alternative approach in #7023.

auto *AEnd =
reinterpret_cast<const std::uint32_t *>(&Aspects[0] + Aspects.size());
while (AIt != AEnd) {
auto Aspect = static_cast<aspect>(*AIt);
if (!Dev->has(Aspect))
throw sycl::exception(errc::kernel_not_supported,
"Required aspect " + getAspectNameStr(Aspect) +
" is not supported on the device");
++AIt;
}
}

auto BuildF = [this, &Img, &Context, &ContextImpl, &Device, Prg, &CompileOpts,
&LinkOpts, SpecConsts, &KernelName] {
applyOptionsFromImage(CompileOpts, LinkOpts, Img);

const detail::plugin &Plugin = ContextImpl->getPlugin();
Expand Down