-
Notifications
You must be signed in to change notification settings - Fork 778
[Driver][SYCL] Improve dependency file behaviors for -fintelfpga #1145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Driver][SYCL] Improve dependency file behaviors for -fintelfpga #1145
Conversation
This patch improves the tool's diagnostic upon finding a SPIR kernel within an LLVM module. Despite that the tool's only current use is within the SYCL FPGA flow, it's important to make the message target-agnostic, so that the tool is not tied to a particular device BE. A related commit to the Clang driver has extended these diagnostics with SYCL FPGA specifics without affecting the tool itself. This patch also introduces testing for the return code value. For example, this should allow the Clang driver users/developers to differentiate between the two possible causes of llvm-no-spir-kernel failure. Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Signed-off-by: Alexey Bader <alexey.bader@intel.com>
intel#1141) Signed-off-by: Aleksander Fadeev <aleksander.fadeev@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, just a couple more nits
// RUN: %clangxx -### -fsycl -fintelfpga %t-1.o %t-2.o 2>&1 \ | ||
// RUN: | FileCheck -check-prefix=CHK-FPGA-DEP-FILES-OBJ -DINPUT1=%t-1.o -DINPUT2=%t-2.o %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a clang-cl run be also applicable? It may be worth adding as per the usual "just in case"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a clang_cl test here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add it for other tests? Sorry for missing this initially
// RUN: %clangxx -### -fsycl -fintelfpga %t-1.o %t-2.o 2>&1 \ | ||
// RUN: | FileCheck -check-prefix=CHK-FPGA-DEP-FILES-OBJ -DINPUT1=%t-1.o -DINPUT2=%t-2.o %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add it for other tests? Sorry for missing this initially
Move internal headers from include/CL/sycl to source directory to prevent implementation details leak to user application and enforce stable ABI. A few more changes were applied to make the movement possible: - addHostAccessorAndWait functions in accessor to avoid calls to RT internals from header file - Removed getImageInfo - Move buffer size acquisition from buffer constructor to SYCLMemObjT cpp to avoid calls to PI - getPluginFromContext function in context - Standard containers replaced with SYCL variants in sycl_mem_obj_i.hpp. Unique ptr replaced with shared - A few implementations moved from queue.hpp to queue.cpp - Some LIT tests temporarily include implementaion specific headers. They will be converted to unit tests later. Signed-off-by: Alexander Batashev <alexander.batashev@intel.com>
intel#1144) Since we really just want to be able to memcpy the type to the device, 'is-trivially-copyable' is not the correct trait. Since CWG1734, If we want to support trivially copyable types, we would be required to create 1 of 4 different mechanisms for having a type on the device (depending on the way the type is structured). Additionally, 2 of these ways require us to ALSO have the type be default constructible. This patch transitions to trivially-copy-constructible , so that we can simply memcpy from the existing one into new memory. Signed-off-by: Erich Keane <erich.keane@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
intel#1118) Signed-off-by: James Brodman <james.brodman@intel.com>
LowerWGScope pass performs required transformations to enable hierarchical parallelism semantics. This pass should not be skipped even if optimizations are disabled. Also some typos in the comments are fixed. Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
…el#1156) After intel#1068 has included the Demangle header, this fix to CMakeLists should guarantee successful builds in all configurations Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
SPIR-V OpGroupBroadcast accepts three forms of local ID: - scalar integer - vector integer with 2 components - vector integer with 3 components Signed-off-by: John Pennycook <john.pennycook@intel.com>
Also remove idle semicolon. Signed-off-by: Alexey Bader <alexey.bader@intel.com>
…#1162) Fix the cl_device_unified_shared_memory_capabilities_intel bitfield type name. Signed-off-by: Alexey Bader <alexey.bader@intel.com>
* [SYCL][LIBCLC] Additional libclc builtins to support SYCL work Adds builtins to libclc to support the CUDA backend for SYCL. Contributors Alexander Johnston <alexander@codeplay.com> David Wood <david.wood@codeplay.com> Victor Lomuller <victor@codeplay.com> Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL] CMake and lit support for SYCL CUDA backend Adds defines CMake and lit variables used for SYCL CUDA backend development and test Contributors Alexander Johnston <alexander@codeplay.com> Bjoern Knafla <bjoern@codeplay.com> Ruyman Reyes <ruyman@codeplay.com> Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL] Local Accessor Support for CUDA Provides the LocalAccessorToSharedMemory compiler pass required for supporting SYCL local accessors in CUDA. Contributors Alexander Johnston <alexander@codeplay.com> David Wood <david.wood@codeplay.com> Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL][CUDA] Change __spirv_BuiltIn.. to functions Changes the following builtins to functions __spirv_BuiltInGlobalSize __spirv_BuiltInWorkgroupSize __spirv_BuiltInNumWorkgroups __spirv_BuiltInLocalInvocationId __spirv_BuiltInWorkgroupId __spirv_BuiltInGlobalOffset Contributors David Wood <david.wood@codeplay.com> Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL][CUDA] Add SYCL CUDA support to clang driver Adds CUDA support for sycl compilation in the clang driver Contributors Alexander Johnston <alexander@codeplay.com> David Wood <david.wood@codeplay.com> Victor Lomuller <victor@codeplay.com> Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL][CUDA] Initial Implementation of the CUDA backend Contributors Alan Forbes <alan.forbes@codeplay.com> Alexander Johnston <alexander@codeplay.com> Bjoern Knafla <bjoern@codeplay.com> Daniel Soutar <daniel.soutar@codeplay.com> David Wood <david.wood@codeplay.com> Kumudha Narasimhan <kumudha.narasimhan@codeplay.com> Mehdi Goli <mehdi.goli@codeplay.com> Przemek Malon <przemek.malon@codeplay.com> Ruyman Reyes <ruyman@codeplay.com> Stuart Adams <stuart.adams@codeplay.com> Svetlozar Georgiev <svetlozar.georgiev@codeplay.com> Steffen Larsen <steffen.larsen@codeplay.com> Victor Lomuller <victor@codeplay.com> Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL] Update libclc install rules Have libclc install clc-* and libspirv-* to lib and share Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL][CUDA] Inline cl namespace to simplify SYCL API usage Synchronise the CUDA backend with the general SYCL changes from intel#974. Signed-off-by: Andrea Bocci <andrea.bocci@cern.ch> * Added missing flags for device-side builtins Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL][CUDA] Removing unnecessary tool from the tree Acked-by: Victor Lomuller <victor@codeplay.com> Signed-off-by: Ruyman <ruyman@codeplay.com> * [SYCL][PI] Fix kernel group info parameter conversion Signed-off-by: Steffen Larsen <steffen.larsen@codeplay.com> * [SYCL][CUDA] Refactor __SYCL_INLINE macro Synchronise the CUDA backend with the general SYCL changes from intel#1121. Signed-off-by: Andrea Bocci <andrea.bocci@cern.ch> * [SYCL] Have default_selector consider SYCL_BE Have the default_selector consider the env var SYCL_BE when rating device scores to make choosing a backend easier. Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL] Select GlobalPlugin based on SYCL_BE Rather than choose the last found plugin as GlobalPlugin, select it depending on the SYCL_BE env var. Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL] Improve default device selection checks Better checks for CUDA and OpenCL devices to match with SYCL_BE in the default device selection, based on the platform version info. Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL] Formatting update for device_selector.cpp Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL] Changed CUDA unit tests to call through plugin Signed-off-by: Steffen Larsen <steffen.larsen@codeplay.com> * [SYCL] Pass SYCL_BE=PI_OPENCL in check-sycl To ensure that the check-sycl targets test OpenCL devices, pass SYCL_BE=PI_OPENCL. This mirrors the check-sycl-cuda target which passes SYCL_BE=PI_CUDA. Without this it is nondeterministic which device is tested by check-sycl. Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL][CUDA] Remove PI_CUDA specific details from clang Removes PI_CUDA specific code paths and tests from clang, opting to always enable them. Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL][CUDA] Disable linear_id/opencl-interop.cpp for cuda Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL][CUDA] Further fixes to CUDA device selection Fix platform string comparison for CUDA platform detection. Fix device info platform query so that it uses the device's plugin, rather than the GlobalPlugin. Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL][CUDA] Code style and cleanup to CUDA support Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL] Enable asserts in all buildbot builds Signed-off-by: Alexander Johnston <alexander@codeplay.com> * [SYCL][CUDA] Minor test and build configuration Fix minor test and build configuration issues introduced in the development of the CUDA backend. Signed-off-by: Alexander Johnston <alexander@codeplay.com> Co-authored-by: Andrea Bocci <andrea.bocci@cern.ch> Co-authored-by: Ruyman <ruyman@codeplay.com> Co-authored-by: Steffen Larsen <56076654+steffenlarsen@users.noreply.github.com>
Signed-off-by: Alexey Bader alexey.bader@intel.com Co-Authored-By: Alexander Batashev <alexbatashev@outlook.com>
Error was reproducible in two cases: - using something like `numeric_limits<half>::min()` in within another `constexpr` - not treating SYCL headers as system ones with `-Winvalid-constexpr` treated as error Signed-off-by: Alexey Sachkov <alexey.sachkov@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Event type triggers are misspelled "open"->"opened", etc. Default event type triggers should work fine. Signed-off-by: Alexey Bader <alexey.bader@intel.com>
…1053) We had issue with wrong mangling of s_upsample. I fixed it a long time ago, so we can delete workaround now. Signed-off-by: Ilya Mashkov <ilya.mashkov@intel.com>
Signed-off-by: Igor Dubinov <igor.dubinov@intel.com>
When compiling AOT for FPGA dependency files are generated that are used by the aoc compilation. Single step compilation is seamless as the dependency file is generated then immediately used. When compiling to object, keeping track of the dependency file is not so intuitive. To alleviate problems of not being able to find the dependency file, the original dependency file is bundled up with the destination fat object and when used is unbundled and passed to the aoc compilation. Signed-off-by: Michael D Toguchi <michael.d.toguchi@intel.com>
Signed-off-by: Michael D Toguchi <michael.d.toguchi@intel.com>
Signed-off-by: Michael D Toguchi <michael.d.toguchi@intel.com>
Use new FPGA dependency type which is used for unbundling of the FPGA dependency from an object. Signed-off-by: Michael D Toguchi <michael.d.toguchi@intel.com>
Signed-off-by: Michael D Toguchi <michael.d.toguchi@intel.com>
Use TY_FPGA_Dependencies for generated file from unbundle Clean up comments and add clang_cl specific test for dep unbundle Signed-off-by: Michael D Toguchi <michael.d.toguchi@intel.com>
Signed-off-by: Michael D Toguchi <michael.d.toguchi@intel.com>
Signed-off-by: Michael D Toguchi <michael.d.toguchi@intel.com>
Signed-off-by: Michael D Toguchi <michael.d.toguchi@intel.com>
@mdtoguchi, it looks like something went wrong with your branch. It has irrelevant commits already present in the target branch. |
whoa. I don't know what happened. I rebased to fix the conflict. I'll start from scratch. |
Created a new PR: #1186 |
redo has been merged, closing this one. |
It is translated to a function with unmangled name __spirv_BuildNDRange_{1|2|3}D with struct return parameter and array arguments, since translator only translates it properly to SPIR-V with this signature. _ND postfix is requred because array arguments are mangled in the same way, so if there was no postfix, translator would produce functions with same name for different dimensions. Original commit: KhronosGroup/SPIRV-LLVM-Translator@a6ca745
When compiling AOT for FPGA dependency files are generated that are used
by the aoc compilation. Single step compilation is seamless as the dependency
file is generated then immediately used. When compiling to object, keeping
track of the dependency file is not so intuitive.
To alleviate problems of not being able to find the dependency file, the
original dependency file is bundled up with the destination fat object and
when used is unbundled and passed to the aoc compilation.
Signed-off-by: Michael D Toguchi michael.d.toguchi@intel.com