Tags: mmoadeli/llvm
[ESIMD] Fix inconsistencies in the ESIMD API signatures. (intel#4800)
This is an API-breaking patch.
- Remove the 'esimd_' prefix from public API names. The prefix is redundant because all APIs are already in the ...esimd:: namespace. API names follow C-for-Metal naming.
- esimd_sat -> saturate
Memory access API changes:
- Un-deprecate block_load/store.
- Rename slm_atomic -> slm_atomic_update and flat_atomic -> atomic_update. The name 'atomic' is not used, to avoid a conflict with the C++ atomic class.
- Offsets are now expected in bytes (previously in elements). (in scatter, gather, scalar_load, scalar_store)
- Make the 'offsets' argument precede the 'val' argument, for consistency with other memory APIs. (in scatter, scatter_rgba, slm_scatter)
- Remove unused L1 / L3 CacheHint template parameters. (in scatter, gather, scalar_load, scalar_store, scatter_rgba, gather_rgba)
- Rename enum EsimdFenceMask -> enum fence_mask and its elements (the old ones are deprecated).
Old behavior is preserved in deprecated APIs:
- scatter -> scatter1
- gather -> gather1
- scalar_load -> scalar_load1
- scalar_store -> scalar_store1
- gather_rgba -> gather4
- scatter_rgba -> scatter4
- slm_scatter -> slm_store
Signed-off-by: Konstantin S Bobrovsky <konstantin.s.bobrovsky@intel.com>
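The switch from element offsets to byte offsets means callers that used to pass element indices now have to scale them by the element size. The following is a minimal sketch of that conversion in plain C++; `to_byte_offsets` is a hypothetical helper introduced here for illustration and is not part of the ESIMD API, and no ESIMD call is shown.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper (NOT an ESIMD function): converts per-lane element
// indices into per-lane byte offsets for elements of type T.
template <typename T, std::size_t N>
std::array<std::uint32_t, N>
to_byte_offsets(const std::array<std::uint32_t, N> &element_offsets) {
  std::array<std::uint32_t, N> byte_offsets{};
  for (std::size_t i = 0; i < N; ++i) {
    // Guard against overflow of the 32-bit offset when scaling.
    assert(element_offsets[i] <=
           std::numeric_limits<std::uint32_t>::max() / sizeof(T));
    byte_offsets[i] =
        element_offsets[i] * static_cast<std::uint32_t>(sizeof(T));
  }
  return byte_offsets;
}
```

For example, with 4-byte `float` elements, element indices {0, 1, 2, 3} become byte offsets {0, 4, 8, 12}.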
[sycl-post-link] Fix calculation of bool spec constant size (intel#4819) Switched to using `DataLayout` instead of `[Scalar/Primitive]TypeSize`, because the latter may not reflect the size of the memory allocated for an instance of the type or the number of bytes written when an instance of the type is stored to memory. This fixes a few occurrences where booleans were considered to be zero bytes long.
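One way a boolean can end up "zero bytes long" is by dividing its width in bits by 8 with truncation. The sketch below illustrates that arithmetic in plain C++ under the assumption that the boolean is a 1-bit type; the function names are hypothetical and this is not the sycl-post-link code, only an illustration of why a rounded-up store size differs from a truncated one.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Truncating conversion from bits to bytes: a 1-bit value yields 0 bytes.
inline std::uint64_t truncated_byte_size(std::uint64_t bits) {
  return bits / 8;
}

// Rounded-up conversion: the number of bytes touched when the value is
// stored to memory, so a 1-bit value yields 1 byte.
inline std::uint64_t store_byte_size(std::uint64_t bits) {
  assert(bits <= std::numeric_limits<std::uint64_t>::max() - 7);
  return (bits + 7) / 8;
}
```

With a 1-bit input, the truncating version returns 0 while the rounded-up version returns 1; for widths that are multiples of 8 the two agree.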
[SYCL][CUDA][HIP] Update CUDA and HIP libspirv file diagnostic errors (intel#4804) This patch updates the CUDA and HIP diagnostic errors for libspirv. Currently, they do not reflect that the libspirv file is now remangled and that different variants are looked up depending on the OS. The HIP diagnostic also reuses the CUDA error, producing an incorrect message. This patch resolves both problems and adds HIP tests for the error messages. This is proposed as a fix for intel#4370.
[Driver][SYCL][FPGA] Implied default device forces emulation (intel#4801) When using -Xshardware or -Xssimulation on the command line with FPGA, the device AOT compilation should be performed by aoc. This was not occurring because of the implied default device when such an object or archive was supplied on the command line. Adjust the discovery of FPGA hardware/simulation mode to check only the actual FPGA toolchain rather than the first offload toolchain encountered.
[Driver][SYCL] Fix -Xsycl option triple check with multiple -fsycl-targets (intel#4789) When multiple -fsycl-targets options are passed on the command line, the last one wins. There is a check on the number of target triples provided by the user when applying -Xsycl-target* options to the device compilation. We were improperly counting all of the -fsycl-targets values instead of taking only the last one into account.
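The "last one wins" rule can be sketched as follows. This is a simplified illustration in plain C++, not the clang driver's option-handling code: `effective_sycl_targets` is a hypothetical function, and it assumes the `-fsycl-targets=<triple>[,<triple>...]` spelling with comma-separated values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the triples from the LAST -fsycl-targets=
// option only, ignoring the values of any earlier occurrences.
std::vector<std::string>
effective_sycl_targets(const std::vector<std::string> &args) {
  const std::string prefix = "-fsycl-targets=";
  std::string last_value;
  for (const auto &arg : args)
    if (arg.compare(0, prefix.size(), prefix) == 0)
      last_value = arg.substr(prefix.size()); // later options overwrite earlier

  // Split the surviving value on commas, skipping empty pieces.
  std::vector<std::string> triples;
  std::size_t start = 0;
  while (!last_value.empty() && start <= last_value.size()) {
    std::size_t comma = last_value.find(',', start);
    if (comma == std::string::npos)
      comma = last_value.size();
    assert(comma >= start); // find() never returns a position before start
    if (comma > start)
      triples.push_back(last_value.substr(start, comma - start));
    start = comma + 1;
  }
  return triples;
}
```

Given `-fsycl-targets=spir64,nvptx64-nvidia-cuda` followed by `-fsycl-targets=spir64_fpga`, the function yields a single triple, so a check on the number of triples should see one target rather than three.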
[SPIR-V][Doc] Add SPV_INTEL_uniform_group_instructions extension spec (intel#4794) Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
[CODEOWNERS] Fix codeowners' directory for SPIR-V specs (intel#4795) Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
[SYCL] Include backend-specific header if exists (intel#4783) https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:headers-and-namespaces introduces the backend-specific headers "sycl/backend/<backend_name>.hpp". CTS tests assume that if a backend macro is defined (e.g. SYCL_BACKEND_OPENCL), the backend is fully functional without extra includes. The patch therefore adds an include of the backend-specific header when the backend is enabled. The change should fix the recent massive failure of CTS tests with the error: implicit instantiation of undefined template 'sycl::interop<sycl::backend::opencl, sycl::platform>' Co-authored-by: Mikhail Lychkov <mikhail.lychkov@intel.com>
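A conditional include of this kind could look roughly like the sketch below. It assumes the C++17 `__has_include` preprocessor operator and the header path pattern quoted from the specification; whether the actual patch uses this mechanism is not stated in the commit message, and `kOpenCLBackendHeaderIncluded` is a hypothetical name used only for illustration.

```cpp
// Pull in the backend-specific header only when the backend macro is defined
// AND the header is actually present on the include path.
#ifdef SYCL_BACKEND_OPENCL
#  if __has_include(<sycl/backend/opencl.hpp>)
#    include <sycl/backend/opencl.hpp>
#    define OPENCL_BACKEND_HEADER_INCLUDED 1
#  endif
#endif

#ifndef OPENCL_BACKEND_HEADER_INCLUDED
#  define OPENCL_BACKEND_HEADER_INCLUDED 0
#endif

// Hypothetical flag recording whether the header was found and included.
constexpr bool kOpenCLBackendHeaderIncluded =
    OPENCL_BACKEND_HEADER_INCLUDED != 0;
```

With a compiler that does not define SYCL_BACKEND_OPENCL, nothing is included and the flag is false, so the translation unit still compiles.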
[SYCL][LIBCLC] Change __clc_size_t to unsigned (intel#4784) This changes `__clc_size_t` from a signed 64-bit int to an unsigned 64-bit int. The change results in the correct mangled name for the `GroupAsyncCopy` built-ins. This is a proposed solution to issue intel#4502.
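Signedness affects the mangled name because C++ name mangling encodes each parameter type; under the Itanium C++ ABI, for instance, `long` is encoded as `l` and `unsigned long` as `m`. The sketch below shows the underlying point in plain C++: a signed and an unsigned size type are distinct types, so functions taking them are distinct overloads with distinct symbols. `copy_variant` is a hypothetical stand-in and does not reproduce the real `GroupAsyncCopy` signature or its mangled name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

using unsigned_size = std::size_t;
using signed_size = std::make_signed_t<std::size_t>;

// The two types differ only in signedness, yet they are different types.
static_assert(!std::is_same<signed_size, unsigned_size>::value,
              "signed and unsigned size types must be distinct");

// Hypothetical stand-ins for a built-in taking a size argument. Each overload
// gets its own symbol, so a caller expecting one cannot link against the other.
inline std::string copy_variant(signed_size) { return "signed"; }
inline std::string copy_variant(unsigned_size) { return "unsigned"; }
```

Calling `copy_variant` with a `std::size_t` argument selects the unsigned overload, which mirrors why a declaration with the wrong signedness would refer to a different symbol than the one callers look for.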