|
1 | 1 | # May'21 release notes
|
2 | 2 |
|
3 | 3 | ## New features
|
| 4 | + - [ESIMD] Allowed ESIMD and regular SYCL kernels to coexist in the same |
| 5 | + translation unit and in the same program. The `-fsycl-explicit-simd` option |
| 6 | + is no longer required for compiling ESIMD code and was deprecated. DPCPP RT |
| 7 | + implicitly appends `-vc-codegen` compile option for ESIMD images. |
4 | 8 | - [ESIMD] Added indirect read and write methods to ESIMD class [8208427]
|
5 | 9 | - Provided `sycl::ONEAPI::has_known_identity` type trait to determine if
|
6 | 10 | reduction interface supports user-defined type at compile-time [0c7bd24]
|
|
22 | 26 | - Implemented zero argument version of `sycl::buffer::reinterpret()` for
|
23 | 27 | SYCL 2020 [c0c3c80]
|
24 | 28 | - Implemented [Matrix Programming Extension for DPC++](https://github.com/intel/llvm/blob/49b6749ea9175ae250b718c04d71af4ccfecc06c/sycl/doc/extensions/Matrix/dpcpp-joint-matrix.asciidoc) [35db973]
|
25 |
| - - Added support SYCL2020 style interoperability API for OpenCL backend |
26 |
| - [12e8549] [c2f211a] |
27 | 29 | - Added support for
|
28 | 30 | [SYCL_INTEL_local_memory extension](doc/extensions/LocalMemory/SYCL_INTEL_local_memory.asciidoc)
|
29 | 31 | [5a66fcb] [9a734f6]
|
|
32 | 34 |
|
33 | 35 | ## Improvements
|
34 | 36 | ### SYCL Compiler
|
35 |
| - - Added support for math builtins: `fmax`, `fmin`, `isinf`, `isfinite`, |
| 37 | + - Added support for math built-ins: `fmax`, `fmin`, `isinf`, `isfinite`, |
36 | 38 | `isnormal`, `fpclassify` [1040b94]
|
37 | 39 | - The FPGA initiation interval attribute spelling `[[intel::ii]]` is
|
38 | 40 | deprecated. The new spelling is `[[intel::initiation_interval]]`. In
|
39 |
| - addition `[[intel::initiation_interval]]` may now be used as a function |
| 41 | + addition, `[[intel::initiation_interval]]` may now be used as a function |
40 | 42 | attribute; formerly its use was limited to statement attribute [b04e6a0]
|
41 | 43 | - Added support for function attribute `[[intel::disable_loop_pipelining]]`
|
42 | 44 | and `[[intel::max_concurrency(n)]]` [7324b3e]
|
|
61 | 63 | - Provided facility for user to control execution range rounding [f6ac45f]
|
62 | 64 | - Ensured correct access mode in `sycl::handler::copy()` method [b489479]
|
63 | 65 | - Disallowed atomic accessors in `sycl::handler::copy()` method [14437db]
|
64 |
| - - Implicitly added `-vc-codegen` compile option for ESIMD images [798b4c5] |
65 | 66 | - Provided move-assignability of `usm_allocator` class [05a805e]
|
66 |
| - - Improved performance when using `COPY_HOST_PTR` at devices without host |
67 |
| - unified memory [ad8c9d1] |
| 67 | + - Improved performance of copying data during native memory object creation |
| 68 | + on devices without host unified memory [ad8c9d1] |
68 | 69 | - [ESIMD] Added implicit set up of fence before barrier as required by hardware
|
69 | 70 | [692228c]
|
70 | 71 | - Allowed use of interoperability program constructor with multi-device
|
71 | 72 | context [c7f7674]
|
72 | 73 | - Allowed trace of Level Zero calls only with `SYCL_PI_TRACE=-1` [ea73219]
|
73 |
| - - Added throw of `feature_not_supported` when upon attempt to create program |
74 |
| - using `create_program_with_source` with Level Zero or CUDA [ba77e3a] |
| 74 | + - Added throw of `feature_not_supported` upon attempt to create |
| 75 | + program using `create_program_with_source` with Level Zero or CUDA [ba77e3a] |
75 | 76 | - Added support for `inline` `cl` namespace in debugger [8e441d4]
|
76 |
| - - Added support build with GCC 7 [d8fea22] |
77 |
| - - Added in-memory caching of programs which are built with custom build options |
| 77 | + - Added support for building with GCC 7 [d8fea22] |
| 78 | + - Added in-memory caching of programs built with custom build options |
78 | 79 | [86b0e8d] [e152b0d]
|
79 | 80 | - Improved range rounding heuristics [7efb692]
|
80 | 81 | - Added `get_backend` methods to SYCL classes [ee7e99f]
|
81 |
| - - Added `sycl::sub_group::load` and `sycl::sub_group::store` versions which |
| 82 | + - Added `sycl::sub_group::load` and `sycl::sub_group::store` versions that |
82 | 83 | take raw pointers [248f550]
|
83 | 84 | - Enabled caching of devices in `sycl::device` interoperability constructors
|
84 | 85 | [d3aeb4a]
|
|
113 | 114 |
|
114 | 115 | ## Bug fixes
|
115 | 116 | ### SYCL Compiler
|
116 |
| - - Suppressed link time warning on Windows which incorrectly diagnoses |
| 117 | + - Suppressed link time warning on Windows that incorrectly diagnosed |
117 | 118 | conflicting section names while linking device binaries [8e6a3ec]
|
118 | 119 | - Disabled code coverage for device compilations [12a0b11]
|
119 | 120 | - Fixed an issue when unbundling a fat static archive and targeting non-FPGA
|
120 | 121 | device [90c79c7]
|
121 | 122 | - Addressed inconsistencies when performing compilations by using the target
|
122 | 123 | triple for FPGA (`spir64_fpga-unknown-unknown-sycldevice`) vs using
|
123 | 124 | `-fintelfpga` [c9a65fc]
|
124 |
| - - Fixed generation the output report folder when performing FPGA AOT |
| 125 | + - Fixed generation of the output report folder when performing FPGA AOT |
125 | 126 | compilations from a previously generated AOCR archive [eab4791]
|
126 | 127 | - Addressed issues dealing with improper settings when performing
|
127 | 128 | preprocessing when offloading is enabled [d03de03]
|
|
137 | 138 | `-fsycl-device-only` [3d2225a]
|
138 | 139 | ### SYCL Library
|
139 | 140 | - Fixed race-condition happening on application exit [8eb00d7] [c9c1de9]
|
140 |
| - - Fixed faulty behaviour which happened when accessing a buffer in different |
| 141 | + - Fixed faulty behaviour that happened when accessing a buffer in different |
141 | 142 | contexts using `discard_*` access mode [f75b439]
|
142 | 143 | - Fixed support for `SYCL_PROGRAM_LINK_OPTIONS` and
|
143 | 144 | `SYCL_PROGRAM_COMPILE_OPTIONS` environment variables when compiling/linking
|
|
151 | 152 | `sycl::buffer::set_final_data()` method [084d83a, 2a751bd]
|
152 | 153 | - Fixed support for `long long` in `sycl::vec::convert()` on Windows [5b49cd3]
|
153 | 154 | - Aligned local and image accessor with specification by allowing for property
|
154 |
| - list in its constructor [88fab25] |
| 155 | + list in their constructor [88fab25] |
155 | 156 | - Fixed support for offset in `parallel_for` for host device [1958715]
|
156 | 157 | - Added missing constructors for `sycl::buffer` class [bdfad9e]
|
157 | 158 | - Fixed coordinate conversion for `sampler` class on host device [cd6529f]
|
158 |
| - - Fixed for support of local accessor in debugger [fdacb75] |
| 159 | + - Fixed support for local accessors in debugger [fdacb75] |
159 | 160 | - Fixed dropping of kernel attributes when execution range rounding is used
|
160 | 161 | [496f9a0] [677a7ea]
|
161 |
| - - Added support for interoperability tasks which use `get_mem()` methods with |
| 162 | + - Added support for interoperability tasks that use `get_mem()` methods with |
162 | 163 | Level Zero plugin [149f08d]
|
163 | 164 | - Fixed sub-device caching in the Level Zero plugin [0b18b49]
|
164 | 165 | - Fixed `get_native` methods to retain reference counter in case of OpenCL
|
|
167 | 168 | they have been signaled [2a76b2a]
|
168 | 169 | - Resolved a pinned host memory specific performance regression on CUDA that
|
169 | 170 | was introduced with the host unified behavior dependent logic [3be63ab]
|
170 |
| - - Fixed illegal accesses which could happen when an application which uses |
171 |
| - host tasks exits without waiting for host tasks completion [552a521] |
| 171 | + - Fixed illegal accesses that could happen when an application that uses host |
| 172 | + tasks exits without waiting for host task completion [552a521] |
172 | 173 | - Fixed `sycl::event::get_info` queries that were working incorrectly when
|
173 | 174 | called on event without an encapsulated native handle [5d5a792]
|
174 | 175 | - Fixed compilation error when using multidimensional subscript for
|
175 | 176 | `sycl::accessor` with atomic access mode [0bfd34e]
|
176 |
| - - Fixed a crash which happened when an accessor which is passed to the |
177 |
| - reduction is just after being passed to reduction [b80f13e] |
| 177 | + - Fixed a crash that happened when an accessor passed to a reduction was |
| 178 | + destroyed immediately after [b80f13e] |
178 | 179 | - Fixed `sycl::device::get_info` with `sycl::info::device::max_mem_alloc_size`
|
179 | 180 | which was returning incorrect value in case of Level Zero backend [8dbaa53]
|
180 | 181 |
|
|
202 | 203 | versions of C++ RT used on app and sycl[d].dll sides.
|
203 | 204 | - The format of the object files produced by the compiler can change between
|
204 | 205 | versions. The workaround is to rebuild the application.
|
205 |
| - - Using `cl::sycl::program` API to refer to a kernel defined in another |
206 |
| - translation unit leads to undefined behavior |
| 206 | + - Using `sycl::program`/`sycl::kernel_bundle` API to refer to a kernel defined |
| 207 | + in another translation unit leads to undefined behavior |
207 | 208 | - Linkage errors with the following message:
|
208 | 209 | `error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already defined`
|
209 | 210 | can happen when a SYCL application is built using MS Visual Studio 2019
|
|