|
| 1 | +# SYCL(TM) Proposal: Intel's Extensions for Device Information |
| 2 | + |
| 3 | +**IMPORTANT**: This specification is a draft. |
| 4 | + |
| 5 | +**NOTE**: Khronos(R) is a registered trademark and SYCL(TM) is a trademark of the Khronos Group, Inc. |
| 6 | + |
| 7 | +New device descriptors will be added to provide access to low-level hardware details about Intel GPU devices. This information will be useful to developers tuning specifically for those devices. |
| 8 | + |
| 9 | +This proposal details what is required to provide this information as a SYCL extension. |
| 10 | + |
| 11 | +## Feature Test Macro ## |
| 12 | + |
| 13 | +The Feature Test Macro will be defined as: |
| 14 | + |
| 15 | + #define SYCL_EXT_INTEL_DEVICE_INFO 1 |
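As a minimal sketch of how application code might use this macro, the helper below reports at compile time whether the extension is advertised. The function name `has_intel_device_info_ext` is hypothetical and not part of this proposal. The sketch assumes the macro is defined by the SYCL headers, so those headers would need to be included before this code for the check to report true.

```cpp
#include <cassert>

// Hypothetical helper (not part of the proposal): true only when the
// implementation defines the feature test macro for this extension.
// Assumption: the SYCL headers have been included before this point
// on implementations that provide the extension.
constexpr bool has_intel_device_info_ext() {
#ifdef SYCL_EXT_INTEL_DEVICE_INFO
  return true;
#else
  return false;
#endif
}
```

Code that queries the descriptors described below could be wrapped in `#ifdef SYCL_EXT_INTEL_DEVICE_INFO` so that it still compiles on implementations without the extension.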
| 16 | + |
| 17 | + |
| 18 | + |
| 19 | +# PCI Address # |
| 20 | + |
| 21 | +A new device descriptor will be added which will provide the PCI address in BDF format. BDF format contains the address as: `domain:bus:device.function`. |
| 22 | + |
| 23 | +This new device descriptor is only available for devices in the Level Zero platform, and the matching aspect is only true for those devices. The DPC++ default behavior is to expose GPU devices through the Level Zero platform. |
| 24 | + |
| 25 | + |
| 26 | +## Device Information Descriptors ## |
| 27 | + |
| 28 | +| Device Descriptors | Return Type | Description | |
| 29 | +| ------------------ | ----------- | ----------- | |
| 30 | +| info\:\:device\:\:ext\_intel\_pci\_address | std\:\:string | For the Level Zero backend, returns the PCI address in BDF format: `domain:bus:device.function`.| |
| 31 | + |
| 32 | + |
| 33 | +## Aspects ## |
| 34 | + |
| 35 | +A new aspect, ext\_intel\_pci\_address, will be added. |
| 36 | + |
| 37 | +## Error Condition ## |
| 38 | + |
| 39 | +An invalid object runtime error will be thrown if the device does not support aspect\:\:ext\_intel\_pci\_address. |
| 40 | + |
| 41 | + |
| 42 | +## Example Usage ## |
| 43 | + |
| 44 | +The PCI address can be obtained using the standard get\_info() interface. |
| 45 | + |
| 46 | + if (dev.has(aspect::ext_intel_pci_address)) { |
| 47 | + auto BDF = dev.get_info<info::device::ext_intel_pci_address>(); |
| 48 | + } |
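Because the descriptor returns the address as a string, callers that need the individual fields have to split it themselves. The following is a minimal sketch of one way to do that in plain C++. The `PciAddress` type and the `parse_bdf` function are hypothetical helpers, not part of this proposal, and the sketch assumes each field is written in hexadecimal, which the proposal does not state.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper type, not part of the proposed extension.
struct PciAddress {
  unsigned domain;
  unsigned bus;
  unsigned device;
  unsigned function;
};

// Parses "domain:bus:device.function", e.g. "0000:03:00.0".
// Assumption: every field is hexadecimal.
// Returns false if the string does not match the expected layout.
bool parse_bdf(const std::string &bdf, PciAddress &out) {
  PciAddress a{};
  int consumed = 0;
  // %n stores the number of characters read so far, which lets us
  // reject input that has extra characters after the function field.
  const int fields = std::sscanf(bdf.c_str(), "%x:%x:%x.%x%n", &a.domain,
                                 &a.bus, &a.device, &a.function, &consumed);
  if (fields != 4 || static_cast<std::size_t>(consumed) != bdf.size())
    return false;
  out = a;
  return true;
}
```

With the descriptor from this proposal, the string returned by `dev.get_info<info::device::ext_intel_pci_address>()` could be passed directly to `parse_bdf`.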
| 49 | + |
| 50 | + |
| 51 | + |
| 52 | +# Intel GPU Execution Unit SIMD Width # |
| 53 | + |
| 54 | +A new device descriptor will be added which will provide the physical SIMD width of an execution unit on an Intel GPU. This data will be used to calculate the computational capabilities of the device. |
| 55 | + |
| 56 | +This new device descriptor is only available for devices in the Level Zero platform, and the matching aspect is only true for those devices. The DPC++ default behavior is to expose GPU devices through the Level Zero platform. |
| 57 | + |
| 58 | + |
| 59 | +## Device Information Descriptors ## |
| 60 | + |
| 61 | +| Device Descriptors | Return Type | Description | |
| 62 | +| ------------------ | ----------- | ----------- | |
| 63 | +| info\:\:device\:\:ext\_intel\_gpu\_eu\_simd\_width | uint32\_t| Returns the physical SIMD width of the execution unit (EU).| |
| 64 | + |
| 65 | + |
| 66 | +## Aspects ## |
| 67 | + |
| 68 | +A new aspect, ext\_intel\_gpu\_eu\_simd\_width, will be added. |
| 69 | + |
| 70 | + |
| 71 | +## Error Condition ## |
| 72 | + |
| 73 | +An invalid object runtime error will be thrown if the device does not support aspect\:\:ext\_intel\_gpu\_eu\_simd\_width. |
| 74 | + |
| 75 | +## Example Usage ## |
| 76 | + |
| 77 | +The physical EU SIMD width can be obtained using the standard get\_info() interface. |
| 78 | + |
| 79 | + if (dev.has(aspect::ext_intel_gpu_eu_simd_width)) { |
| 80 | + auto euSimdWidth = dev.get_info<info::device::ext_intel_gpu_eu_simd_width>(); |
| 81 | + } |
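To illustrate how the SIMD width might feed into an estimate of computational capability, here is a rough sketch in plain C++. The function names are hypothetical. The clock frequency and the number of operations per lane per cycle are caller-supplied assumptions: the clock could come from the standard `info::device::max_clock_frequency` query (reported in MHz), and the per-lane factor depends on the hardware and on how operations are counted.

```cpp
#include <cassert>
#include <cstdint>

// Total physical SIMD lanes on the device: EU count times per-EU SIMD width.
// The product is computed in 64 bits so it cannot overflow.
std::uint64_t total_simd_lanes(std::uint32_t eu_count,
                               std::uint32_t eu_simd_width) {
  return static_cast<std::uint64_t>(eu_count) * eu_simd_width;
}

// Rough peak rate in operations per second.
// clock_mhz and ops_per_lane_per_cycle are assumptions supplied by the
// caller (for example, 2.0 if a fused multiply-add counts as two operations).
double peak_ops_per_second(std::uint32_t eu_count, std::uint32_t eu_simd_width,
                           std::uint32_t clock_mhz,
                           double ops_per_lane_per_cycle) {
  assert(ops_per_lane_per_cycle >= 0.0);
  const double lanes =
      static_cast<double>(total_simd_lanes(eu_count, eu_simd_width));
  return lanes * clock_mhz * 1e6 * ops_per_lane_per_cycle;
}
```

The result is an upper bound for tuning purposes only; achieved throughput depends on the kernel and on factors these descriptors do not expose.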
| 82 | + |
| 83 | + |
| 84 | + |
| 85 | +# Intel GPU Execution Unit Count # |
| 86 | + |
| 87 | +A new device descriptor will be added which will provide the number of execution units on an Intel GPU. If the device is a subdevice, then the number of EUs in the subdevice is returned. |
| 88 | + |
| 89 | +This new device descriptor will provide the same information that "max\_compute\_units" provides today. It is proposed so that there is a query specific to Intel GPUs. |
| 90 | + |
| 91 | +This new device descriptor is only available for devices in the Level Zero platform, and the matching aspect is only true for those devices. The DPC++ default behavior is to expose GPU devices through the Level Zero platform. |
| 92 | + |
| 93 | + |
| 94 | +## Device Information Descriptors ## |
| 95 | + |
| 96 | +| Device Descriptors | Return Type | Description | |
| 97 | +| ------------------ | ----------- | ----------- | |
| 98 | +| info\:\:device\:\:ext\_intel\_gpu\_eu\_count | uint32\_t| Returns the number of execution units (EUs) associated with the Intel GPU.| |
| 99 | + |
| 100 | + |
| 101 | +## Aspects ## |
| 102 | + |
| 103 | +A new aspect, ext\_intel\_gpu\_eu\_count, will be added. |
| 104 | + |
| 105 | + |
| 106 | +## Error Condition ## |
| 107 | + |
| 108 | +An invalid object runtime error will be thrown if the device does not support aspect\:\:ext\_intel\_gpu\_eu\_count. |
| 109 | + |
| 110 | +## Example Usage ## |
| 111 | + |
| 112 | +The number of EUs can be obtained using the standard get\_info() interface. |
| 113 | + |
| 114 | + if (dev.has(aspect::ext_intel_gpu_eu_count)) { |
| 115 | + auto euCount = dev.get_info<info::device::ext_intel_gpu_eu_count>(); |
| 116 | + } |
| 117 | + |
| 118 | + |
| 119 | + |
| 120 | +# Intel GPU Number of Slices # |
| 121 | + |
| 122 | +A new device descriptor will be added which will provide the number of slices on an Intel GPU. If the device is a subdevice, then the number of slices in the subdevice is returned. |
| 123 | + |
| 124 | +This new device descriptor is only available for devices in the Level Zero platform, and the matching aspect is only true for those devices. The DPC++ default behavior is to expose GPU devices through the Level Zero platform. |
| 125 | + |
| 126 | + |
| 127 | +## Device Information Descriptors ## |
| 128 | + |
| 129 | +| Device Descriptors | Return Type | Description | |
| 130 | +| ------------------ | ----------- | ----------- | |
| 131 | +| info\:\:device\:\:ext\_intel\_gpu\_slices | uint32\_t| Returns the number of slices.| |
| 132 | + |
| 133 | + |
| 134 | +## Aspects ## |
| 135 | + |
| 136 | +A new aspect, ext\_intel\_gpu\_slices, will be added. |
| 137 | + |
| 138 | + |
| 139 | +## Error Condition ## |
| 140 | + |
| 141 | +An invalid object runtime error will be thrown if the device does not support aspect\:\:ext\_intel\_gpu\_slices. |
| 142 | + |
| 143 | +## Example Usage ## |
| 144 | + |
| 145 | +The number of slices can be obtained using the standard get\_info() interface. |
| 146 | + |
| 147 | + if (dev.has(aspect::ext_intel_gpu_slices)) { |
| 148 | + auto slices = dev.get_info<info::device::ext_intel_gpu_slices>(); |
| 149 | + } |
| 150 | + |
| 151 | + |
| 152 | +# Intel GPU Number of Subslices per Slice # |
| 153 | + |
| 154 | +A new device descriptor will be added which will provide the number of subslices per slice on an Intel GPU. If the device is a subdevice, then the number of subslices per slice in the subdevice is returned. |
| 155 | + |
| 156 | +This new device descriptor is only available for devices in the Level Zero platform, and the matching aspect is only true for those devices. The DPC++ default behavior is to expose GPU devices through the Level Zero platform. |
| 157 | + |
| 158 | + |
| 159 | +## Device Information Descriptors ## |
| 160 | + |
| 161 | +| Device Descriptors | Return Type | Description | |
| 162 | +| ------------------ | ----------- | ----------- | |
| 163 | +| info\:\:device\:\:ext\_intel\_gpu\_subslices\_per\_slice | uint32\_t| Returns the number of subslices per slice.| |
| 164 | + |
| 165 | + |
| 166 | +## Aspects ## |
| 167 | + |
| 168 | +A new aspect, ext\_intel\_gpu\_subslices\_per\_slice, will be added. |
| 169 | + |
| 170 | + |
| 171 | +## Error Condition ## |
| 172 | + |
| 173 | +An invalid object runtime error will be thrown if the device does not support aspect\:\:ext\_intel\_gpu\_subslices\_per\_slice. |
| 174 | + |
| 175 | +## Example Usage ## |
| 176 | + |
| 177 | +The number of subslices per slice can be obtained using the standard get\_info() interface. |
| 178 | + |
| 179 | + if (dev.has(aspect::ext_intel_gpu_subslices_per_slice)) { |
| 180 | + auto subslices = dev.get_info<info::device::ext_intel_gpu_subslices_per_slice>(); |
| 181 | + } |
| 182 | + |
| 183 | + |
| 184 | +# Intel GPU Number of Execution Units (EUs) per Subslice # |
| 185 | + |
| 186 | +A new device descriptor will be added which will provide the number of EUs per subslice on an Intel GPU. If the device is a subdevice, then the number of EUs per subslice in the subdevice is returned. |
| 187 | + |
| 188 | +This new device descriptor is only available for devices in the Level Zero platform, and the matching aspect is only true for those devices. The DPC++ default behavior is to expose GPU devices through the Level Zero platform. |
| 189 | + |
| 190 | + |
| 191 | +## Device Information Descriptors ## |
| 192 | + |
| 193 | +| Device Descriptors | Return Type | Description | |
| 194 | +| ------------------ | ----------- | ----------- | |
| 195 | +| info\:\:device\:\:ext\_intel\_gpu\_eu\_count\_per\_subslice | uint32\_t| Returns the number of EUs in a subslice.| |
| 196 | + |
| 197 | + |
| 198 | +## Aspects ## |
| 199 | + |
| 200 | +A new aspect, ext\_intel\_gpu\_eu\_count\_per\_subslice, will be added. |
| 201 | + |
| 202 | + |
| 203 | +## Error Condition ## |
| 204 | + |
| 205 | +An invalid object runtime error will be thrown if the device does not support aspect\:\:ext\_intel\_gpu\_eu\_count\_per\_subslice. |
| 206 | + |
| 207 | +## Example Usage ## |
| 208 | + |
| 209 | +The number of EUs per subslice can be obtained using the standard get\_info() interface. |
| 210 | + |
| 211 | + if (dev.has(aspect::ext_intel_gpu_eu_count_per_subslice)) { |
| 212 | + auto euCount = dev.get_info<info::device::ext_intel_gpu_eu_count_per_subslice>(); |
| 213 | + } |
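The slice, subslice, and EU-per-subslice counts can be combined into a topology-derived EU total. The sketch below shows the arithmetic in plain C++. The function name is hypothetical, and the proposal does not state how this product relates to the value returned by `ext_intel_gpu_eu_count`, so treating the two as equal is an assumption that may not hold on every device.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: EU total implied by the reported topology,
// computed as slices * subslices per slice * EUs per subslice.
// Assumption: every slice and subslice is populated uniformly.
std::uint64_t eu_count_from_topology(std::uint32_t slices,
                                     std::uint32_t subslices_per_slice,
                                     std::uint32_t eus_per_subslice) {
  return static_cast<std::uint64_t>(slices) * subslices_per_slice *
         eus_per_subslice;
}
```

For example, a device reporting 1 slice, 6 subslices per slice, and 16 EUs per subslice would yield 96 under this assumption.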
| 214 | + |
| 215 | + |
| 216 | +# Maximum Memory Bandwidth # |
| 217 | + |
| 218 | +A new device descriptor will be added which will provide the maximum memory bandwidth. If the device is a subdevice, then the maximum bandwidth of the subdevice is returned. |
| 219 | + |
| 220 | +This new device descriptor is only available for devices in the Level Zero platform, and the matching aspect is only true for those devices. The DPC++ default behavior is to expose GPU devices through the Level Zero platform. |
| 221 | + |
| 222 | + |
| 223 | +## Device Information Descriptors ## |
| 224 | + |
| 225 | +| Device Descriptors | Return Type | Description | |
| 226 | +| ------------------ | ----------- | ----------- | |
| 227 | +| info\:\:device\:\:ext\_intel\_max\_mem\_bandwidth | uint64\_t| Returns the maximum memory bandwidth in units of bytes\/second.| |
| 228 | + |
| 229 | + |
| 230 | +## Aspects ## |
| 231 | + |
| 232 | +A new aspect, ext\_intel\_max\_mem\_bandwidth, will be added. |
| 233 | + |
| 234 | + |
| 235 | +## Error Condition ## |
| 236 | + |
| 237 | +An invalid object runtime error will be thrown if the device does not support aspect\:\:ext\_intel\_max\_mem\_bandwidth. |
| 238 | + |
| 239 | + |
| 240 | +## Example Usage ## |
| 241 | + |
| 242 | +The maximum memory bandwidth can be obtained using the standard get\_info() interface. |
| 243 | + |
| 244 | + if (dev.has(aspect::ext_intel_max_mem_bandwidth)) { |
| 245 | + auto maxBW = dev.get_info<info::device::ext_intel_max_mem_bandwidth>(); |
| 246 | + } |
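Since the value is reported in bytes per second, a typical use is to convert it to a more readable unit or to bound the time a transfer could take. The following plain C++ sketch shows both. The function names are hypothetical, and the sketch uses decimal gigabytes (1 GB = 1e9 bytes).

```cpp
#include <cassert>
#include <cstdint>

// Converts bytes/second to decimal gigabytes/second (1 GB = 1e9 bytes).
double to_gb_per_second(std::uint64_t bytes_per_second) {
  return static_cast<double>(bytes_per_second) / 1e9;
}

// Lower bound, in seconds, on the time needed to move `bytes` at the
// reported maximum bandwidth. Real transfers will usually take longer.
double min_transfer_seconds(std::uint64_t bytes,
                            std::uint64_t max_bytes_per_second) {
  assert(max_bytes_per_second > 0 && "bandwidth must be non-zero");
  return static_cast<double>(bytes) /
         static_cast<double>(max_bytes_per_second);
}
```

The value from `dev.get_info<info::device::ext_intel_max_mem_bandwidth>()` could be passed as `max_bytes_per_second` to compare a kernel's measured transfer time against this bound.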