2 files changed, +46 -0 lines changed
@@ -6,6 +6,7 @@ Numba for DPPY GPUs
6
6
.. toctree::
7
7
:maxdepth: 2
8
8
9
+ writing_kernels.rst
9
10
memory-management.rst
10
11
device-functions.rst
11
12
atomic-operations.rst
1
+ Writing DPPY kernels
2
+ ====================
3
+
4
+ Introduction
5
+ -------------
6
+
7
+ Numba-dppy has an execution model unlike the traditional sequential model used for programming CPUs.
8
+ In Numba-dppy, the code you write will be executed by multiple threads at once (often hundreds or thousands).
9
+ You model your solution by defining a thread hierarchy of work-groups and work-items.
10
+
11
+ Numba-dppy exposes facilities to declare and manage this hierarchy of threads.
12
+ The facilities are largely similar to those exposed by the `OpenCL language <https://www.khronos.org/opencl/>`_.
13
+
14
+ Kernel declaration
15
+ ------------------
16
+ A kernel function is a GPU function that is meant to be called from CPU code. This gives it
17
+ two fundamental characteristics:
18
+
19
+ - kernels cannot explicitly return a value; all result data must be written to an array passed to the function
20
+ (if computing a scalar, you will probably pass a one-element array)
21
+ - kernels explicitly declare their thread hierarchy when called: i.e. the number of work-groups and the number
22
+ of work-items per work-group (note that while a kernel is compiled once, it can be called multiple times with different
23
+ work-group sizes or global sizes).
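
The two characteristics above can be illustrated without a device. The sketch below is a plain-Python emulation, not numba-dppy code: ``launch``, ``vector_add`` and ``total`` are hypothetical names, and the work-items run one after another instead of in parallel. It only shows that a kernel body writes its results into arrays passed in, and that the launch size is chosen at call time.

```python
# Plain-Python emulation; none of these names are numba-dppy API.

def vector_add(gid, a, b, out):
    """Body executed by one work-item: writes its result, returns nothing."""
    out[gid] = a[gid] + b[gid]


def total(gid, a, out):
    """A scalar result is delivered through a one-element array."""
    # Safe here only because the emulation is sequential; on a real
    # device, concurrent updates like this would need an atomic operation.
    out[0] += a[gid]


def launch(kernel, global_size, *args):
    """Run the kernel body once per work-item (sequentially, as an assumption)."""
    for gid in range(global_size):
        kernel(gid, *args)


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

c = [0] * 4
launch(vector_add, 4, a, b, c)   # launch size chosen at call time

s = [0]
launch(total, 4, a, s)           # scalar result lands in s[0]

print(c, s)
```

The same ``launch`` helper can be called again with a different ``global_size``, which mirrors the point that a kernel is written once and launched with different sizes.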
24
+
25
+ Example
26
+ ~~~~~~~~~
27
+
28
+ .. literalinclude:: ../../numba_dppy/examples/sum.py
29
+
30
+ Kernel invocation
31
+ ------------------
32
+
33
+ A kernel is typically launched in the following way:
34
+
35
+ .. literalinclude:: ../../numba_dppy/examples/sum.py
36
+    :pyobject: driver
37
+
38
+ Positioning
39
+ ------------
40
+
41
+ - ``dppy.get_local_id``
42
+ - ``dppy.get_local_size``
43
+ - ``dppy.get_group_id``
44
+ - ``dppy.get_num_groups``
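
A minimal sketch of how these four values relate to each other, assuming they behave like their OpenCL namesakes in one dimension with no global offset. It is a plain-Python emulation that does not call numba-dppy; ``enumerate_work_items`` is a hypothetical helper, and requiring the global size to be a multiple of the local size is a simplification made for this sketch.

```python
# Plain-Python emulation of a 1-D work-group / work-item hierarchy.
# The dictionary keys borrow the function names listed above;
# nothing here calls numba-dppy itself.

def enumerate_work_items(global_size, local_size):
    """Yield the position of every work-item in a 1-D launch."""
    if global_size % local_size != 0:
        # Simplifying assumption of this sketch.
        raise ValueError("global_size must be a multiple of local_size")
    num_groups = global_size // local_size
    for group_id in range(num_groups):
        for local_id in range(local_size):
            yield {
                "get_local_id": local_id,        # index inside the work-group
                "get_local_size": local_size,    # work-items per work-group
                "get_group_id": group_id,        # index of the work-group
                "get_num_groups": num_groups,    # number of work-groups
                # Position in the whole launch, derived from the values above.
                "global_id": group_id * local_size + local_id,
            }


positions = list(enumerate_work_items(global_size=8, local_size=4))
for p in positions:
    print(p)
```

With a global size of 8 and a local size of 4 there are two work-groups, and each work-item's global position is its group index times the local size plus its local index.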
45
+