Commit 655531f
authored
[ET-VK] Store weights transposed for int8 linear (#9803)
## Context
The weight tensor of a linear layer is usually stored in a transposed manner, such that when computing the matrix multiplication, the reduction traverses along the rows of the weight tensor as opposed to the columns. This results in a better memory access pattern for CPUs.
However, for GPUs, I have found that "un-transposing" the weight tensors result in better performance. This is likely due to the fact since GPUs can compute multiple output elements in parallel, reading along the columns allows for coalescing memory loads among threads in a work group.
## Changes
* Introduce the ability to transpose height and weight dims when transferring tensor data to the GPU.
* Prepackthe weight tensor "un-transposed" for the int8 quantized linear operator
Differential Revision: [D72066588](https://our.internmc.facebook.com/intern/diff/D72066588/)1 parent bcf4b46 commit 655531f
File tree
8 files changed
+110
-26
lines changed- backends/vulkan
- runtime/graph/ops
- glsl
- impl
- test/op_tests
8 files changed
+110
-26
lines changedLines changed: 19 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
27 | 27 | | |
28 | 28 | | |
29 | 29 | | |
| 30 | + | |
| 31 | + | |
30 | 32 | | |
31 | 33 | | |
32 | 34 | | |
| |||
41 | 43 | | |
42 | 44 | | |
43 | 45 | | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
44 | 61 | | |
45 | | - | |
| 62 | + | |
46 | 63 | | |
47 | 64 | | |
48 | 65 | | |
| |||
70 | 87 | | |
71 | 88 | | |
72 | 89 | | |
73 | | - | |
| 90 | + | |
74 | 91 | | |
75 | 92 | | |
76 | 93 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
21 | 21 | | |
22 | 22 | | |
23 | 23 | | |
| 24 | + | |
24 | 25 | | |
25 | 26 | | |
26 | 27 | | |
| |||
29 | 30 | | |
30 | 31 | | |
31 | 32 | | |
32 | | - | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
33 | 40 | | |
34 | 41 | | |
35 | 42 | | |
Lines changed: 19 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
| 33 | + | |
| 34 | + | |
33 | 35 | | |
34 | 36 | | |
35 | 37 | | |
36 | 38 | | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
37 | 54 | | |
38 | | - | |
| 55 | + | |
39 | 56 | | |
40 | | - | |
| 57 | + | |
41 | 58 | | |
42 | 59 | | |
43 | 60 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
64 | 64 | | |
65 | 65 | | |
66 | 66 | | |
67 | | - | |
68 | 67 | | |
69 | | - | |
70 | | - | |
71 | | - | |
| 68 | + | |
72 | 69 | | |
73 | 70 | | |
74 | 71 | | |
75 | 72 | | |
76 | | - | |
| 73 | + | |
77 | 74 | | |
78 | 75 | | |
79 | 76 | | |
80 | 77 | | |
81 | | - | |
| 78 | + | |
82 | 79 | | |
83 | 80 | | |
84 | | - | |
| 81 | + | |
85 | 82 | | |
86 | 83 | | |
87 | 84 | | |
| |||
97 | 94 | | |
98 | 95 | | |
99 | 96 | | |
100 | | - | |
| 97 | + | |
101 | 98 | | |
102 | 99 | | |
103 | 100 | | |
104 | 101 | | |
105 | 102 | | |
| 103 | + | |
| 104 | + | |
106 | 105 | | |
107 | 106 | | |
108 | 107 | | |
109 | 108 | | |
110 | 109 | | |
111 | | - | |
112 | | - | |
113 | | - | |
114 | | - | |
115 | | - | |
116 | | - | |
117 | | - | |
118 | | - | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
119 | 118 | | |
120 | 119 | | |
121 | 120 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
48 | 48 | | |
49 | 49 | | |
50 | 50 | | |
51 | | - | |
| 51 | + | |
52 | 52 | | |
53 | 53 | | |
54 | 54 | | |
| |||
86 | 86 | | |
87 | 87 | | |
88 | 88 | | |
89 | | - | |
| 89 | + | |
90 | 90 | | |
91 | 91 | | |
92 | 92 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
113 | 113 | | |
114 | 114 | | |
115 | 115 | | |
116 | | - | |
| 116 | + | |
| 117 | + | |
117 | 118 | | |
118 | 119 | | |
119 | 120 | | |
| |||
127 | 128 | | |
128 | 129 | | |
129 | 130 | | |
| 131 | + | |
| 132 | + | |
130 | 133 | | |
131 | 134 | | |
132 | 135 | | |
| |||
138 | 141 | | |
139 | 142 | | |
140 | 143 | | |
141 | | - | |
| 144 | + | |
142 | 145 | | |
143 | 146 | | |
144 | 147 | | |
| |||
158 | 161 | | |
159 | 162 | | |
160 | 163 | | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
161 | 191 | | |
162 | 192 | | |
163 | 193 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
51 | 51 | | |
52 | 52 | | |
53 | 53 | | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
54 | 66 | | |
55 | 67 | | |
56 | 68 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
157 | 157 | | |
158 | 158 | | |
159 | 159 | | |
| 160 | + | |
| 161 | + | |
160 | 162 | | |
161 | 163 | | |
162 | 164 | | |
163 | 165 | | |
164 | 166 | | |
165 | | - | |
| 167 | + | |
166 | 168 | | |
167 | 169 | | |
168 | 170 | | |
| |||
0 commit comments