Commit 06378d4
authored
fix: Initialize ApertusMLP's xielu activation using
* Fix Apertus model crash on float16 hardware
Initialize XIELU activation with correct dtype from config (using config.dtype instead of default bfloat16) to prevent promotion to float32 and subsequent crashes on Turing/float16 GPUs.
* refactor: Move `ACT2CLS` import to top-level in Apertus models.torch_dtype (#42864)1 parent fc50bdc commit 06378d4
File tree
2 files changed
+7
-2
lines changed- src/transformers/models/apertus
2 files changed
+7
-2
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
25 | 25 | | |
26 | 26 | | |
27 | 27 | | |
28 | | - | |
| 28 | + | |
29 | 29 | | |
30 | 30 | | |
31 | 31 | | |
| |||
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
| 52 | + | |
| 53 | + | |
52 | 54 | | |
53 | 55 | | |
54 | 56 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
19 | 19 | | |
20 | 20 | | |
21 | 21 | | |
| 22 | + | |
22 | 23 | | |
23 | 24 | | |
24 | 25 | | |
| |||
192 | 193 | | |
193 | 194 | | |
194 | 195 | | |
195 | | - | |
| 196 | + | |
196 | 197 | | |
197 | 198 | | |
| 199 | + | |
| 200 | + | |
198 | 201 | | |
199 | 202 | | |
200 | 203 | | |
| |||
0 commit comments