We should create a `simd_read_array` intrinsic, because when `sizeof(Simd<T, N>) > sizeof([T; N])` (which can happen until #319 is fixed), `read_unaligned` is probably UB: it can read bytes beyond the end of the input array, namely the bytes that correspond to the padding in the `Simd<T, N>`.
We need an intrinsic rather than just using `memcpy` because the intrinsic will generate LLVM's `load` instruction with a vector type (LLVM guarantees that a vector load won't read padding if the load's `align` is small enough), whereas `memcpy` may end up using less efficient array-typed loads, which sometimes use scalar code.
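To make the size mismatch concrete, here is a minimal sketch on stable Rust. `PaddedVec3` and `load_from_array` are hypothetical names: `PaddedVec3` is a stand-in that only assumes a padded layout similar to what a 3-lane `Simd<f32, 3>` can have, and is not the real `core::simd` type. The sketch shows the `memcpy`-style copy that stays in bounds; it does not show the proposed intrinsic.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr;

/// Hypothetical stand-in for a padded 3-lane vector: 12 bytes of lanes,
/// rounded up to 16 bytes by the alignment. Not the real `Simd<f32, 3>`.
#[repr(C, align(16))]
#[derive(Clone, Copy, Debug)]
struct PaddedVec3([f32; 3]);

/// Copies exactly three `f32`s (12 bytes), so it never reads past the end
/// of `src`. A `read_unaligned::<PaddedVec3>` on the same pointer would
/// read all 16 bytes, 4 of them beyond the array.
fn load_from_array(src: &[f32; 3]) -> PaddedVec3 {
    let mut out = MaybeUninit::<PaddedVec3>::uninit();
    unsafe {
        // The array field sits at offset 0 because of `repr(C)`.
        ptr::copy_nonoverlapping(src.as_ptr(), out.as_mut_ptr().cast::<f32>(), 3);
        // Only the trailing padding is left uninitialized, which is allowed.
        out.assume_init()
    }
}

fn main() {
    assert_eq!(size_of::<[f32; 3]>(), 12);
    assert_eq!(size_of::<PaddedVec3>(), 16);
    let v = load_from_array(&[1.0, 2.0, 3.0]);
    println!("{:?}", v);
}
```

This copy is sound, but as noted above it may be lowered to array-typed or scalar loads rather than a single vector load, which is the motivation for a dedicated intrinsic.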