-
Notifications
You must be signed in to change notification settings - Fork 88
Fix {to,from}_array UB when repr(simd) produces padding #342
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -176,34 +176,70 @@ where | |
unsafe { &mut *(self as *mut Self as *mut [T; N]) } | ||
} | ||
|
||
/// Load a vector from an array of `T`. | ||
/// | ||
/// This function is necessary since `repr(simd)` has padding for non-power-of-2 vectors (at the time of writing). | ||
/// With padding, `read_unaligned` will read past the end of an array of N elements. | ||
/// | ||
/// # Safety | ||
/// Reading `ptr` must be safe, as if by `<*const [T; N]>::read_unaligned`. | ||
const unsafe fn load(ptr: *const [T; N]) -> Self { | ||
// There are potentially simpler ways to write this function, but this should result in | ||
// LLVM `load <N x T>` | ||
|
||
let mut tmp = core::mem::MaybeUninit::<Self>::uninit(); | ||
// SAFETY: `Simd<T, N>` always contains `N` elements of type `T`. It may have padding | ||
// which does not need to be initialized. The safety of reading `ptr` is ensured by the | ||
// caller. | ||
unsafe { | ||
core::ptr::copy_nonoverlapping(ptr, tmp.as_mut_ptr().cast(), 1); | ||
tmp.assume_init() | ||
} | ||
} | ||
|
||
/// Store a vector to an array of `T`. | ||
/// | ||
/// See `load` as to why this function is necessary. | ||
/// | ||
/// # Safety | ||
/// Writing to `ptr` must be safe, as if by `<*mut [T; N]>::write_unaligned`. | ||
const unsafe fn store(self, ptr: *mut [T; N]) { | ||
// There are potentially simpler ways to write this function, but this should result in | ||
// LLVM `store <N x T>` | ||
|
||
// Creating a temporary helps LLVM turn the memcpy into a store. | ||
let tmp = self; | ||
workingjubilee marked this conversation as resolved.
Show resolved
Hide resolved
|
||
// SAFETY: `Simd<T, N>` always contains `N` elements of type `T`. The safety of writing | ||
// `ptr` is ensured by the caller. | ||
unsafe { core::ptr::copy_nonoverlapping(tmp.as_array(), ptr, 1) } | ||
} | ||
|
||
/// Converts an array to a SIMD vector. | ||
pub const fn from_array(array: [T; N]) -> Self { | ||
// SAFETY: Transmuting between `Simd<T, N>` and `[T; N]` | ||
// is always valid. We need to use `read_unaligned` here, since | ||
// the array may have a lower alignment than the vector. | ||
// SAFETY: `&array` is safe to read. | ||
// | ||
// FIXME: We currently use a pointer read instead of `transmute_copy` because | ||
// it results in better codegen with optimizations disabled, but we should | ||
// probably just use `transmute` once that works on const generic types. | ||
// FIXME: We currently use a pointer load instead of `transmute_copy` because `repr(simd)` | ||
// results in padding for non-power-of-2 vectors (so vectors are larger than arrays). | ||
// | ||
// NOTE: This deliberately doesn't just use `Self(array)`, see the comment | ||
// on the struct definition for details. | ||
unsafe { (&array as *const [T; N] as *const Self).read_unaligned() } | ||
unsafe { Self::load(&array) } | ||
} | ||
|
||
/// Converts a SIMD vector to an array. | ||
pub const fn to_array(self) -> [T; N] { | ||
// SAFETY: Transmuting between `Simd<T, N>` and `[T; N]` | ||
// is always valid. No need to use `read_unaligned` here, since | ||
// the vector never has a lower alignment than the array. | ||
let mut tmp = core::mem::MaybeUninit::uninit(); | ||
// SAFETY: writing to `tmp` is safe and initializes it. | ||
// | ||
// FIXME: We currently use a pointer read instead of `transmute_copy` because | ||
// it results in better codegen with optimizations disabled, but we should | ||
// probably just use `transmute` once that works on const generic types. | ||
// FIXME: We currently use a pointer store instead of `transmute_copy` because `repr(simd)` | ||
// results in padding for non-power-of-2 vectors (so vectors are larger than arrays). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If vectors are larger than arrays and more aligned (or the same size and same align), then I think the Because if not, we'll have to remove https://doc.rust-lang.org/std/simd/struct.Simd.html#method.as_mut_array. (Which, comes to think of it, means that we can always implement There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point, I changed it to |
||
// | ||
// NOTE: This deliberately doesn't just use `self.0`, see the comment | ||
// on the struct definition for details. | ||
unsafe { (&self as *const Self as *const [T; N]).read() } | ||
unsafe { | ||
self.store(tmp.as_mut_ptr()); | ||
tmp.assume_init() | ||
} | ||
} | ||
|
||
/// Converts a slice to a SIMD vector containing `slice[..N]`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you'll want to explicitly copy
self
to a temporary first, and then memcpy from the temporary, since afaict llvm optimizes that to a vector load/store due to the vector insns generated by the temporary, whereas just calling memcpy generates only a memcpy so llvm doesn't see any vector operations and generates possibly less-efficient code.compare
store
vs.store2
--store2
has the temporary.https://godbolt.org/z/jWsc6ezP4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it intentional, though, that it's not a vector read? Because I read the OP as saying that that would generate an over-long read.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, because LLVM IR defines vector
load
/store
instructions to never read/write the padding (whenalign
is small enough), as i explained on Zulip (see #341).read_unaligned
has no such guaranteeThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we intentionally want a llvm
load
/store
instruction withalign <elem-align>
not full-simd-type-alignThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not written as a vector store, because that would be wrong (writing past the end of the array), but it should lower as one. I added the temporary with a comment as to why it's necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, I was mixing up the LLVM type and the Rust type.