Skip to content

Fix {to,from}_array UB when repr(simd) produces padding #342

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 27, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions crates/core_simd/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@
#![feature(
const_ptr_read,
const_refs_to_cell,
const_maybe_uninit_as_mut_ptr,
const_mut_refs,
convert_float_to_int,
decl_macro,
intra_doc_pointers,
Expand Down
64 changes: 50 additions & 14 deletions crates/core_simd/src/vector.rs
Original file line number Diff line number Diff line change
Expand Up @@ -176,34 +176,70 @@ where
unsafe { &mut *(self as *mut Self as *mut [T; N]) }
}

/// Load a vector from an array of `T`.
///
/// This function is necessary since `repr(simd)` has padding for non-power-of-2 vectors (at the time of writing).
/// With padding, `read_unaligned` will read past the end of an array of N elements.
///
/// # Safety
/// Reading `ptr` must be safe, as if by `<*const [T; N]>::read_unaligned`.
const unsafe fn load(ptr: *const [T; N]) -> Self {
// There are potentially simpler ways to write this function, but this should result in
// LLVM `load <N x T>`

let mut tmp = core::mem::MaybeUninit::<Self>::uninit();
// SAFETY: `Simd<T, N>` always contains `N` elements of type `T`. It may have padding
// which does not need to be initialized. The safety of reading `ptr` is ensured by the
// caller.
unsafe {
core::ptr::copy_nonoverlapping(ptr, tmp.as_mut_ptr().cast(), 1);
tmp.assume_init()
}
}

/// Store a vector to an array of `T`.
///
/// See `load` as to why this function is necessary.
///
/// # Safety
/// Writing to `ptr` must be safe, as if by `<*mut [T; N]>::write_unaligned`.
const unsafe fn store(self, ptr: *mut [T; N]) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you'll want to explicitly copy self to a temporary first, and then memcpy from the temporary, since afaict llvm optimizes that to a vector load/store due to the vector insns generated by the temporary, whereas just calling memcpy generates only a memcpy so llvm doesn't see any vector operations and generates possibly less-efficient code.

compare store vs. store2 -- store2 has the temporary.
https://godbolt.org/z/jWsc6ezP4

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it intentional, though, that it's not a vector read? Because I read the OP as saying that that would generate an over-long read.

Copy link
Member

@programmerjake programmerjake Apr 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, because LLVM IR defines vector load/store instructions to never read/write the padding (when align is small enough), as i explained on Zulip (see #341). read_unaligned has no such guarantee

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we intentionally want a llvm load/store instruction with align <elem-align> not full-simd-type-align

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not written as a vector store, because that would be wrong (writing past the end of the array), but it should lower as one. I added the temporary with a comment as to why it's necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, I was mixing up the LLVM type and the Rust type.

// There are potentially simpler ways to write this function, but this should result in
// LLVM `store <N x T>`

// Creating a temporary helps LLVM turn the memcpy into a store.
let tmp = self;
// SAFETY: `Simd<T, N>` always contains `N` elements of type `T`. The safety of writing
// `ptr` is ensured by the caller.
unsafe { core::ptr::copy_nonoverlapping(tmp.as_array(), ptr, 1) }
}

/// Converts an array to a SIMD vector.
pub const fn from_array(array: [T; N]) -> Self {
// SAFETY: Transmuting between `Simd<T, N>` and `[T; N]`
// is always valid. We need to use `read_unaligned` here, since
// the array may have a lower alignment than the vector.
// SAFETY: `&array` is safe to read.
//
// FIXME: We currently use a pointer read instead of `transmute_copy` because
// it results in better codegen with optimizations disabled, but we should
// probably just use `transmute` once that works on const generic types.
// FIXME: We currently use a pointer load instead of `transmute_copy` because `repr(simd)`
// results in padding for non-power-of-2 vectors (so vectors are larger than arrays).
//
// NOTE: This deliberately doesn't just use `Self(array)`, see the comment
// on the struct definition for details.
unsafe { (&array as *const [T; N] as *const Self).read_unaligned() }
unsafe { Self::load(&array) }
}

/// Converts a SIMD vector to an array.
pub const fn to_array(self) -> [T; N] {
// SAFETY: Transmuting between `Simd<T, N>` and `[T; N]`
// is always valid. No need to use `read_unaligned` here, since
// the vector never has a lower alignment than the array.
let mut tmp = core::mem::MaybeUninit::uninit();
// SAFETY: writing to `tmp` is safe and initializes it.
//
// FIXME: We currently use a pointer read instead of `transmute_copy` because
// it results in better codegen with optimizations disabled, but we should
// probably just use `transmute` once that works on const generic types.
// FIXME: We currently use a pointer store instead of `transmute_copy` because `repr(simd)`
// results in padding for non-power-of-2 vectors (so vectors are larger than arrays).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If vectors are larger than arrays and more aligned (or the same size and same align), then I think the read is always fine for this?

Because if not, we'll have to remove https://doc.rust-lang.org/std/simd/struct.Simd.html#method.as_mut_array.

(Which, comes to think of it, means that we can always implement to_array as *self.as_array(), since we currently require T: Copy.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I changed it to *self.as_array()

//
// NOTE: This deliberately doesn't just use `self.0`, see the comment
// on the struct definition for details.
unsafe { (&self as *const Self as *const [T; N]).read() }
unsafe {
self.store(tmp.as_mut_ptr());
tmp.assume_init()
}
}

/// Converts a slice to a SIMD vector containing `slice[..N]`.
Expand Down