And has it always been?
Today I watched the talk mentioned in https://davidlattimore.github.io/posts/2025/09/02/rustforge-wild-performance-tricks.html (highly recommended!), where a trick is presented that might make rsor obsolete. I think with this trick all functionality of rsor can be achieved, but without any unsafe code!
The core tool is a very short function:
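pub fn reuse_vec<T, U>(mut v: Vec<T>) -> Vec<U> {
    // Checked at compile time: T and U must have the same size and alignment.
    const {
        assert!(size_of::<T>() == size_of::<U>());
        assert!(align_of::<T>() == align_of::<U>());
    }
    // Drop all elements so that no T is ever reinterpreted as a U.
    v.clear();
    // The iterator is empty, so the closure never runs; collect() can reuse the allocation.
    v.into_iter().map(|_| unreachable!()).collect()
}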
This can be used to implement the motivating example from https://docs.rs/rsor/0.1.5/rsor/index.html:
fn print_slice(slice: &[&str]) { for s in slice { print!("<{}>", s); } println!(); }
let mut vec = Vec::<&str>::with_capacity(2);
{
    let mut vec2: Vec<&str> = reuse_vec(vec);
    // In this case, it would even work with an implicit coercion:
    //let mut vec2 = vec;
    let one = String::from("one");
    let two = String::from("two");
    vec2.push(&one);
    vec2.push(&two);
    print_slice(&vec2);
    vec = reuse_vec(vec2);
}
let three = String::from("three");
vec.push(&three);
print_slice(&vec);
Here's another example, this time showing how a flat slice can be turned into a slice of slices and how different (fat) references can be stored in the same reused memory (which rsor cannot even do!):
let mut reusable_vec = Vec::<&[i32]>::with_capacity(10);
let mut reused_vec = reuse_vec(reusable_vec);
let mut a = [1.0f32, 0.5, 0.0, -0.5];
for c in a.chunks_mut(2) {
    reused_vec.push(c);
}
let sos: &mut [&mut [f32]] = &mut reused_vec;
sos[1][0] = 0.33;
let mut another_reused_vec = reuse_vec(reused_vec);
another_reused_vec.push("hello");
reusable_vec = reuse_vec(another_reused_vec);
reusable_vec.push(&[1]);
assert_eq!(a, [1.0, 0.5, 0.33, -0.5]);
I don't think this can be coerced into the current API of rsor, but maybe a better API can be found?
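To make the question concrete, here is a rough sketch of one shape such an API could take. ReusableVec and its with method are hypothetical names invented for this illustration; they are not part of rsor. The idea is to hide the take/convert/put-back dance behind a closure:

```rust
/// The helper from above, repeated so that this sketch is self-contained.
pub fn reuse_vec<T, U>(mut v: Vec<T>) -> Vec<U> {
    const {
        assert!(std::mem::size_of::<T>() == std::mem::size_of::<U>());
        assert!(std::mem::align_of::<T>() == std::mem::align_of::<U>());
    }
    v.clear();
    v.into_iter().map(|_| unreachable!()).collect()
}

/// Hypothetical wrapper (an assumption, not rsor's API): owns a spare,
/// always-empty Vec whose allocation is lent out with a different element type.
pub struct ReusableVec<T> {
    spare: Vec<T>,
}

impl<T> ReusableVec<T> {
    pub fn with_capacity(capacity: usize) -> Self {
        Self { spare: Vec::with_capacity(capacity) }
    }

    /// Lends the storage to `f` as an empty `Vec<U>` and takes it back afterwards.
    /// If `f` panics, the storage is simply dropped and `spare` stays empty.
    pub fn with<U, R, F>(&mut self, f: F) -> R
    where
        F: FnOnce(&mut Vec<U>) -> R,
    {
        let mut lent: Vec<U> = reuse_vec(std::mem::take(&mut self.spare));
        let result = f(&mut lent);
        self.spare = reuse_vec(lent);
        result
    }
}

/// Usage sketch: the same storage holds `&str`s first and `&mut [f32]`s later.
pub fn demo() -> (String, f32) {
    let mut storage = ReusableVec::<&'static str>::with_capacity(4);

    let joined = {
        let one = String::from("one");
        let two = String::from("two");
        let s = storage.with::<&str, _, _>(|v| {
            v.push(one.as_str());
            v.push(two.as_str());
            v.join(",")
        });
        s
    };

    let mut samples = [1.0f32, 0.5, 0.0, -0.5];
    storage.with::<&mut [f32], _, _>(|v| {
        v.extend(samples.chunks_mut(2));
        v[1][0] = 0.33;
    });

    (joined, samples[2])
}
```

The closure keeps the short-lived borrows from escaping, so the wrapper itself can keep a single fixed type. Whether the allocation really survives each round trip still depends on how collect() behaves.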
One potential downside of this approach is that it relies on an optimization of .collect() (the standard library reusing the source Vec's allocation when collecting in place) to avoid additional allocations. I guess this should always work in practice, but it is not guaranteed.
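Since the reuse is not guaranteed, it may be worth checking it on the toolchain you actually use. A minimal sketch of such a check, comparing the buffer pointer and capacity before and after the conversion (allocation_was_reused is a name made up for this example):

```rust
/// The helper from above, repeated so that this sketch is self-contained.
pub fn reuse_vec<T, U>(mut v: Vec<T>) -> Vec<U> {
    const {
        assert!(std::mem::size_of::<T>() == std::mem::size_of::<U>());
        assert!(std::mem::align_of::<T>() == std::mem::align_of::<U>());
    }
    v.clear();
    v.into_iter().map(|_| unreachable!()).collect()
}

/// Hypothetical check: reports whether one trip through `reuse_vec`
/// kept the same heap buffer (same address and same capacity).
pub fn allocation_was_reused(capacity: usize) -> bool {
    let original: Vec<&str> = Vec::with_capacity(capacity);
    let ptr_before = original.as_ptr() as usize;
    let cap_before = original.capacity();

    let converted: Vec<&[u8]> = reuse_vec(original);

    converted.as_ptr() as usize == ptr_before && converted.capacity() == cap_before
}
```

A false result would not indicate a correctness problem, only that the conversion allocated a fresh buffer (or returned an unallocated Vec) instead of reusing the old one.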