Description
There is now a serious proposal and plan to stabilize #![feature(min_const_generics)]
in Rust 1.50: rust-lang/rust#79135. Using it here and removing the Array
trait would a welcome simplification but obviously a breaking API change, so it would need to be part of a version 2.0 of the crate.
I’m wondering whether it would make sense to try a complete rewrite of smallvec
instead of adapting the existing code. Some ideas:
-
The current
capacity
field has a misleading name and non-obvious meaning, it can represent either the capacity or the length of the vector. I feel this makes the code harder to understand and keep correct than necessary. Instead we could store a length in the0..isize::MAX
range, and pack a tag in the remaining bit. -
The dual representation with conditional compilation can be skipped/removed entirely, as
union
s withManuallyDrop<T>
fields withT: !Copy
are now stable: stabilize union with 'ManuallyDrop' fields and 'impl Drop for Union' rust-lang/rust#77547 -
The "core" of the crate can be rather small (and in a separate module and file), with everything else mostly copied from
vec.rs
andraw_vec.rs
from the standard library -
Copying from the standard library again gets us more recent improvements, such as making some code less generic to reduce monomorphization bloat.
-
This would be a good time to look at
unsafe
code -
2.0 can be an opportunity to remove some APIs. For example,
IntoIterator for SmallVec
implies an expensive move. We could omit it and document to usedrain
instead.
I’d consider not necessarily including everything in #183, as specialization and custom allocators are likely not gonna be stable before or soon after min_const_generics
is. Adding those might not require breaking API changes, though?
I’ve started playing with this a little. Here is the core data structure:
pub struct SmallVec<Item, const INLINE_CAPACITY: usize> {
// Safety invariants:
// * The active union field of `storage` must be consistent with the tag of `tagged_len`.
// * The length in `tagged_len` must not be greater than storage capacity
// * The first `length` items of the storage must be initialized.
tagged_len: TaggedLen,
storage: Storage<MaybeUninit<Item>, INLINE_CAPACITY>,
}
/// Upper bit: 0 for inline storage, 1 for heap storage
/// The rest of the bits: SmallVec length = number of items initialized, at the start of storage.
pub(crate) struct TaggedLen(usize);
/// An array, either inline (of statically-fixed size) or heap-allocated (of dynamic size).
/// The tag of [`SmallVec::tagged_len`] indicates which.
/// The size of this array is the `SmallVec`’s capacity.
union Storage<T, const INLINE_SIZE: usize> {
inline: ManuallyDrop<[T; INLINE_SIZE]>,
heap: ManuallyDrop<Box<[T]>>,
}
The union
and tagged integer wrangling as well as heap (re/de)allocations could be in a private module, with the rest of the functionality (push
, drain
, try_reserve
, …) built on top of an abstraction such as:
impl<Item, const INLINE_CAPACITY: usize> SmallVec<Item, INLINE_CAPACITY> {
pub fn new() -> Self {…}
pub fn with_capacity(capacity: usize) -> Self {…}
pub fn len(&self) -> usize {…}
pub unsafe fn set_len(&mut self, new_len: usize) {…}
pub(crate) fn storage(&self) -> &[MaybeUninit<Item>] {…}
pub(crate) fn storage_mut(&mut self) -> &mut [MaybeUninit<Item>] {…}
pub(crate) fn grow_to(&mut self, new_capacity: usize) -> Result<(), TryReserveError> {…}
pub(crate) fn shrink_to(&mut self, new_capacity: usize) -> Result<(), TryReserveError> {…}
}