Description
Consider the following code (playground):

    #![allow(dead_code)]
    use std::mem::size_of;
    use std::sync::Arc;

    enum Foo {
        A(&'static str),
        B(Arc<()>),
    }

    enum Bar {
        A(&'static str),
        B(Option<Arc<()>>),
    }

    fn main() {
        println!("sizes of Foo: {} w/ Option {}", size_of::<Foo>(), size_of::<Option<Foo>>());
        print!("repr of Foo::A(..): ");
        dump_repr(Foo::A("hello"));
        print!("repr of Foo::B(..): ");
        dump_repr(Foo::B(Arc::new(())));
        print!("repr of Option<Foo>::None: ");
        dump_repr(Option::<Foo>::None);
        print!("repr of Option<Foo>::Some(..): ");
        dump_repr(Option::<Foo>::Some(Foo::A("hello")));

        println!("sizes of Bar: {} w/ Option {}", size_of::<Bar>(), size_of::<Option<Bar>>());
        print!("repr of Bar::A(..): ");
        dump_repr(Bar::A("hello"));
        print!("repr of Bar::B(Some(..)): ");
        dump_repr(Bar::B(Some(Arc::new(()))));
        print!("repr of Bar::B(None): ");
        dump_repr(Bar::B(None));
    }

    // Prints the in-memory representation of `a` as a sequence of usize words.
    fn dump_repr<T>(a: T) {
        let sz = size_of::<T>();
        let usize_sz = size_of::<usize>();
        // Only types whose size is a whole number of words are supported.
        assert_eq!(sz % usize_sz, 0);
        let n_usize = sz / usize_sz;
        // Reinterpret the value's bytes as words (debugging aid only).
        let b = &a as *const T as *const usize;
        let b = unsafe { std::slice::from_raw_parts(b, n_usize) };
        for word in b {
            print!("{:016x} ", word);
        }
        println!();
    }

The Foo enum contains two niches:
- The slice in Foo::A(&str) has a non-nullable field.
- The Arc in Foo::B(Arc<_>) has a non-nullable field.
The compiler correctly places the distinction between the variants A and B in one of these fields, concretely in the non-nullable field of the slice. The pointer for the Arc is then placed in the second half of the memory, which is otherwise the slice's length.
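One way to observe that both payloads carry a niche is to check whether wrapping them in Option adds any size. This is only a minimal sketch: the helper option_is_free is a name made up for this illustration (it is not part of std), and the result is an observation about the compiler in use rather than a general layout guarantee.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical helper: true if wrapping `T` in `Option` costs no extra space,
/// which indicates that `T` has at least one unused bit pattern (a niche).
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // The data pointer of a &str is never null, so null can encode None.
    println!("&str has a niche:    {}", option_is_free::<&'static str>());
    // Arc<()> is a single non-null pointer.
    println!("Arc<()> has a niche: {}", option_is_free::<Arc<()>>());
    // Every bit pattern of usize is a valid value, so Option needs extra space.
    println!("usize has a niche:   {}", option_is_free::<usize>());
}
```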
Unfortunately, wrapping Foo in Option<_> does not make use of the remaining niche: it could set both niches to null (basically selecting the Foo::B variant and nulling the Arc's NonNull pointer) to indicate the None variant.
Instead, we get a separate discriminant field, as can be seen by running the above example.
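To make the proposed encoding concrete, here is a rough model of how the two words of an Option<Foo> could be decoded if both niches were used. This is not what rustc does today. The Decoded enum and the decode function are hypothetical names for this sketch, and the word order (slice pointer first, then length or Arc pointer) is assumed from the description above.

```rust
/// Which case an Option<Foo> value is in (payloads omitted for brevity).
#[derive(Debug, PartialEq)]
enum Decoded {
    None,
    SomeA,
    SomeB,
}

/// Hypothetical decoder for the proposed two-word encoding.
/// words[0]: slice pointer for Foo::A, otherwise null (0).
/// words[1]: slice length for Foo::A, Arc pointer for Foo::B, null (0) for None.
fn decode(words: [usize; 2]) -> Decoded {
    match words {
        // Both niches are null: no Foo at all.
        [0, 0] => Decoded::None,
        // Slice pointer is null but the Arc pointer is not: Foo::B.
        [0, _] => Decoded::SomeB,
        // Slice pointer is non-null: Foo::A (the length may be zero).
        _ => Decoded::SomeA,
    }
}

fn main() {
    println!("{:?}", decode([0, 0]));
    println!("{:?}", decode([0, 0x1000]));
    println!("{:?}", decode([0x1000, 5]));
}
```

In this model an empty string is still Foo::A, because its data pointer is non-null even though its length is zero.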
Note that the niche in Arc is not used at all by Foo. This can be seen in the Bar version of the enum, whose size does not increase by changing the Arc<_> into an Option<Arc<_>>.
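The Foo versus Bar comparison can be reduced to a small size check. This is a sketch under the assumption of a reasonably recent rustc, since enum layout is not specified by the language. The sizes function is a hypothetical helper added for this illustration.

```rust
use std::mem::size_of;
use std::sync::Arc;

#[allow(dead_code)]
enum Foo {
    A(&'static str),
    B(Arc<()>),
}

#[allow(dead_code)]
enum Bar {
    A(&'static str),
    B(Option<Arc<()>>),
}

/// Hypothetical helper: returns (size of Foo, size of Bar) in bytes.
fn sizes() -> (usize, usize) {
    (size_of::<Foo>(), size_of::<Bar>())
}

fn main() {
    let (foo, bar) = sizes();
    // If the two sizes match, Bar::B(None) fits into the Arc's unused null value.
    println!("Foo: {} bytes, Bar: {} bytes", foo, bar);
    // For comparison: the size after wrapping Foo in Option.
    println!("Option<Foo>: {} bytes", size_of::<Option<Foo>>());
}
```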
I honestly have no idea about how niche tracking works and how much work it is to implement this, but I thought I'd leave this here as a kind of inspiration or tracking issue.
I note that there are a couple issues around this, but I think that this is not a duplicate of:
- Missed enum layout optimization with a NonZeroU64 + more space in an enum #101567, because that discusses the use of multiple niches within the same enum, which I understand has invisible performance drawbacks (needs more than one load to understand which variant we're looking at). I think that does not apply because Option<Foo>, as it stands, also needs two loads to figure out which Foo variant (if any) we're looking at: one to see if the Option is None, and a second one to see if the slice's ptr is null (Foo::B) or not (Foo::A).
- Missed layout optimisation with multiple niches #119507 could be related, but it's about structs. Not sure if that qualifies as dup.
- Enums don't utilize niches across multiple data-carrying variants #121333 seems to be about attempting to exploit "opposite" niches in order to unify types.
Thank you for reading.