Description
New high-level explanation of this issue, also covering #85873
Fuse<T>
and Zip<T, U>
have optimizations relying on specialization if their type parameters implement a trait (FusedIterator
or TrustedRandomAccess
, respectively).
These optimizations fundamentally change the way the iterator operates.
All type arguments are covariant. Coercing e.g. Fuse<T>
to Fuse<U>
if U
is a subtype of T
can “switch between” these fundamentally different ways of operation if T: !FusedIterator
and U: FusedIterator
which can bring the iterator into an invalid state that can cause UB; the same kind of problem exists for Zip
.
For Zip
, this problem can be avoided if TrustedRandomAccess
never differs between types and subtypes. Copy
-dependent impls for e.g. vec::IntoIter
have to be removed (PR #85874).
Regarding Fuse
: FusedIterator
is a safe public trait, this is problematic. Some possible fixes: Either change Fuse
to not get into a UB-causing invalid state by such a coercion (which kills some (or perhaps most) of the optimization offered by Fuse
), or make Fuse
invariant (but that’s a breaking change!). Here’s another possible approach, but this one requires new language features and could be done in the future if an immediate fix that just makes Fuse
less performant is taken now.
Specialization not differentiating between e.g. Foo<'a>
and Foo<'b>
made this issue harder to spot.
But fn(&())
(that is, for<'a> fn(&'a ())
) is a supertype [edited:] subtype of fn(&'static ())
and those are entirely different types.
Previous start of issue description
use std::{
iter::{Fuse, FusedIterator},
marker::PhantomData,
};
struct MyEmptyIterator<T>(PhantomData<T>, &'static str);
impl<T> Iterator for MyEmptyIterator<T> {
type Item = T;
fn next(&mut self) -> Option<Self::Item> {
println!("called next, {}", self.1);
None
}
}
impl FusedIterator for MyEmptyIterator<fn(&'static ())> {}
fn main() {
let mut f = MyEmptyIterator::<fn(&())>(PhantomData, "Hello, World").fuse();
f.next();
let mut f = f as Fuse<MyEmptyIterator<fn(&'static ())>>;
f.next();
}
Compiling playground v0.0.1 (/playground)
Finished dev [unoptimized + debuginfo] target(s) in 1.74s
Running `target/debug/playground`
timeout: the monitored command dumped core
/playground/tools/entrypoint.sh: line 11: 8 Segmentation fault timeout --signal=KILL ${timeout} "$@"
Standard Output
called next, Hello, World
@rustbot label T-libs, T-libs-impl, T-compiler, A-specialization, A-iterators, A-typesystem, A-lifetimes, A-traits, C-bug
someone please add the unsound label thanks
Explanation
The types for<'a> fn(&'a ())
(i.e. fn(&())
) and fn(&'static ())
do seem to be distinguished by rust’s specialization. (I.e. the difference between the two types is not erased before the trait impl for specializing on FusedIterator
is chosen.) But it’s also possible to convert Fuse<MyEmptyIterator<fn(&i32)>>
into Fuse<MyEmptyIterator<fn(&'static i32)>>
. This way, the first call to .next()
will use unspecialized code and write a None
value into the field of Fuse
, whereas the second call will use the specialized version that uses intrinsics::unreachable()
to unwrap the contained Option
.
I haven’t tested it, but I’d guess that the same thing might be possible using HRTB trait objects instead of function pointers. (See comment below.)
Possible fixes
This soundness issue would probably go away if, somehow, specialization would also consider types such as fn(&())
and fn(&'static ())
to be “the same type”. Edit2: I don’t think that this approach is reasonable, and it’s probably even impossible.
The easiest / fastest fix for now would probably be to remove all the intrinsics::unreachable()
assertions in the implementation of Fuse
and replace them either with panic
s (which results in unexpected panicking but at least no unsoundness), or replace them with code that “properly” handles the None
case as the iterator being empty. In either case, performance benefits from FusedIterator
implementations could probably somewhat be retained by somehow marking the Some
case as the hot path and the None
case as a cold path. (In the approach with the None
case panicking, this would probably be automatically considered the cold path due to the panic.)
Edit: Yet another alternative fix is to make Fuse<T>
invariant over T
.