Skip to content

iter::Fuse is unsound with how specialization currently behaves around HRTB fn pointers #85863

Closed

Description

New high-level explanation of this issue, also covering #85873

Fuse<T> and Zip<T, U> have optimizations relying on specialization if their type parameters implement a trait (FusedIterator or TrustedRandomAccess, respectively).

These optimizations fundamentally change the way the iterator operates.

All type arguments are covariant. Coercing e.g. Fuse<T> to Fuse<U> if U is a subtype of T can “switch between” these fundamentally different ways of operation if T: !FusedIterator and U: FusedIterator which can bring the iterator into an invalid state that can cause UB; the same kind of problem exists for Zip.

For Zip, this problem can be avoided if TrustedRandomAccess never differs between types and subtypes. Copy-dependent impls for e.g. vec::IntoIter have to be removed (PR #85874).

Regarding Fuse: FusedIterator is a safe public trait, this is problematic. Some possible fixes: Either change Fuse to not get into a UB-causing invalid state by such a coercion (which kills some (or perhaps most) of the optimization offered by Fuse), or make Fuse invariant (but that’s a breaking change!). Here’s another possible approach, but this one requires new language features and could be done in the future if an immediate fix that just makes Fuse less performant is taken now.

Specialization not differentiating between e.g. Foo<'a> and Foo<'b> made this issue harder to spot.
But fn(&()) (that is, for<'a> fn(&'a ())) is a supertype [edited:] subtype of fn(&'static ()) and those are entirely different types.


Previous start of issue description


use std::{
    iter::{Fuse, FusedIterator},
    marker::PhantomData,
};

struct MyEmptyIterator<T>(PhantomData<T>, &'static str);
impl<T> Iterator for MyEmptyIterator<T> {
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        println!("called next, {}", self.1);
        None
    }
}

impl FusedIterator for MyEmptyIterator<fn(&'static ())> {}

fn main() {
    let mut f = MyEmptyIterator::<fn(&())>(PhantomData, "Hello, World").fuse();
    f.next();
    let mut f = f as Fuse<MyEmptyIterator<fn(&'static ())>>;
    f.next();
}
   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 1.74s
     Running `target/debug/playground`
timeout: the monitored command dumped core
/playground/tools/entrypoint.sh: line 11:     8 Segmentation fault      timeout --signal=KILL ${timeout} "$@"
Standard Output
called next, Hello, World

(playground)

@rustbot label T-libs, T-libs-impl, T-compiler, A-specialization, A-iterators, A-typesystem, A-lifetimes, A-traits, C-bug
someone please add the unsound label thanks

Explanation

The types for<'a> fn(&'a ()) (i.e. fn(&())) and fn(&'static ()) do seem to be distinguished by rust’s specialization. (I.e. the difference between the two types is not erased before the trait impl for specializing on FusedIterator is chosen.) But it’s also possible to convert Fuse<MyEmptyIterator<fn(&i32)>> into Fuse<MyEmptyIterator<fn(&'static i32)>>. This way, the first call to .next() will use unspecialized code and write a None value into the field of Fuse, whereas the second call will use the specialized version that uses intrinsics::unreachable() to unwrap the contained Option.

I haven’t tested it, but I’d guess that the same thing might be possible using HRTB trait objects instead of function pointers. (See comment below.)

Possible fixes

This soundness issue would probably go away if, somehow, specialization would also consider types such as fn(&()) and fn(&'static ()) to be “the same type”. Edit2: I don’t think that this approach is reasonable, and it’s probably even impossible.

The easiest / fastest fix for now would probably be to remove all the intrinsics::unreachable() assertions in the implementation of Fuse and replace them either with panics (which results in unexpected panicking but at least no unsoundness), or replace them with code that “properly” handles the None case as the iterator being empty. In either case, performance benefits from FusedIterator implementations could probably somewhat be retained by somehow marking the Some case as the hot path and the None case as a cold path. (In the approach with the None case panicking, this would probably be automatically considered the cold path due to the panic.)

Edit: Yet another alternative fix is to make Fuse<T> invariant over T.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Labels

    A-iteratorsArea: IteratorsA-lifetimesArea: Lifetimes / regionsA-specializationArea: Trait impl specializationA-traitsArea: Trait systemA-typesystemArea: The type systemC-bugCategory: This is a bug.I-unsoundIssue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/SoundnessP-criticalCritical priorityT-compilerRelevant to the compiler team, which will review and decide on the PR/issue.T-langRelevant to the language team, which will review and decide on the PR/issue.T-libsRelevant to the library team, which will review and decide on the PR/issue.T-libs-apiRelevant to the library API team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions