Skip to content

Generalization in type inference needs to generalize unbound type variables too #18653

Closed
@nikomatsakis

Description

@nikomatsakis

I came across this subtle bug while working on the operator-dispatch branch. It has taken me quite some time to reduce it to a standalone test case. The fundamental problem is that, as currently implemented, when the type inferencer process a constraint like &'_0 _1 <: _2, where '_0, _1, and _2 are variables, it will create a new region variable but leave the type variable alone, so that _2 winds up being instantiated with &'_3 _1, where _0 : _3. That is good as far as it goes, but if _1 winds up being instantiated with something that contains regions, we lose degrees of freedom. Imagine for example that _1 winds up being instantiated with &'_4 int, then (substituting back to our original constraint), we have &'_0 &'_4 int <: &'_3 &'_4 int where '_0 : '_3. Again, this is true, but not as flexible we as we might like, since '_4 appears on both sides, depriving the region inferencer the change to adjust them independently.

To actually make this bug have a problem you seem to need a lot of moving ingredients. Here is my reduced (but probably not maximally...) example of code that ought to compile but doesn't:

use std::cell::RefCell;

enum Wrap<A> {
    WrapSome(A),
    WrapNone
}

struct T;
struct U;

// The Get trait. Key part here is that Get::get(X), where X has type
// Wrap<_>, cannot be resolved immediately.

trait Get<Sized? T> {
    fn get(&self) -> &T;
}

impl Get<MyShow+'static> for Wrap<T> {
    fn get(&self) -> &MyShow+'static {
        static x: uint = 42;
        &x
    }
}

impl Get<uint> for Wrap<U> {
    fn get(&self) -> &uint {
        static x: uint = 55;
        &x
    }
}

// The MyShow trait. In the original example, this was Show, but I
// pulled it out to isolate the test from changes to libstd.

trait MyShow { fn dummy(&self) { } }
impl<'a> MyShow for &'a MyShow+'a { }
impl MyShow for uint { }

fn constrain<'a>(rc: RefCell<&'a MyShow+'a>) { }

fn main() {
    // Here we do not know the full type of collection,
    // so the type of `collection` is `Wrap<_#0>`.
    let mut collection: Wrap<_> = WrapNone;

    {
        // ...and so we cannot resolve `Get::get` to a concrete
        // instance. Hence the type of `__arg0` is just `&_#1`, and
        // we know that `Wrap<_#0> : Get<_#1>`. Later `_#1` will be
        // resolved to `&MyShow+'static`, but the inference doesn't
        // know that yet.
        let __arg0 = Get::get(&collection);

        // Here we create a `RefCell`. This instantiates the type
        // parameter of `RefCell::new` with a fresh variable (`_#2`)
        // and then requires that `&'_3 _#1 <: _#2`. Because of the
        // bug, this means that `_#2` is instantiated with `&'_4 _#1`
        // (where `'_3 : '_4`). The important part is that only thing
        // that changes is that a fresh region is created, `_#1` is
        // carried though. (Without this bug, we would introduce a
        // fresh type variable, and everything that follows woudl play
        // out differently.)
        let __args_cell = RefCell::new(__arg0);

        // Finally we call `constrain`, whose argument is of type
        // `RefCell<&'_5 MyShow+'_5>` (once bound regions are
        // instantiated). Because `RefCell` is invariant, this means:
        //
        //     RefCell<_#2> <: RefCell<&'_5 MyShow+'_5>
        //     RefCell<&'_4 _#1> <: RefCell<&'_5 MyShow+'_5> // ...expand _#2
        //     &'_4 _#1 == &'_5 MyShow+'_5 // ...RefCell is invariant
        //
        // and hence:
        //
        //     '_4 == '_5
        //     _#1 == MyShow+'_5
        //
        // Now, recall that the type of `__arg0` was `&'_3 _#1`. If we consider
        // the various constraints we've accumulatd, then `__arg0` is now
        // constrained to be `&'_3 MyShow+'_5` where `'_3 : '_4`.
        constrain(__args_cell);
    }

    // Now we inform the inferencer that `_#0` is `Wrap<T>`. When
    // trait inference runs, this will infer `_#1` to
    // `MyShow+'static`. This in turn implies that the type of
    // `__arg0` winds up as `&'static MyShow+'static`, and hence that
    // `collections` must be borrowed for the static lifetime,
    // yielding an error.
    collection = WrapSome(T);
}

I've spelled out what happens in comments. The fix for this is this simple patch:

diff --git a/src/librustc/middle/typeck/infer/combine.rs b/src/librustc/middle/typeck/infer/combine.rs
index e51eb33..5857c4d 100644
--- a/src/librustc/middle/typeck/infer/combine.rs
+++ b/src/librustc/middle/typeck/infer/combine.rs
@@ -752,9 +752,14 @@ impl<'cx, 'tcx> ty_fold::TypeFolder<'tcx> for Generalizer<'cx, 'tcx> {
                     self.cycle_detected = true;
                     ty::mk_err()
                 } else {
-                    match self.infcx.type_variables.borrow().probe(vid) {
-                        Some(u) => self.fold_ty(u),
-                        None => t,
+                    let result = self.infcx.type_variables.borrow().probe(vid);
+                    match result {
+                        Some(u) => {
+                            return self.fold_ty(u);
+                        }
+                        None => {
+                            self.infcx.next_ty_var()
+                        }
                     }
                 }
             }

which causes unbound variables to be replaced with fresh variables. I am unsure of the performance impact of this patch, though, and I want to test.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions