|
1 |
| -# Early and Late Bound Variables |
| 1 | +# Early and Late Bound Parameter Definitions |
2 | 2 |
|
3 |
| -In Rust, item definitions (like `fn`) can often have generic parameters, which |
4 |
| -are always [_universally_ quantified][quant]. That is, if you have a function |
5 |
| -like |
| 3 | +Understanding this page likely requires a rudimentary understanding of higher ranked |
| 4 | +trait bounds/`for<'a>` and also what types such as `dyn for<'a> Trait<'a>` and |
| 5 | +`for<'a> fn(&'a u32)` mean. Reading [the nomicon chapter](https://doc.rust-lang.org/nomicon/hrtb.html) |
| 6 | +on HRTB may be useful for understanding this syntax. The meaning of `for<'a> fn(&'a u32)` |
| 7 | +is incredibly similar to the meaning of `T: for<'a> Trait<'a>`. |
6 | 8 |
|
| 9 | +If you are looking for information on the `RegionKind` variants `ReLateBound` and `ReEarlyBound`, |
| 10 | +you should look at the section on [bound vars and params](./bound-vars-and-params.md). This section |
| 11 | +discusses what makes generic parameters on functions and closures late/early bound, not the general |
| 12 | +concept of bound vars and generic parameters, which `RegionKind` names in a way that is easily confused |
| 13 | +with this topic. |
| 14 | + |
| 15 | +## What does it mean for parameters to be early or late bound |
| 16 | + |
| 17 | +All function definitions conceptually have a corresponding zero-sized type, or ZST (this is represented by `TyKind::FnDef` in rustc). |
| 18 | +The only generics on this ZST are the early bound parameters of the function definition. e.g. |
7 | 19 | ```rust
|
8 |
| -fn foo<T>(x: T) { } |
| 20 | +fn foo<'a>(_: &'a u32) {} |
| 21 | + |
| 22 | +fn main() { |
| 23 | + let b = foo; |
| 24 | + // ^ `b` has type `FnDef(foo, [])` (no substs because `'a` is late bound) |
| 25 | + assert!(std::mem::size_of_val(&b) == 0); |
| 26 | +} |
9 | 27 | ```
|
10 | 28 |
|
11 |
| -this function is defined "for all T" (not "for some specific T", which would be |
12 |
| -[_existentially_ quantified][quant]). |
| 29 | +In order to call `b` the late bound parameters do need to be provided; they are inferred at the |
| 30 | +call site rather than when we refer to `foo`. |
| 31 | +```rust |
| 32 | +fn main() { |
| 33 | + let b = foo; |
| 34 | + let a: &'static u32 = &10; |
| 35 | + b(a); |
| 36 | + // the lifetime argument for `'a` on `foo` is inferred at the call site; |
| 37 | + // here the generic parameter `'a` on `foo` is inferred to be `'static` |
| 38 | +} |
| 39 | +``` |
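As a rough, self-contained sketch of the point above: the same zero-sized value can be called several times, and each call site picks its own lifetime argument. The helper `call_twice` and the `u32` return type are illustrative additions, not part of the original example.

```rust
// `'a` appears in no where clause, so it is late bound.
fn foo<'a>(x: &'a u32) -> u32 {
    *x
}

// Hypothetical helper: calls the same function item value twice,
// with references that live for different amounts of time.
fn call_twice() -> (u32, u32) {
    // Referring to `foo` does not pick a lifetime for `'a`.
    let b = foo;
    // The function item type carries no data.
    assert_eq!(std::mem::size_of_val(&b), 0);

    // First call: the argument is a `'static` reference.
    let long_lived: &'static u32 = &10;
    let first = b(long_lived);

    // Second call: the argument borrows a short-lived local.
    let second = {
        let short_lived = 20;
        b(&short_lived)
    };

    (first, second)
}

fn main() {
    assert_eq!(call_twice(), (10, 20));
}
```

Nothing about `b` changes between the two calls; only the lifetime argument inferred at each call site differs.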
| 40 | + |
| 41 | +Because late bound parameters are not part of the `FnDef`'s substs this allows us to prove trait |
| 42 | +bounds such as `F: for<'a> Fn(&'a u32)` where `F` is `foo`'s `FnDef`. e.g. |
| 43 | +```rust |
| 44 | +fn foo_early<'a, T: Trait<'a>>(_: &'a u32, _: T) {} |
| 45 | +fn foo_late<'a, T>(_: &'a u32, _: T) {} |
| 46 | + |
| 47 | +fn accepts_hr_func<F: for<'a> Fn(&'a u32, u32)>(_: F) {} |
| 48 | + |
| 49 | +fn main() { |
| 50 | + // doesn't work, the substituted bound is `for<'a> FnDef<'?0>: Fn(&'a u32, u32)` |
| 51 | + // `foo_early` only implements `for<'a> FnDef<'a>: Fn(&'a u32, u32)`: the lifetime |
| 52 | + // of the borrow in the function argument must be the same as the lifetime |
| 53 | + // on the `FnDef`. |
| 54 | + accepts_hr_func(foo_early); |
| 55 | + |
| 56 | + // works, the substituted bound is `for<'a> FnDef: Fn(&'a u32, u32)` |
| 57 | + accepts_hr_func(foo_late); |
| 58 | +} |
| 59 | + |
| 60 | +// the builtin `Fn` impls for `foo_early` and `foo_late` look something like: |
| 61 | +// `foo_early` |
| 62 | +impl<'a, T: Trait<'a>> Fn(&'a u32, T) for FooEarlyFnDef<'a, T> { ... } |
| 63 | +// `foo_late` |
| 64 | +impl<'a, T> Fn(&'a u32, T) for FooLateFnDef<T> { ... } |
| 65 | + |
| 66 | +``` |
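The example above is schematic (`Trait` is not defined, and one call is meant to fail). A compilable variant might look like the following sketch. The marker trait `Trait`, its impl for `u32`, the `u32` return types, and the helper `accepts_one_lifetime` are all assumptions added for illustration.

```rust
// Hypothetical marker trait, only here so that `'a` can appear in a where clause.
trait Trait<'a> {}
impl<'a> Trait<'a> for u32 {}

// `'a` is used in the `T: Trait<'a>` bound, so it is early bound.
fn foo_early<'a, T: Trait<'a>>(x: &'a u32, _: T) -> u32 {
    *x
}

// `'a` appears only in the argument types, so it is late bound.
fn foo_late<'a, T>(x: &'a u32, _: T) -> u32 {
    *x
}

// Requires `F` to work for every lifetime `'a`.
fn accepts_hr_func<F: for<'a> Fn(&'a u32, u32) -> u32>(f: F) -> u32 {
    let local = 7;
    f(&local, 0)
}

// Requires `F` to work for one caller-chosen lifetime only.
fn accepts_one_lifetime<'a, F: Fn(&'a u32, u32) -> u32>(f: F, x: &'a u32) -> u32 {
    f(x, 0)
}

fn main() {
    // The late bound version satisfies the higher ranked bound.
    assert_eq!(accepts_hr_func(foo_late), 7);

    // The early bound version is still usable when a single lifetime suffices.
    let v = 3;
    assert_eq!(accepts_one_lifetime(foo_early, &v), 3);

    // Per the explanation above, this line would be rejected if uncommented:
    // accepts_hr_func(foo_early);
}
```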
| 67 | + |
| 68 | +Early bound parameters are present on the `FnDef`. Late bound generic parameters are not present |
| 69 | +on the `FnDef` but are instead constrained by the builtin `Fn*` impl. |
| 70 | + |
| 71 | +The same distinction applies to closures. Instead of `FnDef` we are talking about the anonymous |
| 72 | +closure type. Closures are [currently unsound](https://github.com/rust-lang/rust/issues/84366) in |
| 73 | +ways that are closely related to the distinction between early/late bound |
| 74 | +parameters (more on this later). |
| 75 | + |
| 76 | +The early/late boundness of generic parameters is only relevant for the desugaring of |
| 77 | +functions/closures into types with builtin `Fn*` impls. It does not make sense to talk about it |
| 78 | +in other contexts. |
| 79 | + |
| 80 | +The `generics_of` query in rustc only contains early bound parameters. In this way |
| 81 | +`generics_of(my_func)` acts more like the generics of the `FnDef` than the generics provided to the function |
| 82 | +body, although it's not clear to the author of this section whether this was the actual justification for |
| 83 | +making `generics_of` behave this way. |
13 | 84 |
|
14 |
| -[quant]: ./appendix/background.md#quantified |
| 85 | +## What parameters are currently late bound |
15 | 86 |
|
16 |
| -While Rust *items* can be quantified over types, lifetimes, and constants, the |
17 |
| -types of values in Rust are only ever quantified over lifetimes. So you can |
18 |
| -have a type like `for<'a> fn(&'a u32)`, which represents a function pointer |
19 |
| -that takes a reference with any lifetime, or `for<'a> dyn Trait<'a>`, which is |
20 |
| -a `dyn` trait for a trait implemented for any lifetime; but we have no type |
21 |
| -like `for<T> fn(T)`, which would be a function that takes a value of *any type* |
22 |
| -as a parameter. This is a consequence of monomorphization -- to support a value |
23 |
| -of type `for<T> fn(T)`, we would need a single function pointer that can be |
24 |
| -used for a parameter of any type, but in Rust we generate customized code for |
25 |
| -each parameter type. |
| 87 | +Below are the current requirements for determining if a generic parameter is late bound. It is worth |
| 88 | +keeping in mind that these are not necessarily set in stone and it is almost certainly possible to |
| 89 | +be more flexible. |
26 | 90 |
|
27 |
| -One consequence of this asymmetry is a weird split in how we represent some |
28 |
| -generic types: _early-_ and _late-_ bound parameters. |
29 |
| -Basically, if we cannot represent a type (e.g. a universally quantified type), |
30 |
| -we have to bind it _early_ so that the unrepresentable type is never around. |
| 91 | +### Must be a lifetime parameter |
31 | 92 |
|
32 |
| -Consider the following example: |
| 93 | +Rust can't support types such as `for<T> dyn Trait<T>` or `for<T> fn(T)`. This is a |
| 94 | +fundamental limitation of the language, as we are required to monomorphize type/const |
| 95 | +parameters and cannot do so behind dynamic dispatch. (Technically we could probably |
| 96 | +support `for<T> dyn MarkerTrait<T>`, as there is nothing to monomorphize.) |
33 | 97 |
|
34 |
| -```rust,ignore |
35 |
| -fn foo<'a, 'b, T>(x: &'a u32, y: &'b T) where T: 'b { ... } |
| 98 | +Not being able to support `for<T> dyn Trait<T>` resulted in making all type and const |
| 99 | +parameters early bound. Only lifetime parameters can be late bound. |
| 100 | + |
| 101 | +### Must not appear in the where clauses |
| 102 | + |
| 103 | +In order for a generic parameter to be late bound it must not appear in any where clauses. |
| 104 | +This is currently an incredibly simplistic check that causes lifetimes to be early bound even |
| 105 | +if the where clauses they appear in are always true, or implied by well-formedness of function |
| 106 | +arguments. e.g. |
| 107 | +```rust |
| 108 | +fn foo1<'a: 'a>(_: &'a u32) {} |
| 109 | +// ^^ early bound parameter because it's in a `'a: 'a` clause |
| 110 | +// even though the bound obviously holds all the time |
| 111 | +fn foo2<'a, T: Trait<'a>>(a: T, b: &'a u32) {} |
| 112 | +// ^^ early bound parameter because it's used in the `T: Trait<'a>` clause |
| 113 | +fn foo3<'a, T: 'a>(_: &'a T) {} |
| 114 | +// ^^ early bound parameter because it's used in the `T: 'a` clause |
| 115 | +// even though that bound is implied by wellformedness of `&'a T` |
| 116 | +fn foo4<'a, 'b: 'a>(_: Inv<&'a ()>, _: Inv<&'b ()>) {} |
| 117 | +// ^^ ^^ ^^^ note: |
| 118 | +// ^^ ^^ `Inv` stands for `Invariant` and is used to |
| 119 | +// ^^ ^^ make the type parameter invariant. This |
| 120 | +// ^^ ^^ is necessary for demonstration purposes as |
| 121 | +// ^^ ^^ `for<'a, 'b> fn(&'a (), &'b ())` and |
| 122 | +// ^^ ^^ `for<'a> fn(&'a (), &'a ())` are subtypes |
| 123 | +// ^^ ^^ of each other, which makes the bound trivially |
| 124 | +// ^^ ^^ satisfiable when making the fnptr. `Inv` |
| 125 | +// ^^ ^^ disables this subtyping. |
| 126 | +// ^^ ^^ |
| 127 | +// ^^^^^^ both early bound parameters because they are present in the |
| 128 | +// `'b: 'a` clause |
36 | 129 | ```
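One way to observe this rule from the outside is through coercion to function pointers. The sketch below assumes two otherwise identical functions that differ only in a trivially true `'a: 'a` clause. The names `late`, `early`, `call_late_via_ptr` and `call_early_via_ptr` are made up for illustration.

```rust
// Late bound: `'a` does not appear in any where clause.
fn late<'a>(x: &'a u32) -> u32 {
    *x
}

// Early bound: the `'a: 'a` clause is enough to trigger the check.
fn early<'a: 'a>(x: &'a u32) -> u32 {
    *x
}

// The late bound function coerces to a higher ranked function pointer,
// which can then be called with references of different lifetimes.
fn call_late_via_ptr() -> u32 {
    let p: for<'a> fn(&'a u32) -> u32 = late;
    let local = 5;
    p(&local) + p(&10)
}

// The early bound function coerces to a pointer for one specific lifetime.
fn call_early_via_ptr<'x>(x: &'x u32) -> u32 {
    let p: fn(&'x u32) -> u32 = early;
    p(x)
}

fn main() {
    assert_eq!(call_late_via_ptr(), 15);
    assert_eq!(call_early_via_ptr(&4), 4);

    // Under the rules described above, this coercion should be rejected,
    // because `'a` on `early` is fixed when `early` is named:
    // let p: for<'a> fn(&'a u32) -> u32 = early;
}
```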
|
37 | 130 |
|
38 |
| -We cannot treat `'a`, `'b`, and `T` in the same way. Types in Rust can't have |
39 |
| -`for<T> { .. }`, only `for<'a> {...}`, so whenever you reference `foo` the type |
40 |
| -you get back can't be `for<'a, 'b, T> fn(&'a u32, y: &'b T)`. Instead, the `T` |
41 |
| -must be substituted early. In particular, you have: |
| 131 | +The reason for this requirement is that we cannot represent the `T: Trait<'a>` or `'b: 'a` clauses |
| 132 | +on a function pointer. `for<'a, 'b> fn(Inv<&'a ()>, Inv<&'b ()>)` is not a valid function pointer to |
| 133 | +represent `foo4` as it would allow calling the function without `'b: 'a` holding. |
42 | 134 |
|
43 |
| -```rust,ignore |
44 |
| -let x = foo; // T, 'b have to be substituted here |
45 |
| -x(...); // 'a substituted here, at the point of call |
46 |
| -x(...); // 'a substituted here with a different value |
| 135 | +### Must be constrained by where clauses or function argument types |
| 136 | + |
| 137 | +The builtin impls of the `Fn*` traits for closures and `FnDef`s cannot have any unconstrained |
| 138 | +parameters. For example the following impl is illegal: |
| 139 | +```rust |
| 140 | +impl<'a> Trait for u32 { type Assoc = &'a u32; } |
| 141 | +``` |
| 142 | +We must not end up with a similar impl for the `Fn*` traits e.g. |
| 143 | +```rust |
| 144 | +impl<'a> Fn<()> for FnDef { type Assoc = &'a u32 } |
47 | 145 | ```
|
48 | 146 |
|
49 |
| -## Early-bound parameters |
| 147 | +Violating this rule can trivially lead to unsoundness as seen in [#84366](https://github.com/rust-lang/rust/issues/84366). |
| 148 | +Additionally, if we ever support late bound type params, then an impl like: |
| 149 | +```rust |
| 150 | +impl<T> Fn<()> for FnDef { type Assoc = T; } |
| 151 | +``` |
| 152 | +would break the compiler in various ways. |
50 | 153 |
|
51 |
| -Early-bound parameters in rustc are identified by an index, stored in the |
52 |
| -[`ParamTy`] struct for types or the [`EarlyBoundRegion`] struct for lifetimes. |
53 |
| -The index counts from the outermost declaration in scope. This means that as you |
54 |
| -add more binders inside, the index doesn't change. |
| 154 | +In order to ensure that everything functions correctly, we do not allow generic parameters to |
| 155 | +be late bound if it would result in a builtin impl that does not constrain all of the generic |
| 156 | +parameters on the builtin impl. Making a generic parameter be early bound trivially makes it be |
| 157 | +constrained by the builtin impl as it ends up on the self type. |
55 | 158 |
|
56 |
| -For example, |
| 159 | +Because of the requirement that late bound parameters must not appear in where clauses, checking |
| 160 | +this is simpler than the rules for checking impl headers constrain all the parameters on the impl. |
| 161 | +We only have to ensure that all late bound parameters appear at least once in the function argument |
| 162 | +types outside of an alias (e.g. an associated type). |
57 | 163 |
|
58 |
| -```rust,ignore |
59 |
| -trait Foo<T> { |
60 |
| - type Bar<U> = (Self, T, U); |
61 |
| -} |
| 164 | +The reason a parameter does not count when it only appears (even indirectly) in the substs of an alias is the |
| 165 | +same reason the following code is forbidden: |
| 166 | +```rust |
| 167 | +impl<T: Trait> OtherTrait for <T as Trait>::Assoc { type Assoc = T } |
62 | 168 | ```
|
| 169 | +There is no guarantee that `<T as Trait>::Assoc` will normalize to different types for every |
| 170 | +instantiation of `T`. If we were to allow this impl we could get overlapping impls and the |
| 171 | +same is true of the builtin `Fn*` impls. |
| 172 | + |
| 173 | +## Making more generic parameters late bound |
| 174 | + |
| 175 | +It is generally considered desirable for more parameters to be late bound as it makes |
| 176 | +the builtin `Fn*` impls more flexible. Right now many of the requirements for making |
| 177 | +a parameter late bound are overly restrictive as they are tied to what we can currently |
| 178 | +(or can ever) do with fn ptrs. |
| 179 | + |
| 180 | +It would be theoretically possible to support late bound params in `where`-clauses in the |
| 181 | +language by introducing implication types which would allow us to express types such as: |
| 182 | +`for<'a, 'b: 'a> fn(Inv<&'a u32>, Inv<&'b u32>)` which would ensure `'b: 'a` is upheld when |
| 183 | +calling the function pointer. |
| 184 | + |
| 185 | +It would also be theoretically possible to support it by making the coercion to a fn ptr |
| 186 | +instantiate the parameter with an infer var while still allowing the FnDef to not have the |
| 187 | +generic parameter present as trait impls are perfectly capable of representing the where clauses |
| 188 | +on the function on the impl itself. This would also allow us to support late bound type/const |
| 189 | +vars allowing bounds like `F: for<T> Fn(T)` to hold. |
| 190 | + |
| 191 | +It is also somewhat unclear if we can change the `Fn` traits to be structured differently |
| 192 | +so that we never have to make a parameter early bound just to make the builtin impl have all |
| 193 | +generics be constrained. Of all the possible causes of a generic parameter being early bound |
| 194 | +this seems the most difficult to remove. |
| 195 | + |
| 196 | +Whether these would be good ideas to implement is a separate question; they are only brought |
| 197 | +up to illustrate that the current rules are not necessarily set in stone or a result of |
| 198 | +"it's the only way of doing this". |
63 | 199 |
|
64 |
| -Here, the type `(Self, T, U)` would be `($0, $1, $2)`, where `$N` means a |
65 |
| -[`ParamTy`] with the index of `N`. |
66 |
| - |
67 |
| -In rustc, the [`Generics`] structure carries this information. So the |
68 |
| -[`Generics`] for `Bar` above would be just like for `U` and would indicate the |
69 |
| -'parent' generics of `Foo`, which declares `Self` and `T`. You can read more |
70 |
| -in [this chapter](./generics.md). |
71 |
| - |
72 |
| -[`ParamTy`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.ParamTy.html |
73 |
| -[`EarlyBoundRegion`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.EarlyBoundRegion.html |
74 |
| -[`Generics`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Generics.html |
75 |
| - |
76 |
| -## Late-bound parameters |
77 |
| - |
78 |
| -Late-bound parameters in `rustc` are handled quite differently (they are also |
79 |
| -specialized to lifetimes since, right now, only late-bound lifetimes are |
80 |
| -supported, though with GATs that has to change). We indicate their potential |
81 |
| -presence by a [`Binder`] type. The [`Binder`] doesn't know how many variables |
82 |
| -there are at that binding level. This can only be determined by walking the |
83 |
| -type itself and collecting them. So a type like `for<'a, 'b> ('a, 'b)` would be |
84 |
| -`for (^0.a, ^0.b)`. Here, we just write `for` because we don't know the names |
85 |
| -of the things bound within. |
86 |
| - |
87 |
| -Moreover, a reference to a late-bound lifetime is written `^0.a`: |
88 |
| - |
89 |
| -- The `0` is the index; it identifies that this lifetime is bound in the |
90 |
| - innermost binder (the `for`). |
91 |
| -- The `a` is the "name"; late-bound lifetimes in rustc are identified by a |
92 |
| - "name" -- the [`BoundRegionKind`] enum. This enum can contain a |
93 |
| - [`DefId`][defid] or it might have various "anonymous" numbered names. The |
94 |
| - latter arise from types like `fn(&u32, &u32)`, which are equivalent to |
95 |
| - something like `for<'a, 'b> fn(&'a u32, &'b u32)`, but the names of those |
96 |
| - lifetimes must be generated. |
97 |
| - |
98 |
| -This setup of not knowing the full set of variables at a binding level has some |
99 |
| -advantages and some disadvantages. The disadvantage is that you must walk the |
100 |
| -type to find out what is bound at the given level and so forth. The advantage |
101 |
| -is primarily that, when constructing types from Rust syntax, if we encounter |
102 |
| -anonymous regions like in `fn(&u32)`, we just create a fresh index and don't have |
103 |
| -to update the binder. |
104 |
| - |
105 |
| -[`Binder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Binder.html |
106 |
| -[`BoundRegionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/enum.BoundRegionKind.html |
107 |
| -[defid]: ./hir.html#identifiers-in-the-hir |
|