Commit 59f1356

BoxyUwU authored and compiler-errors committed
split file and replace with deep dive doc
1 parent 1726dad commit 59f1356

3 files changed

+242
-88
lines changed

src/SUMMARY.md

Lines changed: 2 additions & 1 deletion
@@ -109,9 +109,10 @@
109109
- [`TypeFolder` and `TypeFoldable`](./ty-fold.md)
110110
- [Generic arguments](./generic_arguments.md)
111111
- [Constants in the type system](./constants.md)
112+
- [Bound vars and Parameters](./bound-vars-and-params.md)
112113
- [Type inference](./type-inference.md)
113114
- [Trait solving](./traits/resolution.md)
114-
- [Early and Late Bound Parameters](./early-late-bound.md)
115+
- [Early and Late Bound Parameter Definitions](./early-late-bound.md)
115116
- [Higher-ranked trait bounds](./traits/hrtb.md)
116117
- [Caching subtleties](./traits/caching.md)
117118
- [Specialization](./traits/specialization.md)

src/bound-vars-and-params.md

Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,61 @@
1+
# Bound vars and parameters
2+
3+
## Early-bound parameters
4+
5+
Early-bound parameters in rustc are identified by an index, stored in the
6+
[`ParamTy`] struct for types or the [`EarlyBoundRegion`] struct for lifetimes.
7+
The index counts from the outermost declaration in scope. This means that as you
8+
add more binders inside, the index doesn't change.
9+
10+
For example,
11+
12+
```rust,ignore
13+
trait Foo<T> {
14+
type Bar<U> = (Self, T, U);
15+
}
16+
```
17+
18+
Here, the type `(Self, T, U)` would be `($0, $1, $2)`, where `$N` means a
19+
[`ParamTy`] with the index of `N`.
20+
21+
In rustc, the [`Generics`] structure carries this information. So the
22+
[`Generics`] for `Bar` above would be just like for `U` and would indicate the
23+
'parent' generics of `Foo`, which declares `Self` and `T`. You can read more
24+
in [this chapter](./generics.md).
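
The index scheme above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyGenerics` and its methods are hypothetical names invented here and are not rustc's actual `Generics` API. It shows how a child item's parameters are numbered after those of its parent, so that `Self`, `T`, `U` come out as `$0`, `$1`, `$2`.

```rust
/// Toy stand-in for rustc's `Generics`; every name here is hypothetical.
struct ToyGenerics<'p> {
    /// Generics of the enclosing item, if any.
    parent: Option<&'p ToyGenerics<'p>>,
    /// Names of the parameters declared directly on this item.
    own_params: Vec<&'static str>,
}

impl<'p> ToyGenerics<'p> {
    /// Number of parameters declared by all enclosing items.
    fn parent_count(&self) -> usize {
        self.parent.map_or(0, |p| p.count())
    }

    /// Total number of parameters in scope for this item.
    fn count(&self) -> usize {
        self.parent_count() + self.own_params.len()
    }

    /// Index of `name`, counting from the outermost declaration in scope.
    fn index_of(&self, name: &str) -> Option<usize> {
        match self.own_params.iter().position(|p| *p == name) {
            Some(i) => Some(self.parent_count() + i),
            None => self.parent.and_then(|p| p.index_of(name)),
        }
    }
}

fn main() {
    // trait Foo<T> { type Bar<U> = (Self, T, U); }
    let foo = ToyGenerics { parent: None, own_params: vec!["Self", "T"] };
    let bar = ToyGenerics { parent: Some(&foo), own_params: vec!["U"] };

    assert_eq!(bar.index_of("Self"), Some(0));
    assert_eq!(bar.index_of("T"), Some(1));
    assert_eq!(bar.index_of("U"), Some(2));
}
```

Because the numbering starts at the outermost item, adding another nested item below `Bar` would not change any of these three indices.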
25+
26+
[`ParamTy`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.ParamTy.html
27+
[`EarlyBoundRegion`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.EarlyBoundRegion.html
28+
[`Generics`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Generics.html
29+
30+
## Late-bound parameters
31+
32+
Late-bound parameters in `rustc` are handled quite differently (they are also
33+
specialized to lifetimes since, right now, only late-bound lifetimes are
34+
supported, though with GATs that has to change). We indicate their potential
35+
presence by a [`Binder`] type. The [`Binder`] doesn't know how many variables
36+
there are at that binding level. This can only be determined by walking the
37+
type itself and collecting them. So a type like `for<'a, 'b> ('a, 'b)` would be
38+
`for (^0.a, ^0.b)`. Here, we just write `for` because we don't know the names
39+
of the things bound within.
40+
41+
Moreover, a reference to a late-bound lifetime is written `^0.a`:
42+
43+
- The `0` is the index; it identifies that this lifetime is bound in the
44+
innermost binder (the `for`).
45+
- The `a` is the "name"; late-bound lifetimes in rustc are identified by a
46+
"name" -- the [`BoundRegionKind`] enum. This enum can contain a
47+
[`DefId`][defid] or it might have various "anonymous" numbered names. The
48+
latter arise from types like `fn(&u32, &u32)`, which are equivalent to
49+
something like `for<'a, 'b> fn(&'a u32, &'b u32)`, but the names of those
50+
lifetimes must be generated.
51+
52+
This setup of not knowing the full set of variables at a binding level has some
53+
advantages and some disadvantages. The disadvantage is that you must walk the
54+
type to find out what is bound at the given level and so forth. The advantage
55+
is primarily that, when constructing types from Rust syntax, if we encounter
56+
anonymous regions like in `fn(&u32)`, we just create a fresh index and don't have
57+
to update the binder.
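
The claim that `fn(&u32, &u32)` is equivalent to an explicitly quantified type can be checked from ordinary user code. The sketch below is not part of the compiler; `add` and `same_type` are names made up for the illustration. It compares the `TypeId` of the elided and the explicit spelling of the function pointer type.

```rust
use std::any::TypeId;

fn add(x: &u32, y: &u32) -> u32 {
    *x + *y
}

/// Returns true when the elided and the explicit spelling name the same type.
fn same_type() -> bool {
    TypeId::of::<fn(&u32, &u32) -> u32>()
        == TypeId::of::<for<'a, 'b> fn(&'a u32, &'b u32) -> u32>()
}

fn main() {
    // Elided lifetimes in a fn pointer type introduce anonymous bound lifetimes.
    let elided: fn(&u32, &u32) -> u32 = add;
    // The explicit `for<'a, 'b>` spelling accepts the same value.
    let explicit: for<'a, 'b> fn(&'a u32, &'b u32) -> u32 = elided;

    assert_eq!(explicit(&1, &2), 3);
    assert!(same_type());
}
```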
58+
59+
[`Binder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Binder.html
60+
[`BoundRegionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/enum.BoundRegionKind.html
61+
[defid]: ./hir.html#identifiers-in-the-hir

src/early-late-bound.md

Lines changed: 179 additions & 87 deletions
Original file line numberDiff line numberDiff line change
@@ -1,107 +1,199 @@
1-
# Early and Late Bound Variables
1+
# Early and Late Bound Parameter Definitions
22

3-
In Rust, item definitions (like `fn`) can often have generic parameters, which
4-
are always [_universally_ quantified][quant]. That is, if you have a function
5-
like
3+
Understanding this page likely requires a rudimentary understanding of higher ranked
4+
trait bounds/`for<'a>` and also what types such as `dyn for<'a> Trait<'a>` and
5+
`for<'a> fn(&'a u32)` mean. Reading [the nomicon chapter](https://doc.rust-lang.org/nomicon/hrtb.html)
6+
on HRTB may be useful for understanding this syntax. The meaning of `for<'a> fn(&'a u32)`
7+
is incredibly similar to the meaning of `T: for<'a> Trait<'a>`.
68

9+
If you are looking for information on the `RegionKind` variants `ReLateBound` and `ReEarlyBound`
10+
you should look at the section on [bound vars and params](./bound-vars-and-params.md). This section
11+
discusses what makes generic parameters on functions and closures late/early bound, not the general
12+
concept of bound vars and generic parameters, which `RegionKind` names in a way that is easily confused
13+
with this topic.
14+
15+
## What does it mean for parameters to be early or late bound
16+
17+
Every function definition conceptually has a zero-sized type (ZST); this is represented by `TyKind::FnDef` in rustc.
18+
The only generics on this ZST are the early bound parameters of the function definition. e.g.
719
```rust
8-
fn foo<T>(x: T) { }
20+
fn foo<'a>(_: &'a u32) {}
21+
22+
fn main() {
23+
let b = foo;
24+
// ^ `b` has type `FnDef(foo, [])` (no substs because `'a` is late bound)
25+
assert!(std::mem::size_of_val(&b) == 0);
26+
}
927
```
1028

11-
this function is defined "for all T" (not "for some specific T", which would be
12-
[_existentially_ quantified][quant]).
29+
In order to call `b` the late bound parameters do need to be provided; these are inferred at the
30+
call site instead of when we refer to `foo`.
31+
```rust
32+
fn main() {
33+
let b = foo;
34+
let a: &'static u32 = &10;
35+
b(a);
36+
// the lifetime argument for `'a` on `foo` is inferred at the callsite
37+
// the generic parameter `'a` on `foo` is inferred to `'static` here
38+
}
39+
```
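
One way to observe the difference from user code is that an early bound lifetime can be supplied when the function is named, while a late bound one is only chosen at each call. The following is a minimal sketch; `early` and `late` are names invented for the illustration, and `early` relies on the rule described further down this page that a lifetime appearing in a where clause (here `'a: 'a`) is early bound.

```rust
// `'a: 'a` places `'a` in a where clause, which makes it early bound.
fn early<'a: 'a>(x: &'a u32) -> &'a u32 {
    x
}

// `'a` only appears in an argument type, so it is late bound.
fn late<'a>(x: &'a u32) -> &'a u32 {
    x
}

fn main() {
    // Early bound: the lifetime argument is part of the `FnDef`, so it can be
    // given when we refer to the function.
    let f = early::<'static>;
    let r: &'static u32 = f(&10);
    assert_eq!(*r, 10);

    // Late bound: nothing is supplied here; each call picks its own lifetime.
    let g = late;
    let local = 5;
    assert_eq!(*g(&local), 5);
    assert_eq!(*g(&10), 10);

    // Both function item types are zero-sized.
    assert_eq!(std::mem::size_of_val(&f), 0);
    assert_eq!(std::mem::size_of_val(&g), 0);
}
```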
40+
41+
Because late bound parameters are not part of the `FnDef`'s substs this allows us to prove trait
42+
bounds such as `F: for<'a> Fn(&'a u32)` where `F` is `foo`'s `FnDef`. e.g.
43+
```rust
44+
fn foo_early<'a, T: Trait<'a>>(_: &'a u32, _: T) {}
45+
fn foo_late<'a, T>(_: &'a u32, _: T) {}
46+
47+
fn accepts_hr_func<F: for<'a> Fn(&'a u32, u32)>(_: F) {}
48+
49+
fn main() {
50+
// doesn't work, the substituted bound is `for<'a> FnDef<'?0>: Fn(&'a u32, u32)`
51+
// `foo_early` only implements `for<'a> FnDef<'a>: Fn(&'a u32, u32)`, i.e. the lifetime
52+
// of the borrow in the function argument must be the same as the lifetime
53+
// on the `FnDef`.
54+
accepts_hr_func(foo_early);
55+
56+
// works, the substituted bound is `for<'a> FnDef: Fn(&'a u32, u32)`
57+
accepts_hr_func(foo_late);
58+
}
59+
60+
// the builtin `Fn` impls for `foo_early` and `foo_late` look something like:
61+
// `foo_early`
62+
impl<'a, T: Trait<'a>> Fn(&'a u32, T) for FooEarlyFnDef<'a, T> { ... }
63+
// `foo_late`
64+
impl<'a, T> Fn(&'a u32, T) for FooLateFnDef<T> { ... }
65+
66+
```
67+
68+
Early bound parameters are present on the `FnDef`. Late bound generic parameters are not present
69+
on the `FnDef` but are instead constrained by the builtin `Fn*` impl.
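
A reduced, compilable variant of the example above is sketched here. The names `call_hr`, `foo_early` and `foo_late` are chosen for the illustration, and `'a: 'a` is used instead of a `Trait<'a>` bound to force early boundness. It also shows a common workaround: wrapping the early bound function in a closure, so that the function is instantiated separately inside each call.

```rust
// Early bound because `'a` appears in a where clause.
fn foo_early<'a: 'a>(x: &'a u32) -> u32 {
    *x
}

// Late bound because `'a` only appears in an argument type.
fn foo_late<'a>(x: &'a u32) -> u32 {
    *x
}

// Requires `F` to accept a reference of any lifetime.
fn call_hr<F: for<'a> Fn(&'a u32) -> u32>(f: F) -> u32 {
    let local = 7;
    f(&local)
}

fn main() {
    // Works: the `FnDef` of `foo_late` has no lifetime argument.
    assert_eq!(call_hr(foo_late), 7);

    // Expected to be rejected, since the `FnDef` of `foo_early` is tied to one
    // specific lifetime:
    // call_hr(foo_early);

    // Works: the closure is higher ranked, and `foo_early` is instantiated
    // with the lifetime of `x` inside the closure body.
    assert_eq!(call_hr(|x| foo_early(x)), 7);
}
```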
70+
71+
The same distinction applies to closures. Instead of `FnDef` we are talking about the anonymous
72+
closure type. Closures are [currently unsound](https://github.com/rust-lang/rust/issues/84366) in
73+
ways that are closely related to the distinction between early/late bound
74+
parameters (more on this later).
75+
76+
The early/late boundness of generic parameters is only relevant for the desugaring of
77+
functions/closures into types with builtin `Fn*` impls. It does not make sense to talk about it
78+
in other contexts.
79+
80+
The `generics_of` query in rustc only contains early bound parameters. In this way it acts more
81+
as if `generics_of(my_func)` were the generics of the `FnDef` rather than the generics provided to the function
82+
body, although it's not clear to the author of this section if this was the actual justification for
83+
making `generics_of` behave this way.
1384

14-
[quant]: ./appendix/background.md#quantified
85+
## What parameters are currently late bound
1586

16-
While Rust *items* can be quantified over types, lifetimes, and constants, the
17-
types of values in Rust are only ever quantified over lifetimes. So you can
18-
have a type like `for<'a> fn(&'a u32)`, which represents a function pointer
19-
that takes a reference with any lifetime, or `for<'a> dyn Trait<'a>`, which is
20-
a `dyn` trait for a trait implemented for any lifetime; but we have no type
21-
like `for<T> fn(T)`, which would be a function that takes a value of *any type*
22-
as a parameter. This is a consequence of monomorphization -- to support a value
23-
of type `for<T> fn(T)`, we would need a single function pointer that can be
24-
used for a parameter of any type, but in Rust we generate customized code for
25-
each parameter type.
87+
Below are the current requirements for determining if a generic parameter is late bound. It is worth
88+
keeping in mind that these are not necessarily set in stone and it is almost certainly possible to
89+
be more flexible.
2690

27-
One consequence of this asymmetry is a weird split in how we represent some
28-
generic types: _early-_ and _late-_ bound parameters.
29-
Basically, if we cannot represent a type (e.g. a universally quantified type),
30-
we have to bind it _early_ so that the unrepresentable type is never around.
91+
### Must be a lifetime parameter
3192

32-
Consider the following example:
93+
Rust can't support types such as `for<T> dyn Trait<T>` or `for<T> fn(T)`; this is a
94+
fundamental limitation of the language as we are required to monomorphize type/const
95+
parameters and cannot do so behind dynamic dispatch. (Technically we could probably
96+
support `for<T> dyn MarkerTrait<T>` as there is nothing to monomorphize.)
3397

34-
```rust,ignore
35-
fn foo<'a, 'b, T>(x: &'a u32, y: &'b T) where T: 'b { ... }
98+
Not being able to support `for<T> dyn Trait<T>` resulted in making all type and const
99+
parameters early bound. Only lifetime parameters can be late bound.
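
The effect of type parameters always being early bound can be seen in the following sketch, where `id` is a name made up for the illustration: the type argument is fixed at the point where the function is named or coerced, not at each call.

```rust
fn id<T>(x: T) -> T {
    x
}

fn main() {
    // `T` is chosen when the function is named; each choice is a distinct
    // function item type.
    let for_u32 = id::<u32>;
    let for_str = id::<&str>;
    assert_eq!(for_u32(1), 1);
    assert_eq!(for_str("a"), "a");

    // Expected to be rejected, since `T` was already fixed to `u32`:
    // for_u32("a");

    // Coercing to a function pointer also fixes `T`, here to `u32`.
    let ptr: fn(u32) -> u32 = id;
    assert_eq!(ptr(2), 2);
}
```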
100+
101+
### Must not appear in the where clauses
102+
103+
In order for a generic parameter to be late bound it must not appear in any where clauses.
104+
This is currently an incredibly simplistic check that causes lifetimes to be early bound even
105+
if the where clauses they appear in are always true, or implied by well-formedness of function
106+
arguments. e.g.
107+
```rust
108+
fn foo1<'a: 'a>(_: &'a u32) {}
109+
// ^^ early bound parameter because it's in a `'a: 'a` clause
110+
// even though the bound obviously holds all the time
111+
fn foo2<'a, T: Trait<'a>>(a: T, b: &'a u32) {}
112+
// ^^ early bound parameter because it's used in the `T: Trait<'a>` clause
113+
fn foo3<'a, T: 'a>(_: &'a T) {}
114+
// ^^ early bound parameter because it's used in the `T: 'a` clause
115+
// even though that bound is implied by wellformedness of `&'a T`
116+
fn foo4<'a, 'b: 'a>(_: Inv<&'a ()>, _: Inv<&'b ()>) {}
117+
// ^^ ^^ ^^^ note:
118+
// ^^ ^^ `Inv` stands for `Invariant` and is used to
119+
// ^^ ^^ make the type parameter invariant. This
120+
// ^^ ^^ is necessary for demonstration purposes as
121+
// ^^ ^^ `for<'a, 'b> fn(&'a (), &'b ())` and
122+
// ^^ ^^ `for<'a> fn(&'a (), &'a ())` are subtypes
123+
// ^^ ^^ of each other which makes the bound trivially
124+
// ^^ ^^ satisfiable when making the fnptr. `Inv`
125+
// ^^ ^^ disables this subtyping.
126+
// ^^ ^^
127+
// ^^^^^^ both early bound parameters because they are present in the
128+
// `'b: 'a` clause
36129
```
37130

38-
We cannot treat `'a`, `'b`, and `T` in the same way. Types in Rust can't have
39-
`for<T> { .. }`, only `for<'a> {...}`, so whenever you reference `foo` the type
40-
you get back can't be `for<'a, 'b, T> fn(&'a u32, y: &'b T)`. Instead, the `T`
41-
must be substituted early. In particular, you have:
131+
The reason for this requirement is that we cannot represent the `T: Trait<'a>` or `'b: 'a` clauses
132+
on a function pointer. `for<'a, 'b> fn(Inv<&'a ()>, Inv<&'b ()>)` is not a valid function pointer to
133+
represent `foo4` as it would allow calling the function without `'b: 'a` holding.
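
The consequence for function pointers can be sketched as follows. The function names are hypothetical, and `'a: 'a` again stands in for any where clause mentioning the lifetime. A function with a late bound lifetime coerces to a higher-ranked function pointer, while one with an early bound lifetime only coerces to a pointer for one specific lifetime.

```rust
fn late<'a>(x: &'a u32) -> u32 {
    *x
}

fn early<'a: 'a>(x: &'a u32) -> u32 {
    *x
}

/// Coerces `early` to a pointer for the single, caller-chosen lifetime `'x`.
fn call_early_via_ptr<'x>(r: &'x u32) -> u32 {
    let ptr: fn(&'x u32) -> u32 = early;
    ptr(r)
}

fn main() {
    // Late bound: the pointer is usable with references of any lifetime.
    let hr_ptr: for<'a> fn(&'a u32) -> u32 = late;
    let local = 3;
    assert_eq!(hr_ptr(&local), 3);
    assert_eq!(hr_ptr(&4), 4);

    // Expected to be rejected, as `early` is tied to one lifetime:
    // let bad: for<'a> fn(&'a u32) -> u32 = early;

    assert_eq!(call_early_via_ptr(&local), 3);
}
```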
42134

43-
```rust,ignore
44-
let x = foo; // T, 'b have to be substituted here
45-
x(...); // 'a substituted here, at the point of call
46-
x(...); // 'a substituted here with a different value
135+
### Must be constrained by where clauses or function argument types
136+
137+
The builtin impls of the `Fn*` traits for closures and `FnDef`s cannot have any unconstrained
138+
parameters. For example the following impl is illegal:
139+
```rust
140+
impl<'a> Trait for u32 { type Assoc = &'a u32; }
141+
```
142+
We must not end up with a similar impl for the `Fn*` traits e.g.
143+
```rust
144+
impl<'a> Fn<()> for FnDef { type Assoc = &'a u32 }
47145
```
48146

49-
## Early-bound parameters
147+
Violating this rule can trivially lead to unsoundness as seen in [#84366](https://github.com/rust-lang/rust/issues/84366).
148+
Additionally if we ever support late bound type params then an impl like:
149+
```rust
150+
impl<T> Fn<()> for FnDef { type Assoc = T; }
151+
```
152+
would break the compiler in various ways.
50153

51-
Early-bound parameters in rustc are identified by an index, stored in the
52-
[`ParamTy`] struct for types or the [`EarlyBoundRegion`] struct for lifetimes.
53-
The index counts from the outermost declaration in scope. This means that as you
54-
add more binders inside, the index doesn't change.
154+
In order to ensure that everything functions correctly, we do not allow generic parameters to
155+
be late bound if it would result in a builtin impl that does not constrain all of the generic
156+
parameters on the builtin impl. Making a generic parameter be early bound trivially makes it be
157+
constrained by the builtin impl as it ends up on the self type.
55158

56-
For example,
159+
Because of the requirement that late bound parameters must not appear in where clauses, checking
160+
this is simpler than the rules for checking impl headers constrain all the parameters on the impl.
161+
We only have to ensure that all late bound parameters appear at least once in the function argument
162+
types outside of an alias (e.g. an associated type).
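
As a small illustration of this rule, a lifetime that appears only in the return type is not constrained by any argument type and is therefore early bound. In the sketch below, `only_in_return` is a name invented for the example; the fact that the lifetime argument can be written explicitly when naming the function is consistent with it being early bound.

```rust
// `'a` appears only in the return type, so no argument type constrains it.
fn only_in_return<'a>() -> &'a u32 {
    &42
}

fn main() {
    // The lifetime argument is supplied when the function is named.
    let f = only_in_return::<'static>;
    let r: &'static u32 = f();
    assert_eq!(*r, 42);
}
```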
57163

58-
```rust,ignore
59-
trait Foo<T> {
60-
type Bar<U> = (Self, T, U);
61-
}
164+
The reason that appearing only in the substs of an alias does not count is the
165+
same as why the following code is forbidden:
166+
```rust
167+
impl<T: Trait> OtherTrait for <T as Trait>::Assoc { type Assoc = T }
62168
```
169+
There is no guarantee that `<T as Trait>::Assoc` will normalize to different types for every
170+
instantiation of `T`. If we were to allow this impl we could get overlapping impls and the
171+
same is true of the builtin `Fn*` impls.
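
That an associated type need not distinguish between instantiations can be shown directly. In this sketch `Trait`, its impls and `assoc_id` are made up for the illustration: two different `Self` types normalize to the same `Assoc`, so an impl keyed on `<T as Trait>::Assoc` could not tell them apart.

```rust
use std::any::TypeId;

trait Trait {
    type Assoc: 'static;
}

// Two different implementors share the same associated type.
impl Trait for u8 {
    type Assoc = ();
}

impl Trait for u16 {
    type Assoc = ();
}

/// `TypeId` of the type that `<T as Trait>::Assoc` normalizes to.
fn assoc_id<T: Trait>() -> TypeId {
    TypeId::of::<T::Assoc>()
}

fn main() {
    // Distinct `T`, identical normalized alias.
    assert_eq!(assoc_id::<u8>(), assoc_id::<u16>());
}
```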
172+
173+
## Making more generic parameters late bound
174+
175+
It is generally considered desirable for more parameters to be late bound as it makes
176+
the builtin `Fn*` impls more flexible. Right now many of the requirements for making
177+
a parameter late bound are overly restrictive as they are tied to what we can currently
178+
(or can ever) do with fn ptrs.
179+
180+
It would be theoretically possible to support late bound params in `where`-clauses in the
181+
language by introducing implication types which would allow us to express types such as:
182+
`for<'a, 'b: 'a> fn(Inv<&'a u32>, Inv<&'b u32>)` which would ensure `'b: 'a` is upheld when
183+
calling the function pointer.
184+
185+
It would also be theoretically possible to support it by making the coercion to a fn ptr
186+
instantiate the parameter with an infer var while still allowing the FnDef to not have the
187+
generic parameter present, as trait impls are perfectly capable of representing the where clauses
188+
on the function on the impl itself. This would also allow us to support late bound type/const
189+
vars allowing bounds like `F: for<T> Fn(T)` to hold.
190+
191+
It is also somewhat unclear if we can change the `Fn` traits to be structured differently
192+
so that we never have to make a parameter early bound just to make the builtin impl have all
193+
generics be constrained. Of all the possible causes of a generic parameter being early bound
194+
this seems the most difficult to remove.
195+
196+
Whether these would be good ideas to implement is a separate question; they are only brought
197+
up to illustrate that the current rules are not necessarily set in stone, nor simply a result of
198+
"it's the only way of doing this".
63199

64-
Here, the type `(Self, T, U)` would be `($0, $1, $2)`, where `$N` means a
65-
[`ParamTy`] with the index of `N`.
66-
67-
In rustc, the [`Generics`] structure carries this information. So the
68-
[`Generics`] for `Bar` above would be just like for `U` and would indicate the
69-
'parent' generics of `Foo`, which declares `Self` and `T`. You can read more
70-
in [this chapter](./generics.md).
71-
72-
[`ParamTy`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.ParamTy.html
73-
[`EarlyBoundRegion`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.EarlyBoundRegion.html
74-
[`Generics`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Generics.html
75-
76-
## Late-bound parameters
77-
78-
Late-bound parameters in `rustc` are handled quite differently (they are also
79-
specialized to lifetimes since, right now, only late-bound lifetimes are
80-
supported, though with GATs that has to change). We indicate their potential
81-
presence by a [`Binder`] type. The [`Binder`] doesn't know how many variables
82-
there are at that binding level. This can only be determined by walking the
83-
type itself and collecting them. So a type like `for<'a, 'b> ('a, 'b)` would be
84-
`for (^0.a, ^0.b)`. Here, we just write `for` because we don't know the names
85-
of the things bound within.
86-
87-
Moreover, a reference to a late-bound lifetime is written `^0.a`:
88-
89-
- The `0` is the index; it identifies that this lifetime is bound in the
90-
innermost binder (the `for`).
91-
- The `a` is the "name"; late-bound lifetimes in rustc are identified by a
92-
"name" -- the [`BoundRegionKind`] enum. This enum can contain a
93-
[`DefId`][defid] or it might have various "anonymous" numbered names. The
94-
latter arise from types like `fn(&u32, &u32)`, which are equivalent to
95-
something like `for<'a, 'b> fn(&'a u32, &'b u32)`, but the names of those
96-
lifetimes must be generated.
97-
98-
This setup of not knowing the full set of variables at a binding level has some
99-
advantages and some disadvantages. The disadvantage is that you must walk the
100-
type to find out what is bound at the given level and so forth. The advantage
101-
is primarily that, when constructing types from Rust syntax, if we encounter
102-
anonymous regions like in `fn(&u32)`, we just create a fresh index and don't have
103-
to update the binder.
104-
105-
[`Binder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Binder.html
106-
[`BoundRegionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/enum.BoundRegionKind.html
107-
[defid]: ./hir.html#identifiers-in-the-hir
