Description
This may be the spiciest proposal I ever write. Oh man, was I wrong; everyone seems to love this one?
Background
Peer Type Resolution (PTR) is a mechanism to combine a set of arbitrarily many types into a final type which all of the inputs can coerce to. For instance, u8 and u16 peer resolve to u16, while *volatile [4:0]u8 and *const [7:0]u8 peer resolve to [:0]const volatile u8.
The main purpose of PTR is to combine the results of different control flow branches. Opposing branches of a control flow construct are known as "peers", and PTR is applied to the results of those branches to figure out the result of the entire construct. For example:
var t = true; // runtime-known
test {
    const a: *const volatile [4:0]u8 = "abcd";
    const b: *const [7:0]u8 = "abcdefg";
    const x = if (t) a else b;
    @compileLog(@TypeOf(x)); // [:0]const volatile u8
}
The output here shows that the if statement applied Peer Type Resolution to the two "peer expressions" a and b.
This issue will propose removing Peer Type Resolution from Zig. However, we must first note that Peer Type Resolution is also used in three other places in Zig.
- Binary operators like + and &, including some builtins like @addWithOverflow, apply PTR to their operands. This proposal does not propose changing these operators. It is likely that their typing rules will change in the future anyway (see "allow integer types to be any range" #3806 and "introduce a compile error for when a result location provides integer widening ambiguity" #16310), but even if they don't, these operators require only a very small subset of the behavior of modern PTR.
- Passing multiple arguments to the @TypeOf builtin. In this case, the builtin will apply PTR to the types of the operands. This proposal does suggest removing this functionality.
- When switching on a tagged union, a prong containing multiple items with distinct payload types applies PTR to determine the type of the capture; e.g. switch (u) { .my_u32, .my_u16 => |u32_val| ... }. This proposal does suggest removing this functionality.
When we get into the meat of the proposal, I'll discuss these cases in a little more detail.
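For reference, here is a minimal sketch of those three uses as they behave in current Zig. The union U and the helper widen are hypothetical names made up for illustration, and the resolved types in the comments follow the PTR rules described above.

```zig
const std = @import("std");

// Hypothetical tagged union whose payloads have distinct integer types.
const U = union(enum) { my_u32: u32, my_u16: u16 };

// A prong with several items: the capture's type is the peer-resolved
// type of the payloads (u32 here), so it can be returned directly.
fn widen(u: U) u32 {
    return switch (u) {
        .my_u32, .my_u16 => |val| val,
    };
}

test "other places PTR is applied today" {
    const a: u8 = 1;
    const b: u16 = 2;
    // Binary operator: the operands peer resolve to u16.
    try std.testing.expect(@TypeOf(a + b) == u16);
    // Multi-argument @TypeOf: PTR over the operand types.
    try std.testing.expect(@TypeOf(a, b) == u16);
    // Multi-item switch prong on a tagged union.
    try std.testing.expectEqual(@as(u32, 7), widen(.{ .my_u16 = 7 }));
}
```

Under this proposal, only the first of these three checks would be unaffected.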
Problems with PTR
Peer Type Resolution, while commonly accepted as a part of Zig, actually has some problems.
Firstly, it leads to a common way for semantics to differ between runtime and comptime execution. For instance, consider this code:
fn f(b: bool) u32 {
    const opt = if (b) null else @as(u32, 123);
    if (opt) |x| return x;
    return 0;
}
This function is fairly straightforward, if quite esoteric. But the key point here is: what is the type of opt? When f is called at runtime, the type is determined using PTR; the peers have types @Type(.null) and u32, which peer resolve to ?u32, so opt has type ?u32. This makes the following if statement work as expected. However, if f is called at comptime, PTR is not used, because the not-taken branch of the first conditional is not evaluated. This is a very useful feature of Zig, and not one that should change; but it means that the type of opt is either @Type(.null) or u32, depending on the comptime-known value of b. In the u32 case (i.e. comptime f(false)), this causes the second if statement to emit a compile error, because u32 is not an optional type, so this construct is invalid! So, this code emits a compile error when called at comptime.
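As a rough sketch of how an annotation sidesteps this today: if opt is given an explicit type, that type no longer depends on which branch was evaluated, so the runtime and comptime calls should agree. The test names and the runtime_false local are made up for illustration.

```zig
const std = @import("std");

fn f(b: bool) u32 {
    // The annotation fixes the type of opt to ?u32 regardless of whether
    // one branch (comptime) or both branches (runtime) are analyzed.
    const opt: ?u32 = if (b) null else 123;
    if (opt) |x| return x;
    return 0;
}

test "annotated optional: runtime and comptime agree" {
    var runtime_false = false;
    _ = &runtime_false; // keep the value runtime-known
    try std.testing.expectEqual(@as(u32, 123), f(runtime_false));
    try std.testing.expectEqual(@as(u32, 123), comptime f(false));
    try std.testing.expectEqual(@as(u32, 0), comptime f(true));
}
```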
The second issue is that it can have unintuitive impacts on comptime-only types. The statement const x = if (b) 1 else 2 is invalid at runtime in Zig, because the resolved type of x is comptime_int, and its value depends on runtime control flow (the if expression), so we have a value of a comptime-only type depending on runtime control flow, which is disallowed. However, this error goes away if one of the peers has a concrete integer type -- for instance, const x = if (b) 1 else @as(u32, 2) works fine, because the first peer is coerced to u32 which can exist at runtime. This kind of "spooky action at a distance" can be confusing for new Zig users.
Lastly, it can hinder readability. Consider these definitions:
const x = if (b) "hi" else "world";
const y = if (b) "hello" else "world";
What are the types of x and y? If you said [:0]const u8, you're actually wrong; that's the type of x, but the peers of y are strings of the same length, so the types peer resolve to *const [5:0]u8. In this case, that's probably not a huge deal, but you can imagine it being more confusing when, say, integer widening is involved, or more significant pointer qualifiers like volatile. In these cases, it would be more clear to annotate the type of the variables. This can also help to clarify what properties of the type your code depends on; for instance, a user might annotate the type of x as []const u8, because the null terminator doesn't matter for their use case.
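To make that concrete, here is a small sketch comparing the inferred types against an annotated one. The helper pick is a hypothetical name, and the asserted types are the ones stated above for current PTR.

```zig
const std = @import("std");

// Hypothetical helper taking a runtime-known condition.
fn pick(b: bool) []const u8 {
    const x = if (b) "hi" else "world";
    const y = if (b) "hello" else "world";
    // Different lengths: the peers resolve to a sentinel-terminated slice.
    comptime std.debug.assert(@TypeOf(x) == [:0]const u8);
    // Same length: both peers already share one pointer-to-array type.
    comptime std.debug.assert(@TypeOf(y) == *const [5:0]u8);

    // With an annotation, the reader no longer has to work any of this out.
    const z: []const u8 = if (b) "hi" else "world";
    return z;
}

test "annotated string peers" {
    try std.testing.expectEqualStrings("hi", pick(true));
}
```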
Proposal
Remove Peer Type Resolution from the language. The features of Zig which utilize it are changed as follows:
- Expressions combining peers (e.g. if expressions, switch expressions, labeled blocks) emit a compile error unless all peers have the same type (excluding noreturn peers).
- Binary operators using PTR are unchanged; they can use a simple, specialized form of the algorithm.
- @TypeOf accepts only one operand, making this proposal a reversal of "@typeOf() should take multiple parameters and do peer type resolution" #439.
- switch on a tagged union requires all payloads captured by a single prong to have the same type, making this proposal a reversal of "when multiple union fields share a body in a switch prong, use a peer result type cast, rather than requiring identical types" #2812.
This change resolves all three of the issues described above:
- Since all peers must have the same type, it's fine that comptime only evaluates one; the expression will have the same type. This eliminates the need for "Proposal: Improve type inference of if/switch with comptime-known operand" #13025, which would be a complex language change to fix this issue. This would also solve "Catch null in if-condition cails at comptime." #5462, which is the same issue.
- It would no longer be possible for e.g. comptime_int to implicitly become a runtime type due to the type of a peer; if you intend for the result to be e.g. a u32, you would have to coerce all peers. More realistically, you would annotate the type outside of the expression; more on this in a second.
- The type of an expression would have to be the same across all peers, making it hopefully obvious. In cases where the types are not identical, you would use an explicit type annotation; again, more on this below.
This proposal also simplifies the language in general, which is a nice plus.
The effect of this proposal on user code would be to encourage more type annotations in places where types are non-obvious. This style of including type annotations where possible is something Zig has been moving towards in recent years:
- We have an "unofficial" preference for const x: T = .{ ... } over const x = T{ ... }.
- Decl literals encourage writing const x: T = .foo rather than const x = T.foo.
- The "new" (not that new anymore) casting builtins encourage explicit type annotations by sometimes requiring them around type casts.
The advantages of explicit type annotations are as follows:
- It increases readability for humans, since it becomes easier to know what types different expressions have; in particular, giving local variables type annotations can make the variables' uses easier to understand.
- It is useful to tooling acting on Zig source code; for instance, language servers or documentation tooling which is performing a "best effort" interpretation of code without full semantic analysis capabilities can know more types with certainty, just like how humans can.
- For container-level declarations, it increases the ability of the compiler to be parallelized, since the type can be determined while queuing value resolution for later.
- For container-level declarations, it will allow self-reference and mutual references; see ability for global variables to be initialized with address of each other #131.
With all of these in mind, it's pretty clear that type annotations are a Good Thing, and I tend to support features which encourage more of them (within reason). I think this proposal probably falls within that category.
Impact on Real Code
This proposal will almost certainly cause a lot of breakage in the wild, including in the standard library. As I see it, the main question will be whether the diffs required to fix these breakages make code more or less readable. I strongly suspect the answer is that code will become more readable. However, I think we will have to implement this in the compiler (which would be relatively straightforward) and take a look at some of what breaks in a large codebase, probably the standard library and the compiler itself.
EDITS BELOW
Clarification: Result Types
This proposal never affects semantics when an expression has a result type. For instance, this code still works:
const x: u32 = if (b) 123 else my_u16;
Here, even though the peers have types comptime_int and u16, the result type of u32 is propagated to these expressions and is applied before the values "exit" the conditional branch. This code working is actually a key motivation for this proposal: it encourages adding type annotations like this.
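A minimal sketch of this in current Zig, which I would expect to keep compiling under the proposal. The function names widenOrDefault and pickLiteral are made up for illustration.

```zig
const std = @import("std");

// Peers are comptime_int and u16, but the result type u32 is applied
// inside each branch, so no peer resolution is needed.
fn widenOrDefault(b: bool, my_u16: u16) u32 {
    const x: u32 = if (b) 123 else my_u16;
    return x;
}

// Both peers are comptime_int. Without the annotation this is the
// "comptime-only type depends on runtime control flow" error from earlier;
// with it, each literal is coerced to u32 before leaving its branch.
fn pickLiteral(b: bool) u32 {
    const x: u32 = if (b) 1 else 2;
    return x;
}

test "result types reach both branches" {
    try std.testing.expectEqual(@as(u32, 123), widenOrDefault(true, 7));
    try std.testing.expectEqual(@as(u32, 2), pickLiteral(false));
}
```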
Discussion: catch and orelse
Under this proposal as written, the following code would fail to compile:
const E = enum { a, b, c };
fn getE() ?E { ... }
test {
    const result = getE() orelse .c;
    _ = result;
}
That's because the orelse statement currently applies Peer Type Resolution to the types E and @Type(.enum_literal). Without PTR, these types would not match. The same applies to catch.
However, if this proposal is accepted, this code actually can work; not through PTR, but by providing a result type to the RHS. If we call ?T the type of the LHS after being evaluated, then the RHS can be evaluated with result type T; this is acceptable because under this proposal, it would need to have type T anyway for the peers to successfully combine. Again, the same thing applies to catch.
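For comparison, here is a sketch of the annotated form, which already works in current Zig and does not rely on PTR between the peers. The bodies of getE and parse are hypothetical stand-ins (the original getE is elided), and the have_value and ok parameters exist only so both paths can be exercised.

```zig
const std = @import("std");

const E = enum { a, b, c };

// Hypothetical stand-in for the elided getE in the text.
fn getE(have_value: bool) ?E {
    if (have_value) return .a;
    return null;
}

// Hypothetical error-union counterpart, to show the catch case.
fn parse(ok: bool) error{Bad}!E {
    if (ok) return .b;
    return error.Bad;
}

test "annotated orelse and catch" {
    // The annotation supplies the result type E, so .c needs no peer.
    const from_optional: E = getE(false) orelse .c;
    const from_error: E = parse(false) catch .c;
    try std.testing.expectEqual(E.c, from_optional);
    try std.testing.expectEqual(E.c, from_error);
}
```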
To be honest, I could see an argument that this isn't desirable, and that the above snippet should indeed require a type annotation on result. But it's a possibility nonetheless.
Discussion: Ranged Integers
One potential downside to this proposal is that it could make #3806 significantly more difficult to work with. For instance, consider this code:
const x: u8 = something;
const y = if (b) x else x + 1;
Under #3806 with PTR, y has type @Int(0, 257), since PTR is applied to the peer types @Int(0, 256) and @Int(1, 257). However, this proposal would cause this code to emit a compile error, because the peer types differ. That could be a big problem, since it could cancel out some of the benefits of implicit range expansion by requiring explicit type annotations.
Assuming this is indeed awkward in practice, I'm not sure if there's a good way to reconcile these two proposals. This gives way to a counter-proposal...
Counter-proposal: Restrict PTR to Numeric Types
Instead of eliminating PTR altogether, we could potentially just nerf it a lot. Here's what I would suggest:
- PTR of integers combines integer bounds (under allow integer types to be any range #3806)
- PTR of floats selects the largest float type, like today
- No other types peer resolve
This refocuses PTR to be about combining numeric types. This restriction still solves the problems discussed in the original issue, whilst avoiding conflicting with #3806:
- It wouldn't really matter that comptime evaluation only evaluates one peer: the only thing that could differ between runtime and comptime is an exact integer or float type. The former could not have any effect on semantics, aside from explicitly depending on @TypeOf(expr). The latter could affect floating-point precision/rounding, but it seems reasonable that if you need precise details of one floating-point type, you should be annotating it anywhere where it's unclear.
- Under ranged integers, comptime_int ceases to exist anyway, so this case where adding a runtime peer makes runtime evaluation work doesn't exist. It might still exist for floats if we allow comptime_float to peer resolve, which we probably should. This is a minor downside to this counter-proposal.
- One property of ranged integers is that exact types don't actually matter that much, so it wouldn't necessarily be an issue that this limited PTR can make those types non-obvious. Likewise, for floating-point types, it rarely matters too much which exact type is being used; where it does, it again seems reasonable to expect annotations anyway.
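As a small illustration of the float rule, here is how the "largest float type" behavior looks in current Zig; the counter-proposal would keep it. The ranged-integer rule cannot be shown the same way, since #3806 is not implemented. The helper pickFloat is a hypothetical name.

```zig
const std = @import("std");

// Hypothetical helper: the peers are f32 and f64, so PTR selects f64.
fn pickFloat(b: bool, small: f32, big: f64) f64 {
    const r = if (b) small else big;
    comptime std.debug.assert(@TypeOf(r) == f64);
    return r;
}

test "float peers resolve to the larger type" {
    try std.testing.expectEqual(@as(f64, 1.5), pickFloat(true, 1.5, 2.5));
}
```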