introduce a compile error for when a result location provides integer widening ambiguity #16310

Open
@lerno

This is a different proposal from #7967. The idea is to give Zig integer promotion semantics that are safe from spooky action at a distance.

The problem

Given some code:

var a: i32 = ...;
var b: i32 = ...;
// Lots of code in between
var c: i32 = a + b; // Overflow exceeding i32

We can break the last line by changing the type of a and b:

var a: i8 = ...;
var b: i8 = ...;
// Lots of code in between
var c: i32 = a + b; // Overflow exceeding i8

Note that a and b do not need to be variables; they can also be things like fields in a struct, possibly an anonymous one resulting from a call.

The lack of safety comes from the binary expression: var c: i32 = a; on its own is completely safe. The problem arises only when there is an operation in which the subexpression's type matters. Stated differently, the problem arises whenever the widening may occur in more than one semantically different way. In this case, we either (1) widen a and b to i32 and then add, or (2) add a and b first and then widen the result to i32. This ambiguity is what causes the "spooky action at a distance" when changing a and b.

While increasing the width of a and b to, for example, i64 would cause a compilation error, a narrowing to i8 as in the example is impossible for the compiler to detect, as both possibilities may be desired.
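To make the two readings concrete, here is a minimal sketch in present-day Zig (assuming a reasonably recent compiler, 0.11 or later). The function names widenThenAdd and addThenWiden are hypothetical and only serve the illustration. The second function uses std.math.add so that the i8 overflow shows up as null rather than as a safety panic.

```zig
const std = @import("std");

// Reading (1): widen both operands to i32 first, then add.
fn widenThenAdd(a: i8, b: i8) i32 {
    return @as(i32, a) + @as(i32, b);
}

// Reading (2): add in i8 first, then widen the result.
// Returns null when the i8 addition overflows.
fn addThenWiden(a: i8, b: i8) ?i32 {
    const sum = std.math.add(i8, a, b) catch return null;
    return @as(i32, sum);
}

test "the two readings disagree once the narrow add overflows" {
    try std.testing.expectEqual(@as(i32, 200), widenThenAdd(100, 100));
    try std.testing.expectEqual(@as(?i32, null), addThenWiden(100, 100));
}
```

For operands whose sum fits in i8 the two functions agree, which is why the difference goes unnoticed until the operand types are narrowed somewhere else in the code.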

The proposed solution

  1. First we define "simple" vs "non-simple" expressions. A simple expression is one that can only be widened in one way: a constant, a variable, a call, a dereference, taking the address of a value. A "non-simple" expression is typically any binary expression, negation, bit negation, ternary etc. The full list is fairly easy to work out.
  2. When encountering a situation where widening should occur, check whether the (constant-folded) expression is simple or non-simple. The latter is a compile-time error that can be fixed with a cast.
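The two rules above could be sketched roughly as follows. This is not compiler code: ExprKind, isSimple, checkWidening and the AmbiguousWidening error are hypothetical names invented for the illustration, and the list of kinds only covers the ones named in the proposal.

```zig
const std = @import("std");

// Hypothetical expression kinds, taken from the lists in rule 1.
const ExprKind = enum {
    constant,
    variable,
    call,
    dereference,
    address_of,
    binary,
    negation,
    bit_negation,
    ternary,
};

// Rule 1: a simple expression can only be widened in one way.
fn isSimple(kind: ExprKind) bool {
    return switch (kind) {
        .constant, .variable, .call, .dereference, .address_of => true,
        .binary, .negation, .bit_negation, .ternary => false,
    };
}

const WidenError = error{AmbiguousWidening};

// Rule 2: widening a non-simple expression is rejected.
fn checkWidening(kind: ExprKind, from_bits: u16, to_bits: u16) WidenError!void {
    if (to_bits > from_bits and !isSimple(kind)) return error.AmbiguousWidening;
}

test "only simple expressions may be widened" {
    try checkWidening(.variable, 8, 32);
    try std.testing.expectError(error.AmbiguousWidening, checkWidening(.binary, 8, 32));
}
```

A cast resolves the error because it makes the author's choice explicit: either the operands are cast before the operation, or the whole result is cast afterwards.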

Other solutions

1. Last step widening

This is the status quo, corresponding to always doing the widening at the last step, when it's needed. In this case, it would mean a + b is done first, then the widening. This is what C does as well, but there is a big difference: C first does widening to int/unsigned, which makes this only dangerous if the type widened to is wider than int, i.e. size_t, long, long long etc. Because this is less common, it's a less common problem. Nevertheless, this is a source of security vulnerabilities and bugs.

2. Push down type widening

In this solution, the widening type is "pushed down" into the subexpressions. In our case above, this would mean a and b are first widened to i32 and then added. While this solves the issue at hand, it also subtly changes the meaning of an expression whenever the left-hand side is widened. For example, say we want c = a << b and we deliberately picked a to be u16. We must then explicitly opt out of the widening to c's type. Pushing down the type is also not 100% trivial.
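The shift case can be shown in present-day Zig. The sketch below assumes Zig 0.11 or later; shiftNarrow and shiftWide are hypothetical names. The first function is what the status quo computes for c = a << b, and the second is what push-down widening would silently turn it into.

```zig
const std = @import("std");

// Status quo: the shift happens in u16, so bits shifted past bit 15 are
// discarded, and only then is the result widened to u32.
fn shiftNarrow(a: u16, b: u4) u32 {
    return a << b;
}

// What push-down widening would compute: a is widened to u32 first,
// so the high bits survive the shift.
fn shiftWide(a: u16, b: u4) u32 {
    return @as(u32, a) << @as(u5, b);
}

test "widening before or after the shift gives different results" {
    try std.testing.expectEqual(@as(u32, 0xF000), shiftNarrow(0xFF00, 4));
    try std.testing.expectEqual(@as(u32, 0xFF000), shiftWide(0xFF00, 4));
}
```

Code that relies on the u16 truncation would change behavior under push-down widening without any change to the line itself.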

C3 tested this type of widening and the killer problem was how to understand the actual type of sub expressions and whether they would be top down widened or not. It was not retained.

Summary

Disallowing implicit widening of sub expressions seems like a simple change which prevents vulnerabilities in the current Zig integer promotion semantics that cannot be detected by the compiler.

Example

c = a + b * (d + e);
  1. We start by checking d and e. If they do not have the same size, we widen the narrower one. As d and e are both simple expressions, this is always allowed.
  2. We then check b and (d + e). If b needs to be widened, this is allowed, but if b is wider than (d + e), this is an error.
  3. We then check a and b * (d + e). If a needs to be widened, this is allowed, but if a is wider than b * (d + e), this is an error.
  4. We finally check c and a + b * (d + e). Since a + b * (d + e) is non-simple, we don't allow any widening of it. Nor may it be wider than c, since Zig does not allow implicit narrowing.
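The four steps can be modeled with a small checker over bit widths. This is only a sketch under simplifying assumptions: every leaf is a simple expression described by its width alone, signedness is ignored, and Expr, resultWidth, checkAssign and the error names are all hypothetical.

```zig
const std = @import("std");

// A leaf is any simple expression, described only by its bit width.
const Expr = union(enum) {
    leaf: u16,
    binary: struct { lhs: *const Expr, rhs: *const Expr },
};

const CheckError = error{ AmbiguousWidening, ImplicitNarrowing };

fn isSimple(e: *const Expr) bool {
    return switch (e.*) {
        .leaf => true,
        .binary => false,
    };
}

// Steps 1 to 3: resolve operand widths bottom-up. The narrower operand
// may be widened only if it is a simple expression.
fn resultWidth(e: *const Expr) CheckError!u16 {
    switch (e.*) {
        .leaf => |bits| return bits,
        .binary => |bin| {
            const l = try resultWidth(bin.lhs);
            const r = try resultWidth(bin.rhs);
            if (l == r) return l;
            const narrower = if (l < r) bin.lhs else bin.rhs;
            if (!isSimple(narrower)) return error.AmbiguousWidening;
            return if (l > r) l else r;
        },
    }
}

// Step 4: the whole right-hand side against the target width.
fn checkAssign(target_bits: u16, e: *const Expr) CheckError!void {
    const bits = try resultWidth(e);
    if (bits > target_bits) return error.ImplicitNarrowing;
    if (bits < target_bits and !isSimple(e)) return error.AmbiguousWidening;
}

test "c: i32 = a + b * (d + e) with i8 operands is rejected" {
    const a = Expr{ .leaf = 8 };
    const b = Expr{ .leaf = 8 };
    const d = Expr{ .leaf = 8 };
    const e = Expr{ .leaf = 8 };
    const d_plus_e = Expr{ .binary = .{ .lhs = &d, .rhs = &e } };
    const product = Expr{ .binary = .{ .lhs = &b, .rhs = &d_plus_e } };
    const sum = Expr{ .binary = .{ .lhs = &a, .rhs = &product } };
    try std.testing.expectError(error.AmbiguousWidening, checkAssign(32, &sum));
    try checkAssign(8, &sum); // same width as c: nothing to widen
}
```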

Example with constant folding

c = a + b * (1 + 5);
  1. First we constant fold 1 + 5 to 6.
  2. b * 6 is resolved by typing 6 to the type of b.
  3. a is compared to the type of b * 6 (which is the type of b). If a is wider, it is an error (because b * 6 is not simple); if a is narrower, a is widened.
  4. c is checked against the type of a + b * 6 (i.e. do c and b have the same type?). If the types differ, it is an error.
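For comparison, this is how the same expression behaves in present-day Zig (assuming 0.11 or later). The function names are hypothetical. statusQuo compiles today but would be rejected under the proposal, because a is wider than the non-simple b * 6. withCast shows one way the cast fix could look, and narrowA is the case that step 3 allows.

```zig
const std = @import("std");

// Compiles today: 1 + 5 folds to 6, which takes b's type, so b * 6 is an
// i16 multiply whose result is widened to i32 afterwards. The multiply
// overflows i16 for large b. Under the proposal this would be an error.
fn statusQuo(a: i32, b: i16) i32 {
    return a + b * (1 + 5);
}

// The explicit cast picks "widen first": the multiply happens in i32.
fn withCast(a: i32, b: i16) i32 {
    return a + @as(i32, b) * (1 + 5);
}

// Allowed by step 3: a is simple and narrower than b * 6, so a is widened.
fn narrowA(a: i8, b: i16) i16 {
    return a + b * (1 + 5);
}

test "constant folding example" {
    try std.testing.expectEqual(@as(i32, 43), statusQuo(1, 7));
    try std.testing.expectEqual(@as(i32, 60001), withCast(1, 10000));
    try std.testing.expectEqual(@as(i16, 43), narrowA(1, 7));
}
```

Calling statusQuo(1, 10000) would overflow the i16 multiply, which is the kind of silent dependency on operand types the proposal aims to turn into a compile error.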

Example with simple expressions

c = b;
  1. If b is narrower than c, widen it. If c is narrower than b, this is an error.
