Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sema: rewrite comptime arithmetic #23177

Merged
merged 1 commit into from
Mar 16, 2025
Merged

Conversation

mlugg
Copy link
Member

@mlugg mlugg commented Mar 10, 2025

Hey @kristoff-it, this is what that "simple little one-line fix" evolved into! Isn't that fun?


This commit reworks how Sema handles arithmetic on comptime-known values, fixing many bugs in the process.

The general pattern is that arithmetic on comptime-known values is now handled by the new namespace Sema.arith. Functions handling comptime arithmetic no longer live on Value; this is because some of them can emit compile errors, so some can't go on Value. Only semantic analysis should really be doing arithmetic on Values anyway, so it makes sense for it to integrate more tightly with Sema.

This commit also implements more coherent rules surrounding how undefined interacts with comptime and mixed-comptime-runtime arithmetic. The rules are as follows.

  • If an operation cannot trigger Illegal Behavior, and any operand is undefined, the result is undefined. This includes operations like 0 *| undef, where the LHS logically could be used to determine a defined result. This is partly to simplify the language, but mostly to permit codegen backends to represent undefined values as completely invalid states.

  • If an operation can trigger Illegal Behvaior, and any operand is undefined, then Illegal Behavior results. This occurs even if the operand in question isn't the one that "decides" illegal behavior; for instance, undef / 1 triggers IB. This is for the same reasons as described above.

  • An operation which would trigger Illegal Behavior, when evaluated at comptime, instead triggers a compile error. Additionally, if one operand is comptime-known undef, such that the other (runtime-known) operand isn't needed to determine that Illegal Behavior would occur, the compile error is triggered.

  • The only situation in which an operation with one comptime-known operand has a comptime-known result is if that operand is undefined, in which case the result is either undefined or a compile error per the above rules. This could potentially be loosened in future (for instance, 0 * rt could be comptime-known 0 with a runtime assertion that rt is not undefined), but at least for now, defining it more conservatively simplifies the language and allows us to easily change this in future if desired.

This commit fixes many bugs regarding the handling of undefined, particularly in vectors. Along with a collection of smaller tests, two very large test cases are added to check arithmetic on undefined.

The operations which have been rewritten in this PR are:

  • +, +%, +|, @addWithOverflow
  • -, -%, -|, @subWithOverflow
  • *, *%, *|, @mulWithOverflow
  • /, @divFloor, @divTrunc, @divExact
  • %, @rem, @mod

Other arithmetic operations are currently unchanged.

Resolves: #22743
Resolves: #22745
Resolves: #22748
Resolves: #22749
Resolves: #22914

@mlugg mlugg added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. release notes This PR should be mentioned in the release notes. labels Mar 10, 2025
@mlugg mlugg force-pushed the comptime-arith-rewrite branch 3 times, most recently from 63d6140 to e5f0c5d Compare March 11, 2025 21:36
Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could potentially be loosened in future (for instance, 0 * rt could be comptime-known 0 with a runtime assertion that rt is not undefined),

I suspect this will be a desirable change in the future, but, let's proceed with this at least for now.

Thanks for doing a thorough investigation and doing all this work to make the semantics consistent.

This commit reworks how Sema handles arithmetic on comptime-known
values, fixing many bugs in the process.

The general pattern is that arithmetic on comptime-known values is now
handled by the new namespace `Sema.arith`. Functions handling comptime
arithmetic no longer live on `Value`; this is because some of them can
emit compile errors, so some *can't* go on `Value`. Only semantic
analysis should really be doing arithmetic on `Value`s anyway, so it
makes sense for it to integrate more tightly with `Sema`.

This commit also implements more coherent rules surrounding how
`undefined` interacts with comptime and mixed-comptime-runtime
arithmetic. The rules are as follows.

* If an operation cannot trigger Illegal Behavior, and any operand is
  `undefined`, the result is `undefined`. This includes operations like
  `0 *| undef`, where the LHS logically *could* be used to determine a
  defined result. This is partly to simplify the language, but mostly to
  permit codegen backends to represent `undefined` values as completely
  invalid states.

* If an operation *can* trigger Illegal Behvaior, and any operand is
  `undefined`, then Illegal Behavior results. This occurs even if the
  operand in question isn't the one that "decides" illegal behavior; for
  instance, `undef / 1` is undefined. This is for the same reasons as
  described above.

* An operation which would trigger Illegal Behavior, when evaluated at
  comptime, instead triggers a compile error. Additionally, if one
  operand is comptime-known undef, such that the other (runtime-known)
  operand isn't needed to determine that Illegal Behavior would occur,
  the compile error is triggered.

* The only situation in which an operation with one comptime-known
  operand has a comptime-known result is if that operand is undefined,
  in which case the result is either undefined or a compile error per
  the above rules. This could potentially be loosened in future (for
  instance, `0 * rt` could be comptime-known 0 with a runtime assertion
  that `rt` is not undefined), but at least for now, defining it more
  conservatively simplifies the language and allows us to easily change
  this in future if desired.

This commit fixes many bugs regarding the handling of `undefined`,
particularly in vectors. Along with a collection of smaller tests, two
very large test cases are added to check arithmetic on `undefined`.

The operations which have been rewritten in this PR are:

* `+`, `+%`, `+|`, `@addWithOverflow`
* `-`, `-%`, `-|`, `@subWithOverflow`
* `*`, `*%`, `*|`, `@mulWithOverflow`
* `/`, `@divFloor`, `@divTrunc`, `@divExact`
* `%`, `@rem`, `@mod`

Other arithmetic operations are currently unchanged.

Resolves: ziglang#22743
Resolves: ziglang#22745
Resolves: ziglang#22748
Resolves: ziglang#22749
Resolves: ziglang#22914
@mlugg mlugg force-pushed the comptime-arith-rewrite branch from e5f0c5d to 42ea2f1 Compare March 16, 2025 03:03
@mlugg
Copy link
Member Author

mlugg commented Mar 16, 2025

Latest push was just to fix conflict; CI should pass.

@mlugg mlugg enabled auto-merge (rebase) March 16, 2025 03:03
@mlugg mlugg merged commit 2a4e06b into ziglang:master Mar 16, 2025
9 checks passed
) CompileError!Value {
const zcu = sema.pt.zcu;
if (lhs.isUndef(zcu)) return lhs;
if (lhs.isUndef(zcu)) return rhs;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is those supposed to be rhs on second line? @mlugg

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes it is, thank you. I'll open a follow-up soon to fix that typo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. release notes This PR should be mentioned in the release notes.
Projects
None yet
4 participants