Skip to content

Conversation

@maleadt
Copy link
Member

@maleadt maleadt commented Feb 23, 2023

@maleadt
Copy link
Member Author

maleadt commented Feb 23, 2023

@vtjnash in addition to #48611 (comment), I have another question. I'm looking at updating CassetteOverlay.jl, but am observing the generator being called a second time with world=1 and Base.get_world_counter()=-1. Is world set to 1 when the generator is called at run time, and Base.get_world_counter()=-1 to avoid relying on functionality that defaults to the curren tworld?

@maleadt maleadt added the re-land This relands a PR that was previously merged but was later reverted. label Feb 23, 2023
@maleadt

This comment was marked as outdated.

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected.
A full report can be found here.

@maleadt maleadt marked this pull request as draft February 24, 2023 08:48
@maleadt maleadt force-pushed the tb/reland_generated_worlds branch from 49e7aec to 553a651 Compare February 24, 2023 14:57
@oscardssmith
Copy link
Member

One thing that you might want to check that package eval won't catch is Diffractor which isn't registered but is vulnerable to changes here.

vtjnash and others added 5 commits March 22, 2023 14:33
Expose the demanded world to the GeneratedFunctionStub caller, for users
such as Cassette. If this argument is used, the user must return a
CodeInfo with the min/max world field set correctly.

Make the internal representation a tiny bit more compact also, removing
a little bit of unnecessary metadata.

Remove support for returning `body isa CodeInfo` via this wrapper, since
it is impossible to return a correct object via the
GeneratedFunctionStub since it strips off the world argument, which is
required for it to do so. This also removes support for not inferring
these fully (expand_early=false).

Also answer method lookup queries about the future correctly, by
refusing to answer them. This helps keeps execution correct as methods
get added to the system asynchronously.

This reverts "fix #25678: return matters for generated functions
(#40778)" (commit 92c84bf), since this
is no longer sensible to return here anyways, so it is no longer
permitted or supported by this macro.

Fixes various issues where we failed to specify the correct world.
@vtjnash vtjnash added the merge me PR is reviewed. Merge when all tests are passing label Mar 22, 2023
@vtjnash vtjnash marked this pull request as ready for review March 22, 2023 18:36
@vtjnash vtjnash force-pushed the tb/reland_generated_worlds branch from 553a651 to 7add085 Compare March 22, 2023 18:36
@maleadt
Copy link
Member Author

maleadt commented Mar 24, 2023

As a proposal to make this PR less breaking, can't we expose the caller's world to generated functions by calling them in that world, i.e., to make Base.get_world_counter() return the caller's world instead of the current sentinel value of typemax(UInt)? That would (1) make existing code just work, as calling e.g. which from a generator (whose world argument defaults to Base.get_world_counter()) would work as expected, and (2) make it much easier to do the correct thing, without requiring a manual GeneratedFunctionStub.

@vtjnash
Copy link
Member

vtjnash commented Mar 24, 2023

Then the generator itself would be running in an unpredictable and unstable world, which is what the world fixing is intended to alleviate.

@vtjnash vtjnash merged commit dc3953d into master Mar 24, 2023
@vtjnash vtjnash deleted the tb/reland_generated_worlds branch March 24, 2023 13:56
@maleadt
Copy link
Member Author

maleadt commented Mar 24, 2023

Then maybe we could introduce a Base.get_target_world() that defaults to Base.get_world_counter() except for when we're running in a generator, and have reflection methods that take world arguments default to that? Just trying to make this PR less breaking, as I don't feel like combing through 200+ failures.

EDIT: ok, guess not.

@vtjnash
Copy link
Member

vtjnash commented Mar 24, 2023

I added a quick note about this to the commit message also, but note that the sentinel value is an expected value to see passed here in rare cases–even though, and indeed because, it is not legal to run in that world (per your fix in adc5bea):

Note that the passed world may be typemax(UInt), which demands that
the generator must return code that is unspecialized (guaranteed to run
correctly in any world).

@vtjnash
Copy link
Member

vtjnash commented Mar 24, 2023

The problem is the simple API is incorrect for those cases, since worlds are not simply about the input value, but about the requirement to preserve and sync the output metadata (min-world, max-world, and edges). The reflection APIs currently couple all of those together so that usually you have forced to provide some attempt at handling all of that if you are using any of it.

@DilumAluthge DilumAluthge removed the merge me PR is reviewed. Merge when all tests are passing label Mar 28, 2023
aviatesk added a commit to aviatesk/Revise.jl that referenced this pull request Mar 29, 2023
After JuliaLang/julia#48766, `typemax(UInt)` is no longer valid argument
for the `Base._methods_by_ftype` query.
timholy pushed a commit to timholy/Revise.jl that referenced this pull request Mar 29, 2023
After JuliaLang/julia#48766, `typemax(UInt)` is no longer valid argument
for the `Base._methods_by_ftype` query.
gbaraldi added a commit to gbaraldi/julia that referenced this pull request Mar 30, 2023
vtjnash added a commit to vtjnash/julia that referenced this pull request Mar 31, 2023
vtjnash added a commit to vtjnash/julia that referenced this pull request Mar 31, 2023
gbaraldi added a commit to gbaraldi/julia that referenced this pull request Apr 3, 2023
vtjnash added a commit that referenced this pull request Apr 5, 2023
Xnartharax pushed a commit to Xnartharax/julia that referenced this pull request Apr 19, 2023
Expose the demanded world to the GeneratedFunctionStub caller, for users
such as Cassette. If this argument is used, the user must return a
CodeInfo with the min/max world field set correctly.

Make the internal representation a tiny bit more compact also, removing
a little bit of unnecessary metadata.

Remove support for returning `body isa CodeInfo` via this wrapper, since
it is impossible to return a correct object via the
GeneratedFunctionStub since it strips off the world argument, which is
required for it to do so. This also removes support for not inferring
these fully (expand_early=false).

Also answer method lookup queries about the future correctly, by
refusing to answer them. This helps keeps execution correct as methods
get added to the system asynchronously.

This reverts "fix JuliaLang#25678: return matters for generated functions
(JuliaLang#40778)" (commit 92c84bf), since this
is no longer sensible to return here anyways, so it is no longer
permitted or supported by this macro.

Fixes various issues where we failed to specify the correct world.

Note that the passed world may be `typemax(UInt)`, which demands that
the generator must return code that is unspecialized (guaranteed to run
correctly in any world).

Co-authored-by: Jameson Nash <vtjnash@gmail.com>
Keno added a commit that referenced this pull request Jun 29, 2023
This fixes #49715. The fix itself is pretty simple - just remove
the generator expansion that was added in #48766, but the bigger
question here is what the correct behavior should be in the first place.

# Dynamic Semantics, generally

The primary question here are of the semantics of generated functions.
Note that this is quite different to how they are implemented. In
general, the way we think about compiling Julia is that there is a
well defined set of *dynamic semantics* that specify what a particular
piece of Julia code means. Julia's dynamic semantics are generally quite
simple (at every point, call the most specific applicable method).
What happens under the hood may be quite different (e.g. lots of inference,
compiling constant folding, etc), but the compilation process should
mostly preserve the semantics (with a few well defined exceptions
around floating point arithmetic, effect assumptions, semantically
unobservable side effects, etc.).

# The dnymaic semantics of generated functions

With that diatribe out of the way, let's think about the dynamic semantics
of generated functions. We haven't always been particularly clear about
this, but I propose it's basically the following:

For a generated function:
```
@generated function f(args...)
    # = generator body =#
end
```

this is semantically equivalent to the function to basically the following:

```
const lno = LineNumberNode(@__FILE__, @__LINE__); function f(args...)
    generator = @opaque @assume_effects :foldable :generator (args...)->#= generator body =#
    body = generator(Base.get_world_counter(), lno, Core.Typeof.(args))
    execute(body, f, args...)
end
```

A couple of notes on this:

1. `@opaque` used here for the world-age capture semantics of the generator itself
2. There's an effects-assumption `:generator` that doesn't exist but is
   supposed to capture the special allowance for calling generators. This
   is discussed more below.

## Implementing `execute`

For a long time, we didn't really have a first-class implementation of `execute`.
It's almost (some liberties around the way that the arguments work, but you get
the idea)

```
execute_eval(body, f, args...) = eval((args...)->$body)(f, args....)
```

but that doesn't have the correct world age semantics (would error
as written and even if you used invokelatest, the body would run
in the wrong world).

However, with OpaqueClosure we do actually have a mechanism now and
we could write:

```
execute(body, f, args...) = OpaqueClosure(body, f)(args...)
```

Again, I'm not proposing this as an implementation, just to give us an idea
of what the dynamic semantics of generated functions are.

# The particular bug (#49715)

The issue in #49715 is that the following happens:
1. A generated function gets called and inference is attempted.
2. Inference attempts to infer the generated function and call the generator.
3. The generator throws an error.
4. Inference fails.
5. The compiler enters a generic inference-failure fallback path
6. The compiler asks for a generator expansion in the generic world (-1)
7. This gives a different error, confusing the user.

There is the additional problem that this error gets thrown at
compilation time, which is not technically legal (and there was an
existing TODO to fix that).

In addition to that, I think there is a separate question of whether it
should be semantically legal to throw an error for a different world age
than the currently running one. Given the semantics proposed above, I
would suggest that the answer should be no. This does depend on the
exact semantics of :generator, but in general, our existing
effects-related notions do not allow particularly strong assumptions on
the particular error being thrown (requiring them to be re-evaluated
at runtime), and I see no reason to depart from this practice here.

Thus, I would suggest that the current behavior should be disallowed
and the expected behavior is that the generic fallback implementation
of generated functions invoke the generator in the runtime world and
expose the appropriate error.

# Should we keep the generic world?

That does leave the question what to do about the generic world (-1).
I'm not 100% convinced that this is necessarily a useful concept to
have. It is true that most generated functions do not depend on the
world age, but they can already indicate this by returning a value
with bounded world range and no backedges (equivalently returning
a plain expression). On the other hand, keeping the generic world
does risk creating the inverse of the situation that prompted this
issue, in that there is no semantically reachable path to calling
the generator with the generic world, making it hard to debug.

As a result, I am very strongly leaning towards removing this concept,
but I am open to being convinced otherwise.

# This PR

This PR, which is considerably shorter than this commit message is very
simple: The attempt to invoke the generator with the generic world -1
is removed. Instead, we fall back to the interpreter, which already
has the precise semantics that I want here - invoking the generator
in the dynamic world and interpreting the result.

# The semantics of :generator

That leaves one issue to be resolved which is the semantics of `:generator`.
I don't think it's necessary to be as precise here as we are about the
other effects we expose, but I propose it be something like the following:

For functions with the :generator effects assumption, :consistent-cy is
relaxed as follows:

1. The requistive notion of equality is relaxed to a "same code and
   metadata" equality of code instances. I don't think we have any
   predicate for this (and it's not necessarily computable), but the
   idea should be that the CodeInstance is always computed in the exact
   same way, but may be mutable and such. Note that this is explicitly
   not functional extensionality, because we do analyze the structure of
   the returned code and codegen based on it.

2. The world-age semantics of :consistent sharpened to require
   our relaxed notion of consistency for any overlapping min_world:max_world
   range returned from the generator.
vtjnash pushed a commit that referenced this pull request Jun 29, 2023
This fixes #49715. The fix itself is pretty simple - just remove
the generator expansion that was added in #48766, but the bigger
question here is what the correct behavior should be in the first place.

# Dynamic Semantics, generally

The primary question here are of the semantics of generated functions.
Note that this is quite different to how they are implemented. In
general, the way we think about compiling Julia is that there is a
well defined set of *dynamic semantics* that specify what a particular
piece of Julia code means. Julia's dynamic semantics are generally quite
simple (at every point, call the most specific applicable method).
What happens under the hood may be quite different (e.g. lots of inference,
compiling constant folding, etc), but the compilation process should
mostly preserve the semantics (with a few well defined exceptions
around floating point arithmetic, effect assumptions, semantically
unobservable side effects, etc.).

# The dnymaic semantics of generated functions

With that diatribe out of the way, let's think about the dynamic semantics
of generated functions. We haven't always been particularly clear about
this, but I propose it's basically the following:

For a generated function:
```
@generated function f(args...)
    # = generator body =#
end
```

this is semantically equivalent to the function to basically the following:

```
const lno = LineNumberNode(@__FILE__, @__LINE__); function f(args...)
    generator = @opaque @assume_effects :foldable :generator (args...)->#= generator body =#
    body = generator(Base.get_world_counter(), lno, Core.Typeof.(args))
    execute(body, f, args...)
end
```

A couple of notes on this:

1. `@opaque` used here for the world-age capture semantics of the generator itself
2. There's an effects-assumption `:generator` that doesn't exist but is
   supposed to capture the special allowance for calling generators. This
   is discussed more below.

## Implementing `execute`

For a long time, we didn't really have a first-class implementation of `execute`.
It's almost (some liberties around the way that the arguments work, but you get
the idea)

```
execute_eval(body, f, args...) = eval((args...)->$body)(f, args....)
```

but that doesn't have the correct world age semantics (would error
as written and even if you used invokelatest, the body would run
in the wrong world).

However, with OpaqueClosure we do actually have a mechanism now and
we could write:

```
execute(body, f, args...) = OpaqueClosure(body, f)(args...)
```

Again, I'm not proposing this as an implementation, just to give us an idea
of what the dynamic semantics of generated functions are.

# The particular bug (#49715)

The issue in #49715 is that the following happens:
1. A generated function gets called and inference is attempted.
2. Inference attempts to infer the generated function and call the generator.
3. The generator throws an error.
4. Inference fails.
5. The compiler enters a generic inference-failure fallback path
6. The compiler asks for a generator expansion in the generic world (-1)
7. This gives a different error, confusing the user.

There is the additional problem that this error gets thrown at
compilation time, which is not technically legal (and there was an
existing TODO to fix that).

In addition to that, I think there is a separate question of whether it
should be semantically legal to throw an error for a different world age
than the currently running one. Given the semantics proposed above, I
would suggest that the answer should be no. This does depend on the
exact semantics of :generator, but in general, our existing
effects-related notions do not allow particularly strong assumptions on
the particular error being thrown (requiring them to be re-evaluated
at runtime), and I see no reason to depart from this practice here.

Thus, I would suggest that the current behavior should be disallowed
and the expected behavior is that the generic fallback implementation
of generated functions invoke the generator in the runtime world and
expose the appropriate error.

# Should we keep the generic world?

That does leave the question what to do about the generic world (-1).
I'm not 100% convinced that this is necessarily a useful concept to
have. It is true that most generated functions do not depend on the
world age, but they can already indicate this by returning a value
with bounded world range and no backedges (equivalently returning
a plain expression). On the other hand, keeping the generic world
does risk creating the inverse of the situation that prompted this
issue, in that there is no semantically reachable path to calling
the generator with the generic world, making it hard to debug.

As a result, I am very strongly leaning towards removing this concept,
but I am open to being convinced otherwise.

# This PR

This PR, which is considerably shorter than this commit message is very
simple: The attempt to invoke the generator with the generic world -1
is removed. Instead, we fall back to the interpreter, which already
has the precise semantics that I want here - invoking the generator
in the dynamic world and interpreting the result.

# The semantics of :generator

That leaves one issue to be resolved which is the semantics of `:generator`.
I don't think it's necessary to be as precise here as we are about the
other effects we expose, but I propose it be something like the following:

For functions with the :generator effects assumption, :consistent-cy is
relaxed as follows:

1. The requistive notion of equality is relaxed to a "same code and
   metadata" equality of code instances. I don't think we have any
   predicate for this (and it's not necessarily computable), but the
   idea should be that the CodeInstance is always computed in the exact
   same way, but may be mutable and such. Note that this is explicitly
   not functional extensionality, because we do analyze the structure of
   the returned code and codegen based on it.

2. The world-age semantics of :consistent sharpened to require
   our relaxed notion of consistency for any overlapping min_world:max_world
   range returned from the generator.

Co-authored-by: Oscar Smith <oscardssmith@gmail.com>
oscardssmith added a commit to oscardssmith/Cassette.jl that referenced this pull request Aug 7, 2023
JuliaLang/julia#48766 makes it so this function takes in a world age.
vtjnash added a commit that referenced this pull request Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

re-land This relands a PR that was previously merged but was later reverted.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants