-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
[JuliaLowering] Refactor scope resolution pass #60316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
912ff3c to
47eb8e9
Compare
topolarity
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need to do another once-over of the core pass logic, but this should be useful (mostly very minor) feedback in the mean time.
|
|
||
| # Compute fields for a closure type, one field for each captured variable. | ||
| function closure_type_fields(ctx, srcref, closure_binds, is_opaque) | ||
| capture_ids = Vector{IdTag}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another micro-optimization, but Vector{IdTag} should probably be BitSet
(since these are small and contiguous)
| # Every lexical scope, indexed by ScopeId | ||
| scopes::Vector{ScopeInfo} | ||
| # Current stack of scopes to look for names in, innermost scope last | ||
| scope_stack::Vector{ScopeId} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this identical to the sequence of scopes you get if you follow the parent(ctx, scope) chain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(discussed in meeting) I'm mirroring our handling of macro scope layers but agree doing parent would be cleaner.
topolarity
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the right approach (and indeed closer to flisp) to update shared bindings across all lambdas simultaneously.
Changes are looking good to me. Nice work @mlechu!
Feel free to take or drop the nits / ideas, as desired.
"get" is clearer to me that the binding is the output rather than the input,
that it definitely exists, and that the ID resolves uniquely
I know it's a goal to make some of these clearer, but using the same string as
Expr (and using the fall-through case in conversion) will make some planned
code movement easier
47eb8e9 to
64c5235
Compare
Co-authored-by: Cody Tapscott <topolarity@tapscott.me>
64c5235 to
903568a
Compare
| @testset "sparam,local" begin | ||
| s = "function (a::s) where {s}; local s; end" | ||
| @test_throws LoweringError JuliaLowering.include_string(test_mod, s) | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we already test these cases more comprehensively in scopes_ir.jl (it's more comprehensive because we also test the error messages there)
So I think we can remove these. Generally my rule of thumb has been to avoid @test_throws LoweringError because the IR tests give a convenient way of asserting that the error message has the correct provenance.
Side note - I think it would be nice to bring our IR tests and eval() tests closer together with a bit of test infrastructure refactoring at some point. The aim being twofold:
- Reduce confusion and duplication in cases like this
- Reduce duplication for cases where we want to both test the structure of the IR and the value that it
eval()'s to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would also help if we simply added a bit more structure to the IR tests. Eg, allow structuring the IR test files so that we can group the content into test sets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think we can remove these.
It feels like more effort to selectively delete cases from an "every combination" testset than to just have the full matrix there.
Side note - I think it would be nice to bring our IR tests and eval() tests closer together with a bit of test infrastructure refactoring at some point
The split we have now is nice in that IR tests don't need to run (no setting up globals) and eval() tests don't need to be limited to readable amounts of IR. My main gripe with our tests isn't duplication, it's that we need to write more of them :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main gripe with our tests isn't duplication, it's that we need to write more of them :)
Haha! It's the bane of any new project when switching from "this is going fast and everything is changing all the time" to "oh this about to be production code".
Having done several big new development projects, there's always this transition period, and it's critical to the longer term success. So I really appreciate your efforts - and likewise I'll try to do my best to beef up tests whenever I see any missing.
If I find a way to make the tests "more fun" to write I'll also put in the effort there. I am happy that we have IR tests at all - in contrast to existing lowering. They're still a bit clunky to work with - perhaps we may yet find a way to improve that.
|
|
||
| # globals may overlap args or sparams (buggy?) TODO: decide whether it's | ||
| # worth replicating this behaviour. We would likely need to copy the way | ||
| # flisp nests an extra scope block in every lambda. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eek. I can't recall right now whether I intentionally tried to depart from the weird flisp behavior or not. seems potentially breaking to be different (even though it's consistent) - may need to be changed via syntax evolution I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eek indeed. I'd be willing to call it a julia bug given how strange it would be to call it part of the language, but we can see what pkgeval says. I've seen someone ask if decl keywords do something to an existing variable, which is confusion to prevent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable.
This distinction can be important as soon as we start work on unboxing captures, so yes I did this on purpose and at first glance I don't like removing this infrastructure - even though it's clearly incomplete at this stage.
Can you point me to examples of the inaccuracies? Obviously if they affect semantics of evaluation this is a problem to fix! But if they're internal differences you should expect that we'll be diverging a lot from flisp once we properly start on unboxing. |
|
As another more concrete example of why This is metadata which applies to how argument slots are used within the body of the function f(g)
function h()
g() # should not set `is_called` on `g`'s slot flag in `f`
end
endYou're certainly right to notice a disconnect with flisp - because I didn't actually port the flisp implementation of |
I don't know if I agree with that - Calling |
I'll elaborate on my claim "we need something more complex anyway". Almost every lambda-vinfo-flags-based unboxing proof I can think of requires other machinery complex enough to make the flags unnecessary. Right now (ignoring flisp's The one case the flags give us anything new is when the inner lambda is never called, but proving this in anything other than a trivial case requires basic-block-level or maybe even statement-level analysis. But with basic-block-level analysis, why are we tracking one "assume everything executes" flag per inner lambda in the first place? |
Hmm. I was convinced it would be the job of the optimizer to deal with inlining the implementation of struct _h_type{G}
g::G
end
function (h::_h_type)()
h.g()
end
function f(g)
h = _h_type(g)
h() # lowering cannot know that g is called
endBut perhaps I'm mistaken and this is a case where the compiler can use the fact that We also have the following example where the heuristic guesses wrong because the closure isn't called. But I'll concede it's a less likely case and it's high probability that any closure will be called during the lifetime of the enclosing function. function f(g)
function h()
g()
end
return h
end |
|
Basically - I admit I thought this was an outright bug which I was fixing but it's not actually simple 😅 |
I agree that information global to the body of a lambda isn't very powerful. It can help, but the cases it helps with are rather trivial - and I don't think very useful. We need some simple data flow analysis to do unboxig as best we can within lowering. And "as best we can" is not very powerful because our escape analysis will be extremely pessimistic without information from the optimizer. Ok, so I'm a bit happier with deleting this stuff for now, assuming that we may need to completely refactor it in the next step anyway. Would still like to know whether the |
|
I'm going to merge this as-is I am a fan of @c42f 's ambitions to improve the specialization heuristics / tighten up the flags here. Nonetheless, I think we'll want to save that for "Phase 2" of the JuliaLowering initiative, after we have already gotten parity with flisp 👍 This is definitely an improvement in terms of bugs / parity, so let's get it in! |
I think that's pretty fair in this case - even if it's a bug (I'm now unsure about this) - keeping consistency with flisp is reasonable because changing specialization could have subtle side effects and need its own review. Side note - there's several other probable-bugs which I came up against in working through the flisp code for the lowering port as a whole. I mostly chose to fix them when they seemed obviously buggy so we'll be able to close a few existing issues once the integration work is done. Obviously we could reimplement the bugs but that's a bit depressing so hopefully we'll be able to get away with just leaving them fixed 😅 |
Fixes
Soft scope
We currently treat flisp's
'(block (softscope true))as a scope that "issoft", but this isn't correct. It should behave more like a toggle, where if the
scope surrounding
(softscope true)is the top-level thunk, neutral scopes notprotected by a hard scope become permeable to already-defined globals. I've
added
K"softscope"for this.Fixes JuliaLang/JuliaLowering.jl#101.
JuliaLowering.activate!()should be much more usable in the REPL now, asglobals won't be accidentally eaten by the "soft scope."
Shadowing behaviour
Found while cleaning up the lhs-resolution step with the change above. This PR
allows static parameters to be shadowed by globals and locals as long as they're
explicit (and not in the same scope). flisp allows this with globals, and the
explicit locals that desugar to
local-defforms (which JuliaLowering doesn'thave).
This change is more permissive than flisp in the local case, since after looking
into why shadowing was disallowed I realized it was just was just to prevent
assignment to the static parameter (#32623). The flisp fix leads to some funny
behaviour:
Bindings/LambdaBindings
In figuring out how to use the variable bookkeeping system, I ran into inaccuracies.
LambdaBindingsis currently a per-lambda map from unique variable to fourflags:
captured,read,assigned, andcalled. I think (but correct me ifI'm wrong @c42f) this was a misinterpretation of the holy text: in flisp
lowering, a local variable captured by some other lambda does show up in both
lambdas' variable lists, but is the same underlying object, and flag mutations
on one variable are seen by all lambdas.
I tried to think of other reasons for tracking vinfo per lambda within lowering,
but if we're doing something about the capture-boxing issue, we need something
more complex anyway.
This PR moves all vinfo to
BindingInfoand deletes the incorrect bookkeepingin
LambdaBindings. We still need to have a per-lambda flag for capturedness(different from the variable-level
captvinfo flag). With the added flags,I've just made BindingInfo mutable since our previous workflow (BindingId is an
index into a vector; mutate the vector) doesn't give us the benefits of
immutability anyway.
Enhancements
answer questions like "what names are available at my cursor?" Recreating
this previously-discarded information was a bit hacky! [1] [2]
TODO
K"local"in desugaring is dubious in some placesexpr_compat_modeto the scope analysis context, but we still needto implement flisp hygiene exemptions for globals (see note in test/scopes.jl).
#self#argument still has extra flags set insome cases. This is an existing desugaring bug with a comment that took me
too long to find: we shouldnn't be using the same
#self#binding formultiple methods defined by one function body