add mechanism for spoofing inference work-limiting heuristics #24852
Conversation
CI timeout seems related. Here is the backtrace from my worker: https://gist.github.com/iblis17/baf65932991124be3d9aff8760a747d2
Force-pushed from 730d046 to e4e9ea7
```diff
@@ -362,7 +362,8 @@
       'nothing
       (cons 'list (map car sparams)))
     ,(if (null? loc) 0 (cadr loc))
-    (inert ,(if (null? loc) 'none (caddr loc))))))))
+    (inert ,(if (null? loc) 'none (caddr loc)))
+    false)))))
```
Okay, I've now added a `Bool` flag to `GeneratedFunctionStub` that inference can check to decide when to expand the generator. Just to get started, I'm hardcoding the flag to `false`, but now I need to add a mechanism for actually setting it.

@JeffBezanson What would be the most reasonable way to add this setting mechanism? On the front end, I could add an extra argument to the `@generated` macro, or add a `@generated_early` macro, etc. I'm also wondering what the easiest/cleanest way is to propagate that setting to here. I guess I could add a meta node to the expression returned from the `@generated` macro and then destructure it here?

...I'm a noob when it comes to Julia's parser 😛
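For concreteness, a minimal sketch of the meta-node idea (the `:generated_early` marker symbol and the detection logic below are hypothetical illustrations, not part of this PR):

```julia
# Hypothetical illustration: the macro tags the expression it returns with a
# marker meta node, which lowering could later detect (and strip) in order to
# set the stub's flag.
body = quote
    $(Expr(:meta, :generated_early))   # hypothetical marker
    Expr(:tuple, :(x[1] * x[1]))       # stand-in for a real generator body
end

# Lowering-side detection (sketch): scan the block's top-level expressions.
has_marker = any(ex -> ex isa Expr && ex.head === :meta &&
                       :generated_early in ex.args, body.args)
```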
Status update: this still has a bunch of loose odds and ends (mainly the flag-setting mechanism I asked Jeff about above). At this point, though, I'm waiting on some intermediate review to make sure I'm not on the wrong track.
@vtjnash recommended offline that the flag-setting strategy should be for downstream users to manually construct and interpolate the `GeneratedFunctionStub` themselves.
I'm having trouble actually employing the whole "just interpolate the stub" approach suggested by @vtjnash. Take the following example:

```julia
@generated function f(x::NTuple{N,Int}) where {N}
    return Expr(:tuple, (:(x[$i] * x[$j]) for i in 1:N, j in 1:N)...)
end
```

This essentially expands to

```julia
function f(x::NTuple{N, Int}) where N
    if $(Expr(:generated))
        return Expr(:tuple, ((Core._expr)(:call, :*, (Core._expr)(:ref, :x, i), (Core._expr)(:ref, :x, j)) for i = 1:N, j = 1:N)...)
    else
        $(Expr(:meta, :generated_only))
        return nothing
    end
end
```

The parser will then pick up on the […]

I want to emit the equivalent thing as the above example, but manually construct the `GeneratedFunctionStub`:

```julia
function _f(x::Type{NTuple{N,Int}}) where {N}
    return Expr(:tuple, (:(x[$i] * x[$j]) for i in 1:N, j in 1:N)...)
end

@eval function f(x::NTuple{N,Int}) where {N}
    $(Expr(:meta, :generated, Core.GeneratedFunctionStub(_f, Any[:f, :x], Any[:N], @__LINE__, Symbol(@__FILE__), false)))
end
```

...which spits out […], so I'm assuming my method body for […]
The following seems to work (without the additional bool you've added, of course):

```julia
function _f(sparam_N, fun, x::Type{NTuple{N,Int}}) where {N}
    return Expr(:tuple, (:(x[$i] * x[$j]) for i in 1:N, j in 1:N)...)
end

@eval function f(x::NTuple{N,Int}) where {N}
    $(Expr(:meta, :generated, Expr(:new, Core.GeneratedFunctionStub, :_f, Any[:f, :x], Any[:N], @__LINE__, QuoteNode(Symbol(@__FILE__)))))
end
```
So I added some tests, most of which pass locally. The tests that actually inflate `method_for_inference_heuristics`, however, segfault:

```julia
julia> f(i) = i
f (generic function with 1 method)

julia> m = methods(f, (Int,)).ms[1]
f(i) in Main at REPL[1]:1

julia> c = @code_lowered f(1)
CodeInfo(:(begin
        nothing
        return i
    end))

julia> ccall(:jl_compress_ast, Any, (Any, Any), m, c); # works fine if `method_for_inference_heuristics` is uninflated

julia> c.method_for_inference_heuristics = m
f(i) in Main at REPL[1]:1

julia> ccall(:jl_compress_ast, Any, (Any, Any), m, c)

signal (11): Segmentation fault: 11
in expression starting at no file:0
module_in_worklist at /Users/jarrettrevels/data/repos/julia-dev/src/dump.c:206 [inlined]
jl_serialize_value_ at /Users/jarrettrevels/data/repos/julia-dev/src/dump.c:638
jl_compress_ast at /Users/jarrettrevels/data/repos/julia-dev/src/dump.c:2160
top-level scope at ./<missing> (unknown line)
jl_call_fptr_internal at /Users/jarrettrevels/data/repos/julia-dev/src/./julia_internal.h:380 [inlined]
jl_call_method_internal at /Users/jarrettrevels/data/repos/julia-dev/src/./julia_internal.h:399 [inlined]
jl_toplevel_eval_flex at /Users/jarrettrevels/data/repos/julia-dev/src/toplevel.c:721
jl_toplevel_eval_in at /Users/jarrettrevels/data/repos/julia-dev/src/builtins.c:626
eval at ./repl/REPL.jl:3
jl_call_fptr_internal at /Users/jarrettrevels/data/repos/julia-dev/src/./julia_internal.h:380 [inlined]
jl_call_method_internal at /Users/jarrettrevels/data/repos/julia-dev/src/./julia_internal.h:399 [inlined]
jl_apply_generic at /Users/jarrettrevels/data/repos/julia-dev/src/gf.c:2010
eval_user_input at ./repl/REPL.jl:69
jl_call_fptr_internal at /Users/jarrettrevels/data/repos/julia-dev/src/./julia_internal.h:380 [inlined]
jl_call_method_internal at /Users/jarrettrevels/data/repos/julia-dev/src/./julia_internal.h:399 [inlined]
jl_apply_generic at /Users/jarrettrevels/data/repos/julia-dev/src/gf.c:2010
macro expansion at ./repl/REPL.jl:100 [inlined]
#1 at ./event.jl:95
jl_call_fptr_internal at /Users/jarrettrevels/data/repos/julia-dev/src/./julia_internal.h:380 [inlined]
jl_call_method_internal at /Users/jarrettrevels/data/repos/julia-dev/src/./julia_internal.h:399 [inlined]
jl_apply_generic at /Users/jarrettrevels/data/repos/julia-dev/src/gf.c:2010
jl_apply at /Users/jarrettrevels/data/repos/julia-dev/src/./julia.h:1476 [inlined]
start_task at /Users/jarrettrevels/data/repos/julia-dev/src/task.c:268
Allocations: 7940457 (Pool: 7939899; Big: 558); GC: 17
[1]    52261 segmentation fault  ~/data/repos/julia-dev/julia
```

It looks like the […]

Naively, I can work around this by just storing the relevant type signature in the field instead:

```julia
c.method_for_inference_heuristics = Core.Inference.svec(typeof(f), Int)
```

(I tried using […])

Is this workaround valid @vtjnash? I'll push forward assuming so, but let me know if not.
Most sorts of objects can't be serialized. You have to instead push it into the […]
Ah, makes sense. So far, it seems to me like the signature-based approach is cleaner/more lightweight than the method-based approach anyway, so I'll just stick with that unless it's problematic.
Argh, looks like […]. This time, the breakage causes inference bootstrap to fail when calling […]. It looks like the (un)compression methods should be (de)serializing the new `signature_for_inference_heuristics` field.

EDIT: Fixed now, see my comment in `src/jltypes.c` below.
src/jltypes.c (Outdated)

```c
"code",
"signature_for_inference_heuristics"
```
shout out to @maleadt for pointing out the missing comma here that was causing my bootstrap problems
been there, buddy
Got an un-killable process on the BSD CI worker again:

```
(lldb) p jl_lineno
(int) $0 = 1004
(lldb) p jl_filename
(const char *) $1 = 0x000000080fd666b8 "/usr/home/julia/julia-fbsd-buildbot/worker/11rel-amd64/build/test/inference.jl"
```

Here is the backtrace from lldb.
Thanks Iblis! I'll look into that.

I've got Cassette working with this change, and it does seem to help in some places, but it doesn't immediately render overdubbed functions inferable. AFAICT, this just means that this PR doesn't apply the spoofing mechanism in all the places where it's actually needed, but exploring that problem space can be done after this is merged. Luckily, the spoofing mechanism should make it much easier for inference-knowledgeable folks (cough @vtjnash cough) to identify the problem areas, since one can now directly access the underlying signature wherever it is required.

Before merging, it'd be useful to have a test that demonstrates this feature the way that Cassette is using it, but I'm not sure how to accomplish that without basically reimplementing Cassette in Base...
Seems like the way these new mechanisms are being utilized is incorrect, because it's causing the inference test here to fail. The actual number of specializations on this branch:

```julia
julia> map(count_specializations, methods(copy_dims_out))
3-element Array{Int64,1}:
 156
 109
 109
```

@vtjnash can I get your review on my […]
A friendly New Year's bump 😛 I'd really like to get this working ASAP so we can start tackling whatever else comes up in Cassette-land.
…_heuristics mechanism [ci skip]
Since I'm having trouble getting review here after multiple requests, I've decided to take a different approach and will break up this PR into two parts: the spoofing mechanism implementation, and the actual employment of the mechanism in inference. The former will now be the sole subject of this PR, while the latter will be the subject of a follow-up PR. I hope this will make the work more amenable to review. Since the spoofing mechanism itself has been complete for a while now, I plan on merging this today assuming nothing catastrophic happens.
Is there any documentation anywhere on how to use this mechanism? I'm not quite able to follow what Cassette.jl does with it.

I have a package (https://github.com/marius311/SelfFunctions.jl) that does one thing similar to Cassette.jl: it provides a macro that wraps every single function call in another function call. Unsurprisingly, this breaks inference in many cases, and my understanding is that the mechanism in this PR / used by Cassette.jl could make it work. (Of course, maybe I could just use Cassette.jl, which I may look at further down the line.) Thanks for any help.
I'm not sure what's going on here, but just based on what your package seems to be doing from the README, my guess is that this PR is irrelevant to whatever problem you're running into...
No, it's not really a generally exposed interface; it is very easy to abuse and shouldn't generally be used unless you are writing some new extension to the compiler itself. That being said, this PR just adds two new fields. The first is the new `Bool` flag on `GeneratedFunctionStub` that tells inference whether it may expand the generator early. The second is the new `method_for_inference_limit_heuristics` field on `CodeInfo`.
Cassette uses this to ensure that the compiler does not think overdubbed code is deeply recursing for the sake of limiting inference. In other words, Cassette wants the compiler to treat the recursive chain of `overdub` calls as if it were the chain of underlying calls being overdubbed.

To that end, Cassette sets the `method_for_inference_limit_heuristics` field of the `CodeInfo` its generator returns, so that the work-limiting heuristics consult the underlying method rather than the `overdub` method itself.
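A rough sketch of that pattern (`underlying_method` and `retrieve_lowered_code` are hypothetical stand-ins, not Cassette's actual API):

```julia
# Sketch only: inside the generator backing a Cassette-style overdub method,
# return a CodeInfo whose method_for_inference_limit_heuristics points at the
# *underlying* method, so the recursion-limiting heuristics don't see a deep
# chain of overdub frames.
function overdub_generator(f, args...)
    m = underlying_method(f, args...)   # hypothetical: look up the real target method
    ci = retrieve_lowered_code(m)       # hypothetical: fetch its lowered CodeInfo
    # ...rewrite ci.code so that every call site becomes an overdubbed call...
    ci.method_for_inference_limit_heuristics = m
    return ci
end
```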
Thanks, that helps. I should have been more specific about which aspect of SelfFunctions.jl I think this might be relevant to. To explain: one thing the macro does is make it so that if you call "self" functions from inside other self functions, you don't need to explicitly pass the "self" argument. The way this is done is that, e.g. in that second example,

```julia
@self Rosenbrock shifted_rosen(x,y) = rosen(x+1, y+1)
```

is rewritten to

```julia
shifted_rosen(self,x,y) = selfcall(self, rosen, selfcall(self,+,x,1), selfcall(self,+,y,1))
```

and then dispatch makes it so that

```julia
selfcall(self, normalfunction, args...) = normalfunction(args...) # this dispatches for `+` above
selfcall(self, selffunction, args...) = selffunction(self, args...) # this dispatches for `rosen`
```

Doing it this way works even if the inner functions haven't been defined yet (as it should), and it preserves type stability. Except that now inference sees tons of recursion into this `selfcall` […]. I'm not too familiar with how Cassette.jl works, but I understood there's a lot of this same recursion (the function is […]).

One difference I'm realizing now is that […]
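Since the two `selfcall` methods above are schematic (as written, they have identical signatures, so the second definition would just overwrite the first), here's a toy, runnable version of the same idea (all names below are local stand-ins, not SelfFunctions.jl's real API):

```julia
# Toy, self-contained stand-in for the SelfFunctions.jl pattern described above.
struct Self
    x::Float64
end

# Fallback: "normal" functions just get called without the self argument.
selfcall(self::Self, f, args...) = f(args...)

# A "self" function, plus a more specific selfcall method that threads
# `self` through to it.
rosen(self::Self, a, b) = (self.x - a)^2 + 100 * (b - a^2)^2
selfcall(self::Self, ::typeof(rosen), args...) = rosen(self, args...)

# What the @self macro rewrite would produce for the example above:
shifted_rosen(self, x, y) =
    selfcall(self, rosen, selfcall(self, +, x, 1), selfcall(self, +, y, 1))

shifted_rosen(Self(1.0), 0.0, 0.0)  # == rosen(Self(1.0), 1.0, 1.0) == 0.0
```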
Hmm... naively I'm not sure why one would use the […]. Anyway, it might be better to move this discussion to an issue in the SelfFunctions.jl repo.
ref #24676
- `method_for_inference_limit_heuristics` field added to `CodeInfo`

With what I've got so far, everything seems fine as long as the `method_for_inference_heuristics` field is never inflated. Returning a `CodeInfo` with an inflated `method_for_inference_heuristics` field, however, seems to result in a segfault, so I'm pretty sure I've missed some places in `src` that still need updating. @Keno Any tips here?