Refactor CodeInfo/CodeInstance separation and interfaces

The `CodeInfo` type is one of the oldest types in the system and
has grown a bit of cruft. In particular, the `rettype`, `parent`,
`edges`, `min_world`, `max_world` fields are not used for the
original purpose of representing code, but for one or more of
(in decreasing order of badness):

1. Smuggling extra results from inference into the compiler
2. Smuggling extra arguments into OpaqueClosure constructors
3. Passing extra information from generated functions to inference

The first of these points in particular causes a fair bit of mixup
between caching concerns and compiler concerns and results in
external abstract interpreters maintaining their own dummy CodeInfos,
just to comply with the interface. Originally, I just wanted to
clean up that bit, but it didn't really make sense as a standalone
piece, so this PR is more comprehensive.

In particular, this PR:

1. Removes the `parent` and `rettype` fields of `CodeInfo`. They are largely
   vestigial and code accessing these is probably doing the wrong thing. Such code
   should instead look at the CodeInstance or remember the
   query that was asked of the cache in the first place.

2. Makes `edges`, `min_world` and `max_world` used for generated functions only.
   All other uses were replaced by appropriate queries on the CodeInstance. In particular,
   inference no longer sets these. In the future we may want to consider removing
   these also and having generated functions return some other object, but that is a
   topic to revisit once the broader compiler plugins landscape is more clear.

3. Makes the external type inference interface return `CodeInstance` rather than
   `CodeInfo`. This results in a lot of cleanup, because many functions had multiple
   code paths, some for CodeInstance and others for fallback to inference/CodeInfo. This
   is all cleaned up now. If you don't have a CodeInstance, you can ask inference for one.
   This CodeInstance may or may not be in the cache, but you can look at its types, compile it,
   etc.

4. Moves the main inference entrypoint out of the codegen library. There is still a little
   bit of entanglement, but this makes codegen much more of an independent system:
   you give it a CodeInstance and it just fills in the invoke pointer.

The overall theme here is decoupling. Over time, various parties have wanted to
use the Julia compiler with custom IR data structures, backend code generators, caches, etc.
This doesn't quite get us all the way there, but makes inference and codegen much more
independent with a clear IR-format-independent interface (CodeInstance).
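
The world-age handling in item 2 can be illustrated with a toy model. This is a minimal sketch, not the compiler's code: `ToyWorldRange`, `ToySource`, and `initial_valid_worlds` are hypothetical names, and the explicit `world_counter` argument stands in for the real world counter. It assumes, as stated above, that only generated functions set `min_world`/`max_world` on their source.

```julia
# Toy model only: these types and functions are hypothetical stand-ins,
# not the compiler's own `WorldRange` or `CodeInfo`.
struct ToyWorldRange
    min_world::UInt
    max_world::UInt
end

# Stand-in for the `min_world`/`max_world` fields carried by a source object.
struct ToySource
    min_world::UInt
    max_world::UInt
end

# A source that carries no restriction spans every world age.
unrestricted_source() = ToySource(UInt(1), typemax(UInt))

function initial_valid_worlds(src::ToySource, world_counter::UInt)
    # Default: valid from world 1 up to the current world counter.
    valid = ToyWorldRange(UInt(1), world_counter)
    # Only a generated function is expected to narrow the range.
    if src.min_world != 1 || src.max_world != typemax(UInt)
        valid = ToyWorldRange(src.min_world, src.max_world)
    end
    return valid
end

plain = initial_valid_worlds(unrestricted_source(), UInt(100))
generated = initial_valid_worlds(ToySource(UInt(40), UInt(60)), UInt(100))
```

With a counter of 100, `plain` covers worlds 1 through 100, while `generated` keeps the restricted range 40 through 60 supplied by its source.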
Keno committed Feb 6, 2024
1 parent a26bd7f commit 9371cb3
Showing 35 changed files with 489 additions and 549 deletions.
14 changes: 11 additions & 3 deletions base/compiler/inferencestate.jl
@@ -315,7 +315,7 @@ mutable struct InferenceState
 dont_work_on_me = false
 parent = nothing

-valid_worlds = WorldRange(src.min_world, src.max_world == typemax(UInt) ? get_world_counter() : src.max_world)
+valid_worlds = WorldRange(1, get_world_counter())
 bestguess = Bottom
 exc_bestguess = Bottom
 ipo_effects = EFFECTS_TOTAL
@@ -335,13 +335,21 @@ mutable struct InferenceState
 InferenceParams(interp).unoptimize_throw_blocks && mark_throw_blocks!(src, handler_at)
 !iszero(cache_mode & CACHE_MODE_LOCAL) && push!(get_inference_cache(interp), result)

-return new(
+this = new(
 linfo, world, mod, sptypes, slottypes, src, cfg, method_info,
 currbb, currpc, ip, handlers, handler_at, ssavalue_uses, bb_vartables, ssavaluetypes, stmt_edges, stmt_info,
 pclimitations, limitations, cycle_backedges, callers_in_cycle, dont_work_on_me, parent,
 result, unreachable, valid_worlds, bestguess, exc_bestguess, ipo_effects,
 restrict_abstract_call_sites, cache_mode, insert_coverage,
 interp)
+
+# Apply generated function restrictions
+if src.min_world != 1 || src.max_world != typemax(UInt)
+# From generated functions
+this.valid_worlds = WorldRange(src.min_world, src.max_world)
+end
+
+return this
 end
 end

@@ -796,7 +804,7 @@ function IRInterpretationState(interp::AbstractInterpreter,
 method_info = MethodInfo(src)
 ir = inflate_ir(src, mi)
 return IRInterpretationState(interp, method_info, ir, mi, argtypes, world,
-src.min_world, src.max_world)
+code.min_world, code.max_world)
 end

 # AbsIntState
9 changes: 2 additions & 7 deletions base/compiler/optimize.jl
@@ -107,19 +107,17 @@ is_declared_noinline(@nospecialize src::MaybeCompressed) =
 # OptimizationState #
 #####################

-is_source_inferred(@nospecialize src::MaybeCompressed) =
-ccall(:jl_ir_flag_inferred, Bool, (Any,), src)
-
 function inlining_policy(interp::AbstractInterpreter,
 @nospecialize(src), @nospecialize(info::CallInfo), stmt_flag::UInt32)
 if isa(src, MaybeCompressed)
-is_source_inferred(src) || return nothing
 src_inlineable = is_stmt_inline(stmt_flag) || is_inlineable(src)
 return src_inlineable ? src : nothing
 elseif isa(src, IRCode)
 return src
 elseif isa(src, SemiConcreteResult)
 return src
+elseif isa(src, CodeInstance)
+return inlining_policy(interp, src.inferred, info, stmt_flag)
 end
 return nothing
 end
@@ -222,7 +220,6 @@ end
 function ir_to_codeinf!(src::CodeInfo, ir::IRCode)
 replace_code_newstyle!(src, ir)
 widen_all_consts!(src)
-src.inferred = true
 return src
 end

@@ -240,8 +237,6 @@ function widen_all_consts!(src::CodeInfo)
 end
 end

-src.rettype = widenconst(src.rettype)
-
 return src
 end

2 changes: 0 additions & 2 deletions base/compiler/ssair/legacy.jl
@@ -55,8 +55,6 @@ Mainly used for testing or interactive use.
 inflate_ir(ci::CodeInfo, linfo::MethodInstance) = inflate_ir!(copy(ci), linfo)
 inflate_ir(ci::CodeInfo, sptypes::Vector{VarState}, argtypes::Vector{Any}) = inflate_ir!(copy(ci), sptypes, argtypes)
 function inflate_ir(ci::CodeInfo)
-parent = ci.parent
-isa(parent, MethodInstance) && return inflate_ir(ci, parent)
 # XXX the length of `ci.slotflags` may be different from the actual number of call
 # arguments, but we really don't know that information in this case
 argtypes = Any[ Any for i = 1:length(ci.slotflags) ]