During runtime generation, we emit an unoptimized bitcode library for every runtime function and link those libraries together. On Julia 1.11, that means linking many modules that contain IR like the following (post JuliaLang/julia#50632):
; ModuleID = 'start'
source_filename = "start"
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024-n8:16:32-ni:10:11:12:13"
target triple = "air64-apple-macosx13.4.1"
@"*Core.Intrinsics.bitcast#1" = hidden local_unnamed_addr global {}* null, !julia.constgv !0
define i64 @ijl_unbox_uint64({} addrspace(10)* noundef nonnull readonly %"obj::Any") local_unnamed_addr #0 !dbg !6 {
top:
%obj = alloca {} addrspace(10)*, align 8
%pgcstack = call {}*** @julia.get_pgcstack()
store {} addrspace(10)* null, {} addrspace(10)** %obj, align 8
%0 = bitcast {}*** %pgcstack to {}**
%current_task = getelementptr inbounds {}*, {}** %0, i64 -14
%1 = bitcast {}** %current_task to i64*
%world_age = getelementptr inbounds i64, i64* %1, i64 15
store {} addrspace(10)* %"obj::Any", {} addrspace(10)** %obj, align 8
%2 = load {} addrspace(10)*, {} addrspace(10)** %obj, align 8, !dbg !9, !nonnull !0
%3 = addrspacecast {} addrspace(10)* %2 to {} addrspace(11)*, !dbg !9
%4 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* %3) #1, !dbg !9
%5 = load {}*, {}** @"*Core.Intrinsics.bitcast#1", align 8, !dbg !12, !tbaa !16, !invariant.load !0, !alias.scope !20, !noalias !23
%6 = bitcast {}* %5 to {} addrspace(10)**, !dbg !12
%7 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)** %6, i64 0, !dbg !12
%8 = ptrtoint {}* %4 to i64, !dbg !12
%9 = bitcast {}* %4 to i64*, !dbg !28
%10 = getelementptr inbounds i64, i64* %9, i64 0, !dbg !28
%11 = load i64, i64* %10, align 8, !dbg !28, !tbaa !29, !alias.scope !31, !noalias !32
ret i64 %11, !dbg !11
}
Linking such modules together, where several of them may define a global variable (GV) with the same name, can fail with:
error: Linking globals named '*Core.Intrinsics.bitcast#1': symbol multiply defined!
@pchintalapudi mentions that Base handles this by deduplicating through the global_targets map in codegen_params. None of that is exposed to us though, so a possible workaround would be to give those GVs private/internal linkage so that they no longer clash when linking (IIUC). That would prevent passing the IR to jl_create_native though; @vchuravy, is that a problem for Enzyme (or other CPU-based compilers that build on GPUCompiler.jl)?
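For illustration, the linkage workaround could look roughly like the sketch below, written against LLVM.jl. The helper name internalize_globals! and its predicate keyword are hypothetical and not part of GPUCompiler.jl or LLVM.jl. The sketch only uses globals, isdeclaration, and linkage! from LLVM.jl, and assumes it is run on each per-function module before that module is linked into the runtime library.

```julia
using LLVM

# Hypothetical helper (not an existing GPUCompiler.jl or LLVM.jl function).
# Gives every *defined* global variable accepted by `predicate` internal linkage,
# so that same-named definitions in different modules stop clashing at link time.
function internalize_globals!(mod::LLVM.Module; predicate = gv -> true)
    for gv in globals(mod)
        # Declarations refer to symbols defined elsewhere and must stay external.
        isdeclaration(gv) && continue
        predicate(gv) || continue
        linkage!(gv, LLVM.API.LLVMInternalLinkage)
    end
    return mod
end
```

A caller would probably want to restrict this to the constant GVs Julia emits, for example with predicate = gv -> startswith(LLVM.name(gv), "*"). That filter is an assumption based only on the name shown in the IR above. As noted, IR rewritten this way would no longer be suitable for jl_create_native.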