Skip to content

Conversation

pchintalapudi
Copy link
Member

#44440 used jl_merge_module to merge ThreadSafeModules, but that function does not account for linkage type when merging and will not rename globals. Thus, we switch back to llvm::Linker which will take that into account. We may want to consider moving all of our uses of jl_merge_module to llvm::Linker.

Fixes #48093 with a new test.

@pchintalapudi pchintalapudi added the compiler:codegen Generation of LLVM IR and native code label Jan 3, 2023
@maleadt maleadt added backport 1.9 Change should be backported to release-1.9 bugfix This change fixes an existing bug labels Jan 3, 2023
@maleadt
Copy link
Member

maleadt commented Jan 3, 2023

Thanks for looking into this!

#44440 used jl_merge_module to merge ThreadSafeModules, but that function does not account for linkage type when merging and will not rename globals. Thus, we switch back to llvm::Linker which will take that into account.

Ah yes, I actually switched that over to the LLVM linker in #36576, but didn't realize #44440 switched it back.

Copy link
Member

@maleadt maleadt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes the CUDA.jl issue, thanks!

@maleadt maleadt mentioned this pull request Jan 4, 2023
6 tasks
@maleadt
Copy link
Member

maleadt commented Jan 4, 2023

Windows CI is broken everywhere, so unrelated to this PR.

@maleadt maleadt force-pushed the pc/llvm-link-modules branch from 79bccd8 to 903a7ce Compare January 5, 2023 08:58
@maleadt
Copy link
Member

maleadt commented Jan 5, 2023

Rebased and updated the test.

@vtjnash Given the restrictions with llvmcall, can you give your thoughts on how best to implement thread-local arrays for our GPU back-ends? Basically, we want to be able to:

function kernel(input::AbstractArray{T})
  buf1 = SharedArray{T}(undef, 1)
  buf2 = SharedArray{T}(undef, 1)
  ...
  return
end

i.e., we'd prefer to use functions not macros, but also the llvmcall needs to be produced by a generated function where we know about types and sizes (currently we forward the dims argument using Val to the generator function).

@maleadt
Copy link
Member

maleadt commented Jan 5, 2023

CI failures are unrelated.

@maleadt maleadt merged commit 35d1840 into master Jan 5, 2023
@maleadt maleadt deleted the pc/llvm-link-modules branch January 5, 2023 15:52
maleadt added a commit that referenced this pull request Jan 5, 2023
(cherry picked from commit 35d1840)

Co-authored-by: Tim Besard <tim.besard@gmail.com>
@maleadt maleadt mentioned this pull request Jan 5, 2023
41 tasks
@maleadt maleadt removed the backport 1.9 Change should be backported to release-1.9 label Jan 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix This change fixes an existing bug compiler:codegen Generation of LLVM IR and native code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change in llvmcall module merging breaks GPU codegen relying on globals
4 participants