vchuravy
Member

Partially reverts #49391

PrecompileTools uses the timing infrastructure to snoop on the inference process.
#49391 was introduced because this snooping could accidentally pollute the caches
with foreign results (timholy/SnoopCompile.jl#338).

After #52233 and especially #53336 we now filter results by cache owner and
don't try to cache foreign code using the native pipeline.
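
As a rough illustration of what filtering by cache owner means, here is a toy model. It does not use Julia's compiler internals: `ToyCodeInstance`, `GPUOwner`, and `filter_by_owner` are hypothetical names invented for this sketch, and `nothing` is assumed to stand for the native owner.

```julia
# Toy model only: these types are hypothetical and are not Julia's compiler internals.
struct ToyCodeInstance
    name::Symbol
    owner::Any   # `nothing` stands for the native compiler here; anything else is foreign
end

# Hypothetical token identifying a foreign abstract interpreter.
struct GPUOwner end

# Keep only the results that belong to the requested cache owner.
filter_by_owner(results, owner) = filter(ci -> ci.owner === owner, results)

results = [
    ToyCodeInstance(:native_identity, nothing),
    ToyCodeInstance(:gpu_identity, GPUOwner()),
]

# The native pipeline only ever sees natively owned results ...
native_results = filter_by_owner(results, nothing)
# ... while foreign results stay available to whoever owns them.
foreign_results = filter_by_owner(results, GPUOwner())
```

In this sketch, letting a snooping tool observe both kinds of results is harmless, because each consumer keeps only the entries whose owner matches its own.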

Motivated by JuliaGPU/GPUCompiler.jl#567 which demonstrated
that a foreign code instance would not be cached without PrecompileTools.

vchuravy added the gpu (Affects running Julia on a GPU) and backport 1.11 (Change should be backported to release-1.11) labels on Apr 13, 2024
@vchuravy
Member Author

I locally ported this to v1.11 and modified CUDA.jl:

```diff
diff --git a/Project.toml b/Project.toml
index db564395a..47217fe27 100644
--- a/Project.toml
+++ b/Project.toml
@@ -23,6 +23,7 @@ Libdl = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
 LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
 Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
 NVTX = "5da4648a-3479-48b8-97b9-01cb529c0a1f"
+PrecompileTools = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
 Preferences = "21216c6a-2e73-6563-6e65-726566657250"
 PrettyTables = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
 Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
@@ -65,6 +66,7 @@ Libdl = "1"
 LinearAlgebra = "1"
 Logging = "1"
 NVTX = "0.3.2"
+PrecompileTools = "1.2.1"
 Preferences = "1"
 PrettyTables = "2"
 Printf = "1"
diff --git a/src/precompile.jl b/src/precompile.jl
index dc1b2ce2e..ddbe26dd2 100644
--- a/src/precompile.jl
+++ b/src/precompile.jl
@@ -14,3 +14,15 @@ precompile(run_and_collect, (Cmd,))
 precompile(cudaconvert, (Function,))
 precompile(Core.kwfunc(cudacall), (NamedTuple{(:threads, :blocks), Tuple{Int64, Int64}},typeof(cudacall),CuFunction,Type{Tuple{}}))
 precompile(Core.kwfunc(launch), (NamedTuple{(:threads, :blocks), Tuple{Int64, Int64}},typeof(launch),CuFunction))
+
+using PrecompileTools: @setup_workload, @compile_workload
+@setup_workload let
+    @compile_workload begin
+        target = PTXCompilerTarget(; cap=v"7.5")
+        params = CUDACompilerParams(; cap=v"7.5", ptx=v"7.5")
+        config = CompilerConfig(target, params)
+        mi = GPUCompiler.methodinstance(typeof(identity), Tuple{Nothing})
+        job = CompilerJob(mi, config)
+        GPUCompiler.code_native(devnull, job)
+    end
+end
```

This works correctly now.

@vchuravy vchuravy merged commit c0611e8 into master Apr 15, 2024
@vchuravy vchuravy deleted the vc/typeinf branch April 15, 2024 17:22
@aviatesk
Member

Does that mean external abstract interpreters using the integrated (internal) cache are now able to persist their CodeInstance caches across sessions?

@vchuravy
Member Author

vchuravy commented Apr 16, 2024

They were already able to do so (after #52233), but the integration with PrecompileTools didn't work.

E.g. this test https://github.com/JuliaGPU/GPUCompiler.jl/blob/9f90e9d8c7b1049932083847cffcd442699c81c0/test/native_tests.jl#L594 was not working yet.

KristofferC pushed a commit that referenced this pull request Apr 17, 2024
…reters (#54069)

(cherry picked from commit c0611e8)
@KristofferC KristofferC mentioned this pull request Apr 17, 2024
59 tasks
KristofferC added a commit that referenced this pull request May 28, 2024
Backported PRs:
- [x] #53665 <!-- use afoldl instead of tail recursion for tuples -->
- [x] #53976 <!-- LinearAlgebra: LazyString in interpolated error
messages -->
- [x] #54005 <!-- make `view(::Memory, ::Colon)` produce a Vector -->
- [x] #54010 <!-- Overload `Base.literal_pow` for `AbstractQ` -->
- [x] #54069 <!-- Allow PrecompileTools to see MI's inferred by foreign
abstract interpreters -->
- [x] #53750 <!-- inference correctness: fields and globals can revert
to undef -->
- [x] #53984 <!-- Profile: fix heap snapshot is valid char check -->
- [x] #54102 <!-- Explicitly compute stride in unaliascopy for SubArray
-->
- [x] #54070 <!-- Fix integer overflow in `skip(s::IOBuffer,
typemax(Int64))` -->
- [x] #54013 <!-- Support case-changes to Annotated{String,Char}s -->
- [x] #53941 <!-- Fix writing of AnnotatedChars to AnnotatedIOBuffer -->
- [x] #54137 <!-- Fix typo in docs for `partialsortperm` -->
- [x] #54129 <!-- use correct size when creating output data from an
IOBuffer -->
- [x] #54153 <!-- Fixup IdSet docstring -->
- [x] #54143 <!-- Fix `make install` from tarballs -->
- [x] #54151 <!-- LinearAlgebra: Correct zero element in
`_generic_matvecmul!` for block adj/trans -->
- [x] #54213 <!-- Add `public` statement to `Base.GC` -->
- [x] #54222 <!-- Utilize correct tbaa when emitting stores of unions.
-->
- [x] #54233 <!-- set MAX_OS_WRITE on unix -->
- [x] #54255 <!-- fix `_checked_mul_dims` in the presence of 0s and
overflow. -->
- [x] #54259 <!-- Fix typo in `readuntil` -->
- [x] #54251 <!-- fix typo in gc_mark_memory8 when chunking a large
array -->
- [x] #54276 <!-- Fix solve for complex `Hermitian` with non-vanishing
imaginary part on diagonal -->
- [x] #54248 <!-- ensure package callbacks are invoked when no valid
precompile file exists for an "auto loaded" stdlib -->
- [x] #54308 <!-- Implement eval-able AnnotatedString 2-arg show -->
- [x] #54302 <!-- Specialised substring equality for annotated strs -->
- [x] #54243 <!-- prevent `package_callbacks` to run multiple time for a
single package -->
- [x] #54350 <!-- add a precompile signature to Artifacts code that is
used by JLLs -->
- [x] #54331 <!-- correctly track freed bytes in
jl_genericmemory_to_string -->
- [x] #53509 <!-- revert moving "creating packages" from Pkg.jl -->
- [x] #54335 <!-- When accessing the data pointer for an array, first
decay it to a Derived Pointer -->
- [x] #54239 <!-- Make sure `fieldcount` constant-folds for `Tuple{...}`
-->
- [x] #54288
- [x] #54067
- [x] #53715 <!-- Add read/write specialisation for IOContext{AnnIO} -->
- [x] #54289 <!-- Rework annotation ordering/optimisations -->
- [x] #53815 <!-- create phantom task for GC threads -->
- [x] #54130 <!-- inference: handle `LimitedAccuracy` in
`handle_global_assignment!` -->
- [x] #54428 <!-- Move ConsoleLogging.jl into Base -->
- [x] #54332 <!-- Revert "add unsetindex support to more copyto methods
(#51760)" -->
- [x] #53826 <!-- Make all command-line options documented in all
related files -->
- [x] #54465 <!-- typeintersect: conservative typevar subtitution during
`finish_unionall` -->
- [x] #54514 <!-- typeintersect: followup cleanup for the nothrow path
of type instantiation -->
- [x] #54499 <!-- make `@doc x` work without REPL loaded -->
- [x] #54210 <!-- attach finalizer in `mmap` to the correct object -->
- [x] #54359 <!-- Pkg REPL: cache `pkg_mode` lookup -->

Non-merged PRs with backport label:
- [ ] #54471 <!-- Actually setup jit targets when compiling
packageimages instead of targeting only one -->
- [ ] #54457 <!-- Make `String(::Memory)` copy -->
- [ ] #54323 <!-- inference: fix too conservative effects for recursive
cycles -->
- [ ] #54322 <!-- effects: add new `@consistent_overlay` macro -->
- [ ] #54191 <!-- make `AbstractPipe` public -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #53882 <!-- Warn about cycles in extension precompilation -->
- [ ] #53707 <!-- Make ScopedValue public -->
- [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} -->
- [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [ ] #53286 <!-- Raise an error when using `include_dependency` with
non-existent file or directory -->
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
KristofferC removed the backport 1.11 (Change should be backported to release-1.11) label on May 28, 2024