Pool more JIT resources to reduce memory usage/contention #44912


Merged: 9 commits into master on Apr 12, 2022

Conversation

pchintalapudi (Member)

Rather than creating a new TargetMachine/PassManager for every single compilation (which costs a lot of memory and construction/destruction time) or guarding a single one with a mutex (which eliminates parallelism), we can share PassManagers/TargetMachines between threads using a simple resource pool. This should hopefully bring the latency impact in #44568 back to what it was before #44364.

Depends on #44605 for the resource pool implementation.

@pchintalapudi (Member, PR author)

Master:

Core.Compiler ──── 79.6007 seconds

Sysimage built. Summary:
Total ───────  91.153872 seconds 
Base: ───────  36.468653 seconds 40.0078%
Stdlibs: ────  54.682599 seconds 59.9893%

Precompilation complete. Summary:
Total ─────── 164.120550 seconds
Generation ── 125.911190 seconds 76.7187%
Execution ───  38.209360 seconds 23.2813%

Performance counter stats for 'make':

        646,078.45 msec task-clock                #    1.050 CPUs utilized          
        20,418,033      context-switches          #    0.032 M/sec                  
             1,187      cpu-migrations            #    0.002 K/sec                  
         5,761,609      page-faults               #    0.009 M/sec                  
 1,599,064,891,345      cycles                    #    2.475 GHz                      (83.35%)
    76,776,955,901      stalled-cycles-frontend   #    4.80% frontend cycles idle     (83.33%)
   260,942,150,532      stalled-cycles-backend    #   16.32% backend cycles idle      (83.33%)
 2,327,913,819,437      instructions              #    1.46  insn per cycle         
                                                  #    0.11  stalled cycles per insn  (83.30%)
   445,446,327,289      branches                  #  689.462 M/sec                    (83.33%)
    11,266,910,833      branch-misses             #    2.53% of all branches          (83.35%)

     615.025500678 seconds time elapsed

     604.227931000 seconds user
      42.061896000 seconds sys

PR:

Core.Compiler ──── 82.7436 seconds

Sysimage built. Summary:
Total ───────  88.887377 seconds 
Base: ───────  35.441637 seconds 39.8725%
Stdlibs: ────  53.443663 seconds 60.1251%

Precompilation complete. Summary:
Total ─────── 159.817340 seconds
Generation ── 120.957179 seconds 75.6846%
Execution ───  38.860161 seconds 24.3154%

 Performance counter stats for 'make':

        647,618.57 msec task-clock                #    1.051 CPUs utilized          
        20,797,353      context-switches          #    0.032 M/sec                  
             1,288      cpu-migrations            #    0.002 K/sec                  
         7,444,536      page-faults               #    0.011 M/sec                  
 1,602,508,947,119      cycles                    #    2.474 GHz                      (83.36%)
    79,713,692,314      stalled-cycles-frontend   #    4.97% frontend cycles idle     (83.34%)
   268,331,644,788      stalled-cycles-backend    #   16.74% backend cycles idle      (83.34%)
 2,308,256,966,847      instructions              #    1.44  insn per cycle         
                                                  #    0.12  stalled cycles per insn  (83.31%)
   439,783,835,134      branches                  #  679.078 M/sec                    (83.31%)
    11,017,362,932      branch-misses             #    2.51% of all branches          (83.34%)

     615.907288141 seconds time elapsed

     601.166668000 seconds user
      46.685558000 seconds sys

@pchintalapudi pchintalapudi requested a review from vtjnash April 8, 2022 17:28
@pchintalapudi pchintalapudi added compiler:codegen Generation of LLVM IR and native code compiler:llvm For issues that relate to LLVM labels Apr 8, 2022
@vtjnash (Member) commented Apr 11, 2022

FYI: it is not very effective to ask for review while the PR does not yet apply to master

@pchintalapudi (Member, PR author)

Sorry about that, this PR should now be applicable to master directly.

    OptimizerResultT operator()(orc::ThreadSafeModule TSM, orc::MaterializationResponsibility &R) {
        TSM.withModuleDo([&](Module &M) {
            uint64_t start_time = 0;
            if (dump_llvm_opt_stream != NULL) {
Member:

It appears this might need some locking later (for dump_llvm_opt_stream)

Member (PR author):

This should be addressed by the latest commit in #44914, which locks around bundles of stream operations.

Comment on lines 923 to 925:

    {
        (***PMs).run(M);
    }

Member:

Suggested change (remove the redundant braces):

    (***PMs).run(M);
@vtjnash (Member) left a review comment:

SGTM

@pchintalapudi pchintalapudi added the merge me PR is reviewed. Merge when all tests are passing label Apr 11, 2022
@DilumAluthge DilumAluthge merged commit c0c60e8 into master Apr 12, 2022
@DilumAluthge DilumAluthge deleted the pc/jit-pool branch April 12, 2022 01:59
@DilumAluthge DilumAluthge removed the merge me PR is reviewed. Merge when all tests are passing label Apr 12, 2022
Labels
compiler:codegen Generation of LLVM IR and native code compiler:llvm For issues that relate to LLVM
3 participants