Description
I have been trying to debug a regression in the compilation time for many tests of Polymake.jl, it started with this PkgEval report https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_date/2023-04/27/report.html where the tests jumped from about 13 minutes to a timeout (>45min). The tests are mostly checking compatibility of our own vector and matrix types with the corresponding julia types (for different element types). These types are mostly defined via CxxWrap and a corresponding libcxxwrap module.
In our own CI we see similar effects with nightly taking more than twice as long for the tests as 1.8 or 1.9.
I added a @time
to several of the affected test groups and the output looks like this:
22.716351 seconds (3.42 M allocations: 228.784 MiB, 0.16% gc time, 99.98% compilation time)
So this is indeed compilation time and not related to the actual computations. That timing was for this code block.
I bisected the regression to #41099 by @BioTurboNick but I don't really know how to debug this further.
In these timings below I noticed most of the extra time seems to be spent in AST_COMPRESS
, which went from 4% to 51%. CODEGEN_LLVM
also went up a bit. But I don't really know what to make of this now, especially why or how our code is triggering this in combination with that PR.
Timings
I ran some tests on a nightly build with WITH_TIMING_COUNTS=1
. I added some extra verbosity to the test sets to see more timings and disabled several test sets which do not exhibit this problem.
This is for the current master (Commit f407a4c):
JULIA TIMINGS
Event , Self Cycles (% of Total) , Total Cycles (% of Total)
ADD_METHOD , 3514143189 ( 0.08 %), 3665570689 ( 0.08 %)
AST_COMPRESS , 2348395690639 (51.58 %), 2348395690639 (51.58 %)
AST_UNCOMPRESS , 15147700000 ( 0.33 %), 15378217751 ( 0.34 %)
CODEGEN_Codeinst , 98448269 ( 0.00 %), 265528616248 ( 5.83 %)
CODEGEN_LLVM , 264305217253 ( 5.81 %), 264865339231 ( 5.82 %)
CODEGEN_Workqueue , 139461127 ( 0.00 %), 12221039711 ( 0.27 %)
CODEINST_COMPILE , 1192903873 ( 0.03 %), 2043389338089 (44.88 %)
DL_OPEN , 78957620 ( 0.00 %), 78957620 ( 0.00 %)
GC , 47382498 ( 0.00 %), 15914524530 ( 0.35 %)
GC_Finalizers , 265416755 ( 0.01 %), 299007204 ( 0.01 %)
GC_FullSweep , 105122446 ( 0.00 %), 105122446 ( 0.00 %)
GC_IncrementalSweep , 7270270064 ( 0.16 %), 7270270064 ( 0.16 %)
GC_Mark , 8192644545 ( 0.18 %), 8192644545 ( 0.18 %)
GC_Stop , 97773 ( 0.00 %), 97773 ( 0.00 %)
INFERENCE , 86903454075 ( 1.91 %), 2482940830167 (54.54 %)
INIT_MODULE , 8107964312 ( 0.18 %), 19631247198 ( 0.43 %)
JULIA_INIT , 65782478 ( 0.00 %), 237956976 ( 0.01 %)
LLVM_OPT , 1059662566154 (23.28 %), 1059662566154 (23.28 %)
LLVM_ORC , 716882046556 (15.75 %), 1776544612710 (39.02 %)
LOAD_Pkgimg , 860054924 ( 0.02 %), 7402571632 ( 0.16 %)
LOAD_Processor , 1568929 ( 0.00 %), 1568929 ( 0.00 %)
LOAD_Require , 266486681 ( 0.01 %), 28020265123 ( 0.62 %)
LOAD_Sysimg , 140595865 ( 0.00 %), 140595865 ( 0.00 %)
LOWERING , 4938725087 ( 0.11 %), 6081574799 ( 0.13 %)
MACRO_INVOCATION , 20986884 ( 0.00 %), 876916164 ( 0.02 %)
METHOD_LOOKUP_FAST , 3206808468 ( 0.07 %), 3986616164 ( 0.09 %)
METHOD_LOOKUP_SLOW , 91857246 ( 0.00 %), 875939345 ( 0.02 %)
METHOD_MATCH , 15893573564 ( 0.35 %), 16805438247 ( 0.37 %)
PARSING , 691756980 ( 0.02 %), 691756980 ( 0.02 %)
ROOT , 2482796762 ( 0.05 %), 4552689821858 (100.00 %)
STAGED_FUNCTION , 1190547 ( 0.00 %), 300850387 ( 0.01 %)
TYPE_CACHE_INSERT , 601778215 ( 0.01 %), 601778215 ( 0.01 %)
TYPE_CACHE_LOOKUP , 2759831421 ( 0.06 %), 2759831421 ( 0.06 %)
VERIFY_Edges , 328049638 ( 0.01 %), 2308738847 ( 0.05 %)
VERIFY_Graph , 20569865 ( 0.00 %), 20569865 ( 0.00 %)
VERIFY_Methods , 7921156 ( 0.00 %), 7921156 ( 0.00 %)
JULIA COUNTERS
Counter , Value
Invalidations , 2333
HeapSize , 744737644
JITSize , 33254347
JITCodeSize , 31432685
JITDataSize , 1821662
ImageSize , 395643680
And from another build where I reverted #41099 (and three other commits which I needed to revert to avoid conflicts, see at the end of this issue [1]):
JULIA TIMINGS
Event , Self Cycles (% of Total) , Total Cycles (% of Total)
ADD_METHOD , 3672342454 ( 0.17 %), 3875313788 ( 0.18 %)
AST_COMPRESS , 84232107470 ( 3.93 %), 84232107470 ( 3.93 %)
AST_UNCOMPRESS , 13816531605 ( 0.64 %), 14016252819 ( 0.65 %)
CODEGEN_Codeinst , 89608396 ( 0.00 %), 67777565810 ( 3.16 %)
CODEGEN_LLVM , 66328775206 ( 3.10 %), 67204589383 ( 3.14 %)
CODEGEN_Workqueue , 131226434 ( 0.01 %), 7151191775 ( 0.33 %)
CODEINST_COMPILE , 1128842086 ( 0.05 %), 1899161320883 (88.64 %)
DL_OPEN , 82986960 ( 0.00 %), 82986960 ( 0.00 %)
GC , 48344120 ( 0.00 %), 15536714911 ( 0.73 %)
GC_Finalizers , 239514645 ( 0.01 %), 239537348 ( 0.01 %)
GC_FullSweep , 110201791 ( 0.01 %), 110201791 ( 0.01 %)
GC_IncrementalSweep , 7615966878 ( 0.36 %), 7615966878 ( 0.36 %)
GC_Mark , 7522571182 ( 0.35 %), 7522571182 ( 0.35 %)
GC_Stop , 93592 ( 0.00 %), 93592 ( 0.00 %)
INFERENCE , 84077627794 ( 3.92 %), 215491286569 (10.06 %)
INIT_MODULE , 8636965446 ( 0.40 %), 20269670034 ( 0.95 %)
JULIA_INIT , 64656364 ( 0.00 %), 228045797 ( 0.01 %)
LLVM_OPT , 1091348663673 (50.93 %), 1091348663673 (50.93 %)
LLVM_ORC , 738791901545 (34.48 %), 1830140565218 (85.42 %)
LOAD_Pkgimg , 865851913 ( 0.04 %), 7686754456 ( 0.36 %)
LOAD_Processor , 1495812 ( 0.00 %), 1495812 ( 0.00 %)
LOAD_Require , 234657082 ( 0.01 %), 28759724671 ( 1.34 %)
LOAD_Sysimg , 131844838 ( 0.01 %), 131844838 ( 0.01 %)
LOWERING , 5515049860 ( 0.26 %), 6406164444 ( 0.30 %)
MACRO_INVOCATION , 22419703 ( 0.00 %), 891114584 ( 0.04 %)
METHOD_LOOKUP_FAST , 3087070252 ( 0.14 %), 3905972971 ( 0.18 %)
METHOD_LOOKUP_SLOW , 88961303 ( 0.00 %), 872603509 ( 0.04 %)
METHOD_MATCH , 17020788685 ( 0.79 %), 17815537140 ( 0.83 %)
PARSING , 757186245 ( 0.04 %), 757186245 ( 0.04 %)
ROOT , 3265830601 ( 0.15 %), 2142638130876 (100.00 %)
STAGED_FUNCTION , 1308655 ( 0.00 %), 309384130 ( 0.01 %)
TYPE_CACHE_INSERT , 619835544 ( 0.03 %), 619835544 ( 0.03 %)
TYPE_CACHE_LOOKUP , 2722668821 ( 0.13 %), 2722668821 ( 0.13 %)
VERIFY_Edges , 331854785 ( 0.02 %), 2309098244 ( 0.11 %)
VERIFY_Graph , 24758419 ( 0.00 %), 24758419 ( 0.00 %)
VERIFY_Methods , 7620717 ( 0.00 %), 7620717 ( 0.00 %)
JULIA COUNTERS
Counter , Value
Invalidations , 2338
HeapSize , 703725988
JITSize , 33254191
JITCodeSize , 31432569
JITDataSize , 1821622
ImageSize , 375575800
The corresponding test timings for one test group which illustrates the problem quite well. This is for master:
Polymake.SparseMatrix | 1757 1757 12m28.0s
Constructors/Converts | 592 592 12.2s
Low-level operations | 340 340 16.5s
Polymake.SparseMatrix{Int64} | 39 39 0.3s
Polymake.SparseMatrix{Polymake.Integer} | 39 39 0.2s
Polymake.SparseMatrix{Polymake.Rational} | 39 39 0.4s
Polymake.SparseMatrix{Float64} | 89 89 0.3s
Polymake.SparseMatrix{Polymake.QuadraticExtension{Polymake.Rational}} | 39 39 0.3s
Polymake.SparseMatrix{Polymake.OscarNumber} | 39 39 0.2s
Equality | 56 56 14.8s
Arithmetic | 786 786 11m58.5s
Arithmetic1 | 310 310 3m04.8s
Arithmetic2 | 260 260 2m45.7s
Arithmetic3 | 39 39 25.3s
Arithmetic4 | 140 140 5m09.4s
Arithmetic4 Int32 | 10 10 17.2s
Arithmetic4 Int64 | 10 10 15.3s
Arithmetic4 UInt64 | 10 10 19.8s
Arithmetic4 BigInt | 10 10 21.1s
Arithmetic4 Polymake.Integer | 10 10 17.0s
Arithmetic4 Rational{Int32} | 10 10 15.9s
Arithmetic4 Rational{Int64} | 10 10 16.1s
Arithmetic4 Rational{UInt64} | 10 10 16.7s
Arithmetic4 Rational{BigInt} | 10 10 23.9s
Arithmetic4 Polymake.Rational | 10 10 16.2s
Arithmetic4 Float32 | 10 10 24.2s
Arithmetic4 Float64 | 10 10 35.9s
Arithmetic4 BigFloat | 10 10 35.7s
Arithmetic4 Polymake.QuadraticExtension{Polymake.Rational} | 10 10 34.3s
Arithmetic5 | 20 20 26.0s
findnz | 3 3 0.4s
And with the reverted commits:
Polymake.SparseMatrix | 1757 1757 5m00.6s
Constructors/Converts | 592 592 7.9s
Low-level operations | 340 340 10.9s
Polymake.SparseMatrix{Int64} | 39 39 0.3s
Polymake.SparseMatrix{Polymake.Integer} | 39 39 0.3s
Polymake.SparseMatrix{Polymake.Rational} | 39 39 0.3s
Polymake.SparseMatrix{Float64} | 89 89 0.3s
Polymake.SparseMatrix{Polymake.QuadraticExtension{Polymake.Rational}} | 39 39 0.3s
Polymake.SparseMatrix{Polymake.OscarNumber} | 39 39 0.2s
Equality | 56 56 9.3s
Arithmetic | 786 786 4m41.0s
Arithmetic1 | 310 310 1m09.0s
Arithmetic2 | 260 260 1m08.1s
Arithmetic3 | 39 39 9.4s
Arithmetic4 | 140 140 2m02.6s
Arithmetic4 Int32 | 10 10 8.1s
Arithmetic4 Int64 | 10 10 7.0s
Arithmetic4 UInt64 | 10 10 6.8s
Arithmetic4 BigInt | 10 10 6.8s
Arithmetic4 Polymake.Integer | 10 10 9.1s
Arithmetic4 Rational{Int32} | 10 10 10.6s
Arithmetic4 Rational{Int64} | 10 10 9.2s
Arithmetic4 Rational{UInt64} | 10 10 6.6s
Arithmetic4 Rational{BigInt} | 10 10 6.2s
Arithmetic4 Polymake.Rational | 10 10 6.6s
Arithmetic4 Float32 | 10 10 8.5s
Arithmetic4 Float64 | 10 10 12.2s
Arithmetic4 BigFloat | 10 10 11.8s
Arithmetic4 Polymake.QuadraticExtension{Polymake.Rational} | 10 10 13.1s
Arithmetic5 | 20 20 8.8s
findnz | 3 3 0.3s
Other packages
While looking at the PkgEval report mentioned earlier I noticed that MultivariatePolynomials.jl might have similar issues, at least for some of its tests:
MutableArithmetics with polynomials in Int64 | 6478 14 6492 1m09.8s
MutableArithmetics with polynomials in Int64 | 6478 14 6492 3m23.8s
before and after, from the second block with MutableArithmetics
, the last one before the tests were killed.
additional notes:
- [1] reverted commits:
I also checked that just reverting the other three commits behaves similar to the current master without the reverts to make sure that Lookup metadata for inlined frames for stack traces #41099 is the key change.
* 971c2253ee - (bl/withsomereverts) Benjamin Lorenz : Revert "Lookup metadata for inlined frames for stacktraces (#41099)" (19 hours ago) * d450ad31e3 - Benjamin Lorenz : Revert "limit printing depth of argument types in stack traces (#49795)" (19 hours ago) * 173d8d4413 - Benjamin Lorenz : Revert "refactor `Base.StackTraces.show_spec_linfo` (#49951)" (19 hours ago) * 57d11b3b5d - Benjamin Lorenz : Revert "make `IRShow.method_name` inferrable (#49607)" (19 hours ago)
- I did most tests with julia nightly builds not on Polymake.jl master but on a custom branch but this should not affect these results, the original PkgEval report was referring to Polymake.jl release v0.9.0.