Skip to content

Compilation time regression on nightly due to #41099 #50082

Closed
@benlorenz

Description

@benlorenz

I have been trying to debug a regression in the compilation time for many tests of Polymake.jl, it started with this PkgEval report https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_date/2023-04/27/report.html where the tests jumped from about 13 minutes to a timeout (>45min). The tests are mostly checking compatibility of our own vector and matrix types with the corresponding julia types (for different element types). These types are mostly defined via CxxWrap and a corresponding libcxxwrap module.
In our own CI we see similar effects with nightly taking more than twice as long for the tests as 1.8 or 1.9.

I added a @time to several of the affected test groups and the output looks like this:

 22.716351 seconds (3.42 M allocations: 228.784 MiB, 0.16% gc time, 99.98% compilation time)

So this is indeed compilation time and not related to the actual computations. That timing was for this code block.

I bisected the regression to #41099 by @BioTurboNick but I don't really know how to debug this further.
In these timings below I noticed most of the extra time seems to be spent in AST_COMPRESS, which went from 4% to 51%. CODEGEN_LLVM also went up a bit. But I don't really know what to make of this now, especially why or how our code is triggering this in combination with that PR.

Timings

I ran some tests on a nightly build with WITH_TIMING_COUNTS=1. I added some extra verbosity to the test sets to see more timings and disabled several test sets which do not exhibit this problem.

This is for the current master (Commit f407a4c):

JULIA TIMINGS
Event                    , Self Cycles (% of Total)      , Total Cycles (% of Total)     
ADD_METHOD               ,           3514143189 ( 0.08 %),           3665570689 ( 0.08 %)
AST_COMPRESS             ,        2348395690639 (51.58 %),        2348395690639 (51.58 %)
AST_UNCOMPRESS           ,          15147700000 ( 0.33 %),          15378217751 ( 0.34 %)
CODEGEN_Codeinst         ,             98448269 ( 0.00 %),         265528616248 ( 5.83 %)
CODEGEN_LLVM             ,         264305217253 ( 5.81 %),         264865339231 ( 5.82 %)
CODEGEN_Workqueue        ,            139461127 ( 0.00 %),          12221039711 ( 0.27 %)
CODEINST_COMPILE         ,           1192903873 ( 0.03 %),        2043389338089 (44.88 %)
DL_OPEN                  ,             78957620 ( 0.00 %),             78957620 ( 0.00 %)
GC                       ,             47382498 ( 0.00 %),          15914524530 ( 0.35 %)
GC_Finalizers            ,            265416755 ( 0.01 %),            299007204 ( 0.01 %)
GC_FullSweep             ,            105122446 ( 0.00 %),            105122446 ( 0.00 %)
GC_IncrementalSweep      ,           7270270064 ( 0.16 %),           7270270064 ( 0.16 %)
GC_Mark                  ,           8192644545 ( 0.18 %),           8192644545 ( 0.18 %)
GC_Stop                  ,                97773 ( 0.00 %),                97773 ( 0.00 %)
INFERENCE                ,          86903454075 ( 1.91 %),        2482940830167 (54.54 %)
INIT_MODULE              ,           8107964312 ( 0.18 %),          19631247198 ( 0.43 %)
JULIA_INIT               ,             65782478 ( 0.00 %),            237956976 ( 0.01 %)
LLVM_OPT                 ,        1059662566154 (23.28 %),        1059662566154 (23.28 %)
LLVM_ORC                 ,         716882046556 (15.75 %),        1776544612710 (39.02 %)
LOAD_Pkgimg              ,            860054924 ( 0.02 %),           7402571632 ( 0.16 %)
LOAD_Processor           ,              1568929 ( 0.00 %),              1568929 ( 0.00 %)
LOAD_Require             ,            266486681 ( 0.01 %),          28020265123 ( 0.62 %)
LOAD_Sysimg              ,            140595865 ( 0.00 %),            140595865 ( 0.00 %)
LOWERING                 ,           4938725087 ( 0.11 %),           6081574799 ( 0.13 %)
MACRO_INVOCATION         ,             20986884 ( 0.00 %),            876916164 ( 0.02 %)
METHOD_LOOKUP_FAST       ,           3206808468 ( 0.07 %),           3986616164 ( 0.09 %)
METHOD_LOOKUP_SLOW       ,             91857246 ( 0.00 %),            875939345 ( 0.02 %)
METHOD_MATCH             ,          15893573564 ( 0.35 %),          16805438247 ( 0.37 %)
PARSING                  ,            691756980 ( 0.02 %),            691756980 ( 0.02 %)
ROOT                     ,           2482796762 ( 0.05 %),        4552689821858 (100.00 %)
STAGED_FUNCTION          ,              1190547 ( 0.00 %),            300850387 ( 0.01 %)
TYPE_CACHE_INSERT        ,            601778215 ( 0.01 %),            601778215 ( 0.01 %)
TYPE_CACHE_LOOKUP        ,           2759831421 ( 0.06 %),           2759831421 ( 0.06 %)
VERIFY_Edges             ,            328049638 ( 0.01 %),           2308738847 ( 0.05 %)
VERIFY_Graph             ,             20569865 ( 0.00 %),             20569865 ( 0.00 %)
VERIFY_Methods           ,              7921156 ( 0.00 %),              7921156 ( 0.00 %)

JULIA COUNTERS
Counter                  , Value               
Invalidations            ,                 2333
HeapSize                 ,            744737644
JITSize                  ,             33254347
JITCodeSize              ,             31432685
JITDataSize              ,              1821662
ImageSize                ,            395643680

And from another build where I reverted #41099 (and three other commits which I needed to revert to avoid conflicts, see at the end of this issue [1]):

JULIA TIMINGS
Event                    , Self Cycles (% of Total)      , Total Cycles (% of Total)     
ADD_METHOD               ,           3672342454 ( 0.17 %),           3875313788 ( 0.18 %)
AST_COMPRESS             ,          84232107470 ( 3.93 %),          84232107470 ( 3.93 %)
AST_UNCOMPRESS           ,          13816531605 ( 0.64 %),          14016252819 ( 0.65 %)
CODEGEN_Codeinst         ,             89608396 ( 0.00 %),          67777565810 ( 3.16 %)
CODEGEN_LLVM             ,          66328775206 ( 3.10 %),          67204589383 ( 3.14 %)
CODEGEN_Workqueue        ,            131226434 ( 0.01 %),           7151191775 ( 0.33 %)
CODEINST_COMPILE         ,           1128842086 ( 0.05 %),        1899161320883 (88.64 %)
DL_OPEN                  ,             82986960 ( 0.00 %),             82986960 ( 0.00 %)
GC                       ,             48344120 ( 0.00 %),          15536714911 ( 0.73 %)
GC_Finalizers            ,            239514645 ( 0.01 %),            239537348 ( 0.01 %)
GC_FullSweep             ,            110201791 ( 0.01 %),            110201791 ( 0.01 %)
GC_IncrementalSweep      ,           7615966878 ( 0.36 %),           7615966878 ( 0.36 %)
GC_Mark                  ,           7522571182 ( 0.35 %),           7522571182 ( 0.35 %)
GC_Stop                  ,                93592 ( 0.00 %),                93592 ( 0.00 %)
INFERENCE                ,          84077627794 ( 3.92 %),         215491286569 (10.06 %)
INIT_MODULE              ,           8636965446 ( 0.40 %),          20269670034 ( 0.95 %)
JULIA_INIT               ,             64656364 ( 0.00 %),            228045797 ( 0.01 %)
LLVM_OPT                 ,        1091348663673 (50.93 %),        1091348663673 (50.93 %)
LLVM_ORC                 ,         738791901545 (34.48 %),        1830140565218 (85.42 %)
LOAD_Pkgimg              ,            865851913 ( 0.04 %),           7686754456 ( 0.36 %)
LOAD_Processor           ,              1495812 ( 0.00 %),              1495812 ( 0.00 %)
LOAD_Require             ,            234657082 ( 0.01 %),          28759724671 ( 1.34 %)
LOAD_Sysimg              ,            131844838 ( 0.01 %),            131844838 ( 0.01 %)
LOWERING                 ,           5515049860 ( 0.26 %),           6406164444 ( 0.30 %)
MACRO_INVOCATION         ,             22419703 ( 0.00 %),            891114584 ( 0.04 %)
METHOD_LOOKUP_FAST       ,           3087070252 ( 0.14 %),           3905972971 ( 0.18 %)
METHOD_LOOKUP_SLOW       ,             88961303 ( 0.00 %),            872603509 ( 0.04 %)
METHOD_MATCH             ,          17020788685 ( 0.79 %),          17815537140 ( 0.83 %)
PARSING                  ,            757186245 ( 0.04 %),            757186245 ( 0.04 %)
ROOT                     ,           3265830601 ( 0.15 %),        2142638130876 (100.00 %)
STAGED_FUNCTION          ,              1308655 ( 0.00 %),            309384130 ( 0.01 %)
TYPE_CACHE_INSERT        ,            619835544 ( 0.03 %),            619835544 ( 0.03 %)
TYPE_CACHE_LOOKUP        ,           2722668821 ( 0.13 %),           2722668821 ( 0.13 %)
VERIFY_Edges             ,            331854785 ( 0.02 %),           2309098244 ( 0.11 %)
VERIFY_Graph             ,             24758419 ( 0.00 %),             24758419 ( 0.00 %)
VERIFY_Methods           ,              7620717 ( 0.00 %),              7620717 ( 0.00 %)

JULIA COUNTERS
Counter                  , Value               
Invalidations            ,                 2338
HeapSize                 ,            703725988
JITSize                  ,             33254191
JITCodeSize              ,             31432569
JITDataSize              ,              1821622
ImageSize                ,            375575800

The corresponding test timings for one test group which illustrates the problem quite well. This is for master:

  Polymake.SparseMatrix                                                         | 1757   1757  12m28.0s
    Constructors/Converts                                                       |  592    592     12.2s
    Low-level operations                                                        |  340    340     16.5s
      Polymake.SparseMatrix{Int64}                                              |   39     39      0.3s
      Polymake.SparseMatrix{Polymake.Integer}                                   |   39     39      0.2s
      Polymake.SparseMatrix{Polymake.Rational}                                  |   39     39      0.4s
      Polymake.SparseMatrix{Float64}                                            |   89     89      0.3s
      Polymake.SparseMatrix{Polymake.QuadraticExtension{Polymake.Rational}}     |   39     39      0.3s
      Polymake.SparseMatrix{Polymake.OscarNumber}                               |   39     39      0.2s
      Equality                                                                  |   56     56     14.8s
    Arithmetic                                                                  |  786    786  11m58.5s
      Arithmetic1                                                               |  310    310   3m04.8s
      Arithmetic2                                                               |  260    260   2m45.7s
      Arithmetic3                                                               |   39     39     25.3s
      Arithmetic4                                                               |  140    140   5m09.4s
        Arithmetic4 Int32                                                       |   10     10     17.2s
        Arithmetic4 Int64                                                       |   10     10     15.3s
        Arithmetic4 UInt64                                                      |   10     10     19.8s
        Arithmetic4 BigInt                                                      |   10     10     21.1s
        Arithmetic4 Polymake.Integer                                            |   10     10     17.0s
        Arithmetic4 Rational{Int32}                                             |   10     10     15.9s
        Arithmetic4 Rational{Int64}                                             |   10     10     16.1s
        Arithmetic4 Rational{UInt64}                                            |   10     10     16.7s
        Arithmetic4 Rational{BigInt}                                            |   10     10     23.9s
        Arithmetic4 Polymake.Rational                                           |   10     10     16.2s
        Arithmetic4 Float32                                                     |   10     10     24.2s
        Arithmetic4 Float64                                                     |   10     10     35.9s
        Arithmetic4 BigFloat                                                    |   10     10     35.7s
        Arithmetic4 Polymake.QuadraticExtension{Polymake.Rational}              |   10     10     34.3s
      Arithmetic5                                                               |   20     20     26.0s
    findnz                                                                      |    3      3      0.4s

And with the reverted commits:

  Polymake.SparseMatrix                                                         | 1757   1757   5m00.6s
    Constructors/Converts                                                       |  592    592      7.9s
    Low-level operations                                                        |  340    340     10.9s
      Polymake.SparseMatrix{Int64}                                              |   39     39      0.3s
      Polymake.SparseMatrix{Polymake.Integer}                                   |   39     39      0.3s
      Polymake.SparseMatrix{Polymake.Rational}                                  |   39     39      0.3s
      Polymake.SparseMatrix{Float64}                                            |   89     89      0.3s
      Polymake.SparseMatrix{Polymake.QuadraticExtension{Polymake.Rational}}     |   39     39      0.3s
      Polymake.SparseMatrix{Polymake.OscarNumber}                               |   39     39      0.2s
      Equality                                                                  |   56     56      9.3s
    Arithmetic                                                                  |  786    786   4m41.0s
      Arithmetic1                                                               |  310    310   1m09.0s
      Arithmetic2                                                               |  260    260   1m08.1s
      Arithmetic3                                                               |   39     39      9.4s
      Arithmetic4                                                               |  140    140   2m02.6s
        Arithmetic4 Int32                                                       |   10     10      8.1s
        Arithmetic4 Int64                                                       |   10     10      7.0s
        Arithmetic4 UInt64                                                      |   10     10      6.8s
        Arithmetic4 BigInt                                                      |   10     10      6.8s
        Arithmetic4 Polymake.Integer                                            |   10     10      9.1s
        Arithmetic4 Rational{Int32}                                             |   10     10     10.6s
        Arithmetic4 Rational{Int64}                                             |   10     10      9.2s
        Arithmetic4 Rational{UInt64}                                            |   10     10      6.6s
        Arithmetic4 Rational{BigInt}                                            |   10     10      6.2s
        Arithmetic4 Polymake.Rational                                           |   10     10      6.6s
        Arithmetic4 Float32                                                     |   10     10      8.5s
        Arithmetic4 Float64                                                     |   10     10     12.2s
        Arithmetic4 BigFloat                                                    |   10     10     11.8s
        Arithmetic4 Polymake.QuadraticExtension{Polymake.Rational}              |   10     10     13.1s
      Arithmetic5                                                               |   20     20      8.8s
    findnz                                                                      |    3      3      0.3s

Other packages

While looking at the PkgEval report mentioned earlier I noticed that MultivariatePolynomials.jl might have similar issues, at least for some of its tests:

MutableArithmetics with polynomials in Int64 | 6478      14   6492  1m09.8s
MutableArithmetics with polynomials in Int64 | 6478      14   6492  3m23.8s

before and after, from the second block with MutableArithmetics, the last one before the tests were killed.

additional notes:

  • [1] reverted commits:
    * 971c2253ee - (bl/withsomereverts) Benjamin Lorenz : Revert "Lookup metadata for inlined frames for stacktraces (#41099)" (19 hours ago)
    * d450ad31e3 - Benjamin Lorenz : Revert "limit printing depth of argument types in stack traces (#49795)" (19 hours ago)
    * 173d8d4413 - Benjamin Lorenz : Revert "refactor `Base.StackTraces.show_spec_linfo` (#49951)" (19 hours ago)
    * 57d11b3b5d - Benjamin Lorenz : Revert "make `IRShow.method_name` inferrable (#49607)" (19 hours ago)
    
    I also checked that just reverting the other three commits behaves similar to the current master without the reverts to make sure that Lookup metadata for inlined frames for stack traces #41099 is the key change.
  • I did most tests with julia nightly builds not on Polymake.jl master but on a custom branch but this should not affect these results, the original PkgEval report was referring to Polymake.jl release v0.9.0.

Metadata

Metadata

Assignees

No one assigned

    Labels

    regressionRegression in behavior compared to a previous version

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions