Make compilation faster 🚀 #4165

@kripken

I investigated the speed of compilation. This is becoming important because of inputs like j2cl output, where a real-world file is 50MB.

Slowest passes (in -O3; note that some passes run more than once):

[PassRunner]   running pass: precompute-propagate...           14.23460 seconds.
[PassRunner]   running pass: dae...                            10.91160 seconds.
[PassRunner]   running pass: heap2local...                     10.44350 seconds.
[PassRunner]   running pass: precompute-propagate...            9.07261 seconds.
[PassRunner]   running pass: inlining...                        8.02837 seconds.
[PassRunner]   running pass: once-reduction...                  8.02484 seconds.
[PassRunner]   running pass: vacuum...                          5.16162 seconds.
[PassRunner]   running pass: vacuum...                          3.64088 seconds.
[PassRunner]   running pass: vacuum...                          3.58191 seconds.
[PassRunner]   running pass: vacuum...                          3.61897 seconds.
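
Since some passes run more than once, it can help to total the time per pass name. The following is a minimal sketch, assuming the log lines have exactly the `[PassRunner]   running pass: NAME...   SECONDS seconds.` shape shown above; `parsePassLine` and `totalPerPass` are hypothetical helper names, not part of any existing tool.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Parses one "[PassRunner]   running pass: NAME...   SECONDS seconds." line.
// Returns false if the line does not have that shape (assumed format).
bool parsePassLine(const std::string& line, std::string& name, double& seconds) {
  const std::string marker = "running pass: ";
  auto start = line.find(marker);
  if (start == std::string::npos) {
    return false;
  }
  start += marker.size();
  // The pass name ends at the first "..." after the marker.
  auto dots = line.find("...", start);
  if (dots == std::string::npos) {
    return false;
  }
  name = line.substr(start, dots - start);
  // Stream extraction skips the padding spaces before the number.
  std::istringstream rest(line.substr(dots + 3));
  return static_cast<bool>(rest >> seconds);
}

// Sums the reported seconds for each pass name across all of its runs.
std::map<std::string, double> totalPerPass(std::istream& in) {
  std::map<std::string, double> totals;
  std::string line;
  std::string name;
  double seconds = 0;
  while (std::getline(in, line)) {
    if (parsePassLine(line, name, seconds)) {
      totals[name] += seconds;
    }
  }
  return totals;
}
```

Calling `totalPerPass(std::cin)` from a small `main` and piping the log into it would give one total per pass, which makes repeated passes such as vacuum easier to compare against single-run passes.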

perf reports that almost 15% of the time is spent in malloc/free, which is quite a lot. I counted how many mallocs each pass performs (the number of calls, not their size). Here are the heaviest (note: the timings below include the overhead of counting mallocs atomically across threads, so they matter less than the counts):

    once-reduction...                 9.51466 seconds.  141558196 mallocs. 
    precompute-propagate...           14.7426 seconds.   61611533 mallocs. 
    dae...                            11.3789 seconds.   53792948 mallocs. 
    precompute-propagate...           9.69919 seconds.   39531493 mallocs. 
    heap2local...                     10.5116 seconds.   38582302 mallocs. 
    local-cse...                      4.70765 seconds.   33454468 mallocs. 
    cfp...                            2.43619 seconds.   32292682 mallocs. 
    ssa-nomerge...                    4.92099 seconds.   26378117 mallocs. 
    simplify-locals-nostructure...    6.03281 seconds.   22663794 mallocs. 
    vacuum...                         5.10813 seconds.   23880789 mallocs. 
    inlining...                       107.781 seconds.   20940365 mallocs. 
    dce...                            3.84722 seconds.   20025273 mallocs. 
    local-subtyping...                3.44374 seconds.   19067529 mallocs. 

and vacuum runs 3 more times:

    vacuum...                         3.71648 seconds.   16648663 mallocs. 
    vacuum...                         3.86917 seconds.   16302767 mallocs. 
    vacuum...                         3.89855 seconds.   16162915 mallocs. 
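
The exact way the mallocs were counted is not shown here. As an illustration only, here is a minimal sketch of one way to tally allocations atomically across threads, assuming it is acceptable to replace the global `operator new` (this counts C++ allocations rather than raw `malloc` calls, so it is a stand-in technique). `allocationCount` and `countAllocations` are hypothetical names.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical global tally. Relaxed ordering is enough because only the
// final count matters, not its ordering relative to other memory operations.
static std::atomic<std::size_t> allocationCount{0};

// Replacement for the global allocation function: count, then defer to malloc.
void* operator new(std::size_t size) {
  allocationCount.fetch_add(1, std::memory_order_relaxed);
  if (void* p = std::malloc(size ? size : 1)) {
    return p;
  }
  throw std::bad_alloc();
}

// Matching deallocation functions, since the memory came from malloc.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many allocations happened (on any thread) while `work` ran.
template <typename F>
std::size_t countAllocations(F&& work) {
  std::size_t before = allocationCount.load(std::memory_order_relaxed);
  work();
  return allocationCount.load(std::memory_order_relaxed) - before;
}
```

Wrapping a single pass in `countAllocations` would give a per-pass number comparable in spirit to the ones above. The shared atomic counter is contended by every allocating thread, which is consistent with the note that timings measured this way are inflated.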

Help in speeding things up would be very welcome!
