[MC] Compiler performance regression in Clang 19 with -mbranches-within-32B-boundaries #107754

Open
@vient

I'm building the same code with Clang 18 and Clang 19, and noticed that build times for some targets are disproportionately affected by switching to the new compiler: in general Clang 19 is 5-10% slower, but an LTO build of one particular target slowed down by 2.5x.

I tried --time-trace but don't know what to make of it, other than that OptModule got some long tails in Clang 19. The first worker under the main thread is building the same module in both images, so the two can be compared directly: OptModule time increased from 1m20s to 5m24s, roughly 4x.
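For anyone who wants to pull the aggregated "Total ..." entries out of a trace file instead of reading them off the viewer, here is a minimal sketch. It assumes the trace is in the Chrome trace event JSON layout (a top-level `traceEvents` list) and that each event's `dur` field is in microseconds; the helper names `total_events`, `load_totals` and `top_totals` are made up for illustration.

```python
import json


def total_events(trace):
    """Return {event name: seconds} for the aggregated "Total ..." events.

    Assumes `trace` is a dict with a "traceEvents" list and that "dur"
    is expressed in microseconds.
    """
    totals = {}
    for event in trace.get("traceEvents", []):
        name = event.get("name", "")
        if name.startswith("Total ") and "dur" in event:
            totals[name] = event["dur"] / 1e6  # microseconds -> seconds
    return totals


def load_totals(path):
    """Read a trace JSON file from `path` and return its totals."""
    with open(path) as f:
        return total_events(json.load(f))


def top_totals(totals, n=10):
    """Return the n largest totals as (name, seconds) pairs, largest first."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Calling `top_totals(load_totals(path))` on the trace from each compiler gives two lists that can be compared entry by entry.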
[image: --time-trace view, Clang 18 build]

913.621213 Total OptModule
856.716409 Total OptFunction
856.192565 Total RunPass
556.340514 Total PassManager<Function>
512.635569 Total ModuleInlinerWrapperPass
510.885891 Total ModuleToPostOrderCGSCCPassAdaptor
509.09462 Total DevirtSCCRepeatedPass
507.621024 Total PassManager<LazyCallGraph::SCC, CGSCCAnalysisManager, LazyCallGraph &, CGSCCUpdateResult &>
434.932548 Total CGSCCToFunctionPassAdaptor
142.495075 Total ExecuteLinker
142.421367 Total Link
141.506523 Total LTO
132.923099 Total InstCombinePass
124.003487 Total ModuleToFunctionPassAdaptor

[image: --time-trace view, Clang 19 build]

3237.53794 Total OptModule
845.04484 Total OptFunction
844.38391 Total RunPass
552.922664 Total PassManager<Function>
497.867448 Total ModuleInlinerWrapperPass
495.840083 Total ModuleToPostOrderCGSCCPassAdaptor
493.816647 Total DevirtSCCRepeatedPass
492.195245 Total PassManager<LazyCallGraph::SCC, CGSCCAnalysisManager, LazyCallGraph &, CGSCCUpdateResult &>
417.747014 Total CGSCCToFunctionPassAdaptor
385.505297 Total ExecuteLinker
385.437975 Total Link
384.301031 Total LTO
141.092082 Total InstCombinePass
137.907089 Total ModuleToFunctionPassAdaptor
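To make the difference between the two lists easier to see, here is a small sketch that divides each total in the second list by the matching total in the first. It assumes the first list is the Clang 18 build and the second is the Clang 19 build (inferred from the OptModule increase described above); the numbers are copied from the two lists and the `slowdown` helper is hypothetical.

```python
# Totals copied from the two lists above ("Total " prefix dropped).
clang18 = {
    "OptModule": 913.621213,
    "OptFunction": 856.716409,
    "RunPass": 856.192565,
    "PassManager<Function>": 556.340514,
    "ModuleInlinerWrapperPass": 512.635569,
    "CGSCCToFunctionPassAdaptor": 434.932548,
    "ExecuteLinker": 142.495075,
    "Link": 142.421367,
    "LTO": 141.506523,
    "InstCombinePass": 132.923099,
    "ModuleToFunctionPassAdaptor": 124.003487,
}
clang19 = {
    "OptModule": 3237.53794,
    "OptFunction": 845.04484,
    "RunPass": 844.38391,
    "PassManager<Function>": 552.922664,
    "ModuleInlinerWrapperPass": 497.867448,
    "CGSCCToFunctionPassAdaptor": 417.747014,
    "ExecuteLinker": 385.505297,
    "Link": 385.437975,
    "LTO": 384.301031,
    "InstCombinePass": 141.092082,
    "ModuleToFunctionPassAdaptor": 137.907089,
}


def slowdown(before, after):
    """Return {name: after / before} for names present in both dicts."""
    return {name: after[name] / before[name] for name in before if name in after}


ratios = slowdown(clang18, clang19)
# Print the largest regressions first.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{ratio:5.2f}x  {name}")
```

On these numbers OptModule comes out at roughly 3.5x and ExecuteLinker, Link and LTO at roughly 2.7x, while OptFunction, RunPass and the inliner-related entries are essentially flat, so the extra time does not appear to be in the function optimization passes.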

A perf trace and manually breaking in gdb both show that a lot of time is spent around

llvm::MCAssembler::layout() ()
llvm::MCObjectStreamer::finishImpl() ()
llvm::MCELFStreamer::finishImpl() ()
llvm::AsmPrinter::doFinalization(llvm::Module&) ()
llvm::FPPassManager::doFinalization(llvm::Module&) ()
llvm::legacy::PassManagerImpl::run(llvm::Module&) ()

and also llvm::MCExpr::evaluateAsRelocatableImpl. My current build is stripped, though; I'll come back with trace results with debug symbols later.
