[flang] Use precompiled headers in Frontend, Lower, Parser, Semantics and Evaluate #131137

Open: mrkajetanp wants to merge 4 commits into main

mrkajetanp
Contributor

@mrkajetanp mrkajetanp commented Mar 13, 2025

Precompiling larger headers can save a lot of compile time across various compilation units.

Selected compile time & memory improvements are as follows:

flang/lib/Parser/Fortran-parsers.cpp:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:47.31 -> 0:41.68
Maximum resident set size (kbytes): 2062140 -> 1745584

flang/lib/Lower/Bridge.cpp:
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:19.16 -> 0:45.86
Maximum resident set size (kbytes): 3849144 -> 2443476

flang/lib/Lower/PFTBuilder.cpp:
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:29.24 -> 1:00.99
Maximum resident set size (kbytes): 4218368 -> 2923128

flang/lib/Lower/Allocatable.cpp:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:53.03 -> 0:22.50
Maximum resident set size (kbytes): 3092840 -> 2116908

flang/lib/Semantics/Semantics.cpp:
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:18.75 -> 1:00.20
Maximum resident set size (kbytes): 3527744 -> 2545308

The time and memory needed to build the newly added precompiled headers themselves are as follows:

Parser:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:09.62
Maximum resident set size (kbytes): 1034608

Lower:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:41.33
Maximum resident set size (kbytes): 3615240

Semantics:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:26.69
Maximum resident set size (kbytes): 2403776

@llvmbot llvmbot added the flang, flang:fir-hlfir, flang:semantics and flang:parser labels Mar 13, 2025
@llvmbot
Member

llvmbot commented Mar 13, 2025

@llvm/pr-subscribers-flang-driver
@llvm/pr-subscribers-flang-semantics

@llvm/pr-subscribers-flang-parser

Author: Kajetan Puchalski (mrkajetanp)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/131137.diff

3 Files Affected:

  • (modified) flang/lib/Lower/CMakeLists.txt (+11)
  • (modified) flang/lib/Parser/CMakeLists.txt (+8)
  • (modified) flang/lib/Semantics/CMakeLists.txt (+9)
diff --git a/flang/lib/Lower/CMakeLists.txt b/flang/lib/Lower/CMakeLists.txt
index 0bd9a47cd040f..bc817ff8f1f3e 100644
--- a/flang/lib/Lower/CMakeLists.txt
+++ b/flang/lib/Lower/CMakeLists.txt
@@ -73,3 +73,14 @@ add_flang_library(FortranLower
   MLIRLLVMDialect
   MLIRSCFToControlFlow
 )
+
+target_precompile_headers(FortranLower PRIVATE
+  [["flang/Lower/ConvertExpr.h"]]
+  [["flang/Lower/SymbolMap.h"]]
+  [["flang/Lower/AbstractConverter.h"]]
+  [["flang/Lower/IterationSpace.h"]]
+  [["flang/Lower/CallInterface.h"]]
+  [["flang/Lower/BoxAnalyzer.h"]]
+  [["flang/Lower/PFTBuilder.h"]]
+  [["flang/Lower/DirectivesCommon.h"]]
+)
diff --git a/flang/lib/Parser/CMakeLists.txt b/flang/lib/Parser/CMakeLists.txt
index 76fe3d7ce6ba4..1855b8a841ba7 100644
--- a/flang/lib/Parser/CMakeLists.txt
+++ b/flang/lib/Parser/CMakeLists.txt
@@ -36,3 +36,11 @@ add_flang_library(FortranParser
   omp_gen
   acc_gen
 )
+
+target_precompile_headers(FortranParser PRIVATE
+  [["flang/Parser/parsing.h"]]
+  [["flang/Parser/parse-tree.h"]]
+  [["flang/Parser/provenance.h"]]
+  [["flang/Parser/message.h"]]
+  [["flang/Parser/parse-tree-visitor.h"]]
+)
diff --git a/flang/lib/Semantics/CMakeLists.txt b/flang/lib/Semantics/CMakeLists.txt
index 93bf0c7c5facd..bd8cc47365f06 100644
--- a/flang/lib/Semantics/CMakeLists.txt
+++ b/flang/lib/Semantics/CMakeLists.txt
@@ -64,3 +64,12 @@ add_flang_library(FortranSemantics
   FrontendOpenACC
   TargetParser
 )
+
+target_precompile_headers(FortranSemantics PRIVATE
+  [["flang/Semantics/semantics.h"]]
+  [["flang/Semantics/type.h"]]
+  [["flang/Semantics/openmp-modifiers.h"]]
+  [["flang/Semantics/expression.h"]]
+  [["flang/Semantics/tools.h"]]
+  [["flang/Semantics/symbol.h"]]
+)
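
For readers who have not used the command in the diff above, here is a minimal standalone sketch of how `target_precompile_headers` is typically set up. The project, target, and header names (`PchDemo`, `demo_lib`, `demo/big_header.h`) are hypothetical and not part of this PR.

```cmake
# Minimal sketch with hypothetical names; not taken from the flang build.
cmake_minimum_required(VERSION 3.16)  # target_precompile_headers was added in CMake 3.16
project(PchDemo CXX)

add_library(demo_lib STATIC src/a.cpp src/b.cpp)
target_include_directories(demo_lib PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/include)

# PRIVATE: the headers are precompiled once and force-included into demo_lib's
# own sources only; targets that link against demo_lib are not affected.
# The [["..."]] bracket argument keeps the double quotes, so the header is
# looked up through the include directories rather than relative to this
# CMakeLists.txt.
target_precompile_headers(demo_lib PRIVATE
  [["demo/big_header.h"]]
  <vector>
)
```

CMake generates a `cmake_pch.hxx` header for the target and compiles it once, which is where the `cmake_pch.hxx.pch` files mentioned later in this thread come from.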

@jeanPerier
Contributor

Lowering depends on Semantics and the Parser, and Semantics depends on the Parser. Have you tried using REUSE_FROM (https://cmake.org/cmake/help/latest/command/target_precompile_headers.html#reusing-precompile-headers) so that Lowering can reuse the precompiled headers too?

Just curious, I am not familiar with that cmake feature at all.

@tblah tblah requested review from mgorny and jeanPerier March 13, 2025 15:06
@mrkajetanp
Contributor Author

mrkajetanp commented Mar 13, 2025

> Lowering depends on Semantics and the Parser, and Semantics depends on Parser, have you tried using REUSE_FROM

We briefly discussed this in a subthread under #130600.
In short, REUSE_FROM is all-or-nothing. It is not possible to reuse another target's precompiled header and add extra headers on top of it at the same time: you either reuse an existing one as is or build a new one completely from scratch.
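
The limitation described above can be sketched roughly as follows. The targets and headers (`base_lib`, `derived_lib`, `common.h`, `extra.h`) are hypothetical, and the snippet is meant to sit inside an existing CMake project.

```cmake
# Illustrative sketch only; names are hypothetical.
add_library(base_lib STATIC base.cpp)
target_precompile_headers(base_lib PRIVATE [["common.h"]])

add_library(derived_lib STATIC derived.cpp)

# Reuse base_lib's precompiled header artifact as is. CMake documents that
# both targets need the same compiler options, flags and definitions.
target_precompile_headers(derived_lib REUSE_FROM base_lib)

# What is not available: stacking more headers on top of the reused PCH.
# A target either reuses another target's PCH or lists its own headers.
# target_precompile_headers(derived_lib PRIVATE [["extra.h"]])
```

Applied to this PR, that would mean Lower either takes the Semantics PCH unchanged or rebuilds everything itself, which is the trade-off discussed in the rest of this comment.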

Since the main goal is to reduce the memory usage per compilation thread, I didn't want to put too much into the precompiled headers either. If I added the same Semantics headers on top of the Lower ones, the precompiled header for Lower/ would probably become the biggest TU in all of flang.

@klausler
Contributor

Why not Evaluate? It might benefit the most of all of these directories.

@mrkajetanp
Contributor Author

> Why not Evaluate? It might benefit the most of all of these directories.

Thanks for the suggestion! No reason not to, indeed. I'll add it.

@jeanPerier
Contributor

What is the impact on the size of the build directories? (I am asking because this blog post points out that the build directory size can increase.)

@mrkajetanp
Contributor Author

> What is the impact on the size of the build directories?

From du -h:
440M lib/Lower/CMakeFiles/FortranLower.dir/cmake_pch.hxx.pch
122M lib/Parser/CMakeFiles/FortranParser.dir/cmake_pch.hxx.pch
278M lib/Semantics/CMakeFiles/FortranSemantics.dir/cmake_pch.hxx.pch
164M lib/Evaluate/CMakeFiles/FortranEvaluate.dir/cmake_pch.hxx.pch

But then at the same time I think some space might be saved in the specific compilation units that now no longer need to include them, so it might not be the biggest deal.

@Meinersbur
Member

Meinersbur commented Mar 14, 2025

Consider that this does not work together with ccache: #130600 (comment)
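
The linked comment is not reproduced on this page. As general background, the ccache manual lists extra requirements for caching builds that use precompiled headers: the sloppiness setting has to include `pch_defines,time_macros`, and Clang needs `-fno-pch-timestamp`. A rough sketch of wiring that up in CMake is below; it is an assumption about one possible setup, not necessarily what the LLVM fixes referenced in this thread do.

```cmake
# Sketch under assumptions; place after project() so the compiler ID is known
# and before the targets that should be built through ccache.
find_program(CCACHE_PROGRAM ccache)
if(CCACHE_PROGRAM)
  # Launch the compiler through ccache with the sloppiness values the ccache
  # manual asks for when precompiled headers are in use.
  set(CMAKE_CXX_COMPILER_LAUNCHER
      ${CMAKE_COMMAND} -E env CCACHE_SLOPPINESS=pch_defines,time_macros
      ${CCACHE_PROGRAM})

  if(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
    # Keep Clang from recording header timestamps in the PCH file.
    add_compile_options("SHELL:-Xclang -fno-pch-timestamp")
  endif()
endif()
```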

@mrkajetanp mrkajetanp changed the title from "[flang] Use precompiled headers in Lower, Parser and Semantics" to "[flang] Use precompiled headers in Lower, Parser, Semantics and Evaluate" Mar 18, 2025
Member

@mgorny mgorny left a comment

Thanks for doing this! I can confirm that, combined with #131397, I can build Flang with -j6, and it would basically work with -j12 if not for a few libraries.

Member

@DavidTruby DavidTruby left a comment

LGTM thanks!

mrkajetanp added a commit that referenced this pull request Apr 8, 2025
Reverts #130600

Reverting on account of Windows issues with ccache, will bring it back
along with #131137 once those are resolved.
@mrkajetanp mrkajetanp force-pushed the precompiled-headers-2 branch from ae83e1f to 5c706ec April 8, 2025 14:40
@mrkajetanp
Contributor Author

I reverted #130600 as it was causing some buildbot issues in the absence of the ccache fixes.
I've now added the changes in that PR here and rebased this so all the PCH can be merged together once the ccache issues are resolved.

@mrkajetanp mrkajetanp changed the title from "[flang] Use precompiled headers in Lower, Parser, Semantics and Evaluate" to "[flang] Use precompiled headers in Frontend, Lower, Parser, Semantics and Evaluate" Apr 8, 2025
flang/lib/Evaluate/check-expression.cpp:
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:11.91 -> 1:02.29
Maximum resident set size (kbytes): 2710788 -> 2414740

Similar improvements for other compilation units under Evaluate.

cmake_pch.hxx.cxx compilation time:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.93
Maximum resident set size (kbytes): 1492744

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>

Reland of llvm#130600, which was reverted while waiting for the required
ccache compatibility fixes.

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
@mrkajetanp mrkajetanp force-pushed the precompiled-headers-2 branch from 5c706ec to 6edd33b April 24, 2025 20:00
@mrkajetanp
Contributor Author

With #136856 merged there is now proper ccache support for this on Linux. I added a line here to disable pch on Windows for the time being, meaning this should now be ready to go.
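
The exact line added to the PR is not shown on this page. One way such a platform switch could look, as a sketch only, uses CMake's `CMAKE_DISABLE_PRECOMPILE_HEADERS` variable:

```cmake
# Sketch under assumptions; not the actual line from this PR.
# CMAKE_DISABLE_PRECOMPILE_HEADERS initializes the DISABLE_PRECOMPILE_HEADERS
# property of every target created after this point, so any later
# target_precompile_headers() calls have no effect on Windows builds.
if(WIN32)
  set(CMAKE_DISABLE_PRECOMPILE_HEADERS ON)
endif()
```

An alternative with the same effect would be to wrap each `target_precompile_headers` call in `if(NOT WIN32) ... endif()`. The variable approach keeps the per-library CMakeLists.txt files unchanged.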
