[DTLTO][LLD][ELF] Add support for Integrated Distributed ThinLTO #142757

Status: Open, 1 commit to merge into base branch main

bd1976bris
Collaborator

This PR introduces support for Integrated Distributed ThinLTO (DTLTO) in ELF LLD.

DTLTO enables the distribution of ThinLTO backend compilations via external distribution systems, such as Incredibuild, during the traditional link step: https://llvm.org/docs/DTLTO.html.

It is expected that users will invoke DTLTO through the compiler driver (e.g., Clang) rather than calling LLD directly. A Clang-side interface for DTLTO will be added in a follow-up patch.

Note: Bitcode members of non-thin archives are not currently supported. This will be addressed in a future change.
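
Whether an archive is thin can be told from its first eight bytes. The following is a minimal Python sketch of that check, given only for illustration. The patch itself does this in C++ in lld/ELF/InputFiles.cpp by comparing the file header against `ThinArchiveMagic`. The magic strings are those of the ar format as used by llvm-ar; the function names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical, not part of the patch.
THIN_ARCHIVE_MAGIC = b"!<thin>\n"     # header of a thin archive
REGULAR_ARCHIVE_MAGIC = b"!<arch>\n"  # header of a regular (non-thin) archive


def classify_archive_header(header: bytes) -> str:
    """Return 'thin', 'regular', or 'unknown' for the leading bytes of a file."""
    if header.startswith(THIN_ARCHIVE_MAGIC):
        return "thin"
    if header.startswith(REGULAR_ARCHIVE_MAGIC):
        return "regular"
    return "unknown"


def classify_archive(path: str) -> str:
    """Read only a magic-sized prefix of `path` and classify it."""
    with open(path, "rb") as f:
        return classify_archive_header(f.read(len(THIN_ARCHIVE_MAGIC)))
```

With this patch, only bitcode members of archives that classify as thin are usable with DTLTO.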

Testing:

  • ELF LLD lit test coverage has been added, using a mock distributor to avoid requiring Clang.
  • Cross-project lit tests cover integration with Clang.

For the design discussion of the DTLTO feature, see: #126654.
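
For readers who want to try the feature by calling LLD directly, the four options this patch adds (documented in lld/docs/DTLTO.rst in the diff) can be combined roughly as in the Python sketch below. The helper name and all paths in the usage note are hypothetical; only the option spellings come from the patch.

```python
from typing import List, Sequence


def dtlto_lld_args(
    inputs: Sequence[str],
    distributor: str,
    distributor_args: Sequence[str] = (),
    remote_compiler: str = "",
    remote_compiler_args: Sequence[str] = (),
) -> List[str]:
    """Build an ld.lld argument list that enables DTLTO (hypothetical helper)."""
    # Specifying a distributor is what switches ThinLTO backends to DTLTO.
    args = ["ld.lld", *inputs, f"--thinlto-distributor={distributor}"]
    # Each distributor argument is passed with its own option.
    args += [f"--thinlto-distributor-arg={a}" for a in distributor_args]
    if remote_compiler:
        args.append(f"--thinlto-remote-compiler={remote_compiler}")
    # Extra arguments appended to the remote compiler's command line.
    args += [f"--thinlto-remote-compiler-arg={a}" for a in remote_compiler_args]
    return args
```

For example, `dtlto_lld_args(["dtlto.bc"], "python3", ["local.py"], "clang", ["--save-temps"])` mirrors the shape of the invocation in cross-project-tests/dtlto/dtlto.c, with placeholder paths standing in for the lit substitutions.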

@llvmbot
Member

llvmbot commented Jun 4, 2025

@llvm/pr-subscribers-lld

Author: bd1976bris (bd1976bris)

Patch is 26.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142757.diff

20 Files Affected:

  • (modified) cross-project-tests/CMakeLists.txt (+10-2)
  • (added) cross-project-tests/dtlto/README.md (+3)
  • (added) cross-project-tests/dtlto/archive-thin.test (+68)
  • (added) cross-project-tests/dtlto/dtlto.c (+35)
  • (added) cross-project-tests/dtlto/lit.local.cfg (+2)
  • (modified) cross-project-tests/lit.cfg.py (+3-1)
  • (modified) lld/ELF/Config.h (+4)
  • (modified) lld/ELF/Driver.cpp (+5)
  • (modified) lld/ELF/InputFiles.cpp (+40-4)
  • (modified) lld/ELF/LTO.cpp (+8)
  • (modified) lld/ELF/Options.td (+6-1)
  • (added) lld/docs/DTLTO.rst (+40)
  • (modified) lld/docs/index.rst (+1)
  • (added) lld/test/ELF/dtlto/archive-thin.text (+84)
  • (added) lld/test/ELF/dtlto/imports.test (+53)
  • (added) lld/test/ELF/dtlto/index.test (+44)
  • (added) lld/test/ELF/dtlto/options.test (+37)
  • (added) lld/test/ELF/dtlto/partitions.test (+61)
  • (added) lld/test/ELF/dtlto/save-temps.test (+39)
  • (modified) lld/test/lit.cfg.py (+1)
diff --git a/cross-project-tests/CMakeLists.txt b/cross-project-tests/CMakeLists.txt
index 7f2fee48fda77..192db87043177 100644
--- a/cross-project-tests/CMakeLists.txt
+++ b/cross-project-tests/CMakeLists.txt
@@ -19,11 +19,12 @@ set(CROSS_PROJECT_TEST_DEPS
   FileCheck
   check-gdb-llvm-support
   count
-  llvm-dwarfdump
+  llvm-ar
   llvm-config
+  llvm-dwarfdump
   llvm-objdump
-  split-file
   not
+  split-file
   )
 
 if ("clang" IN_LIST LLVM_ENABLE_PROJECTS)
@@ -94,6 +95,13 @@ add_lit_testsuite(check-cross-amdgpu "Running AMDGPU cross-project tests"
   DEPENDS clang
   )
 
+# DTLTO tests.
+add_lit_testsuite(check-cross-dtlto "Running DTLTO cross-project tests"
+  ${CMAKE_CURRENT_BINARY_DIR}/dtlto
+  EXCLUDE_FROM_CHECK_ALL
+  DEPENDS ${CROSS_PROJECT_TEST_DEPS}
+  )
+
 # Add check-cross-project-* targets.
 add_lit_testsuites(CROSS_PROJECT ${CMAKE_CURRENT_SOURCE_DIR}
   DEPENDS ${CROSS_PROJECT_TEST_DEPS}
diff --git a/cross-project-tests/dtlto/README.md b/cross-project-tests/dtlto/README.md
new file mode 100644
index 0000000000000..12f9aa19b0d9b
--- /dev/null
+++ b/cross-project-tests/dtlto/README.md
@@ -0,0 +1,3 @@
+Tests for DTLTO (Integrated Distributed ThinLTO) functionality.
+
+These are integration tests as DTLTO invokes `clang` for code-generation.
\ No newline at end of file
diff --git a/cross-project-tests/dtlto/archive-thin.test b/cross-project-tests/dtlto/archive-thin.test
new file mode 100644
index 0000000000000..00d64de1576db
--- /dev/null
+++ b/cross-project-tests/dtlto/archive-thin.test
@@ -0,0 +1,68 @@
+# REQUIRES: x86-registered-target,ld.lld,llvm-ar
+
+# Test that a DTLTO link succeeds and outputs the expected set of files
+# correctly when thin archives are present.
+
+RUN: rm -rf %t.dir && split-file %s %t.dir && cd %t.dir
+RUN: %clang --target=x86_64-linux-gnu -c foo.c -o foo.o
+RUN: %clang --target=x86_64-linux-gnu -c -flto=thin bar.c -o bar.o
+RUN: %clang --target=x86_64-linux-gnu -c -flto=thin dog.c -o dog.o
+RUN: %clang --target=x86_64-linux-gnu -c -flto=thin cat.c -o cat.o
+RUN: %clang --target=x86_64-linux-gnu -c -flto=thin _start.c -o _start.o
+
+RUN: llvm-ar rcs foo.a foo.o --thin
+# Create this bitcode thin archive in a sub-directory to test the expansion of
+# the path to a bitcode file which is referenced using "..", e.g. in this case
+# "../bar.o". The ".." should be collapsed in any expansion to avoid
+# referencing an unknown directory on the remote side.
+RUN: mkdir lib
+RUN: llvm-ar rcs lib/bar.a bar.o --thin
+# Create this bitcode thin archive with an absolute path entry containing "..".
+RUN: llvm-ar rcs dog.a %t.dir/lib/../dog.o --thin
+RUN: llvm-ar rcs cat.a cat.o --thin
+RUN: llvm-ar rcs _start.a _start.o --thin
+
+RUN: mkdir %t.dir/out && cd %t.dir/out
+
+RUN: ld.lld %t.dir/foo.a %t.dir/lib/bar.a ../_start.a %t.dir/cat.a \
+RUN:   --whole-archive ../dog.a \
+RUN:   --thinlto-distributor=%python \
+RUN:   --thinlto-distributor-arg=%llvm_src_root/utils/dtlto/local.py \
+RUN:   --thinlto-remote-compiler=%clang \
+RUN:   --save-temps
+
+# Check that the required output files have been created.
+RUN: ls | FileCheck %s --check-prefix=OUTPUTS \
+RUN:     --implicit-check-not=cat --implicit-check-not=foo
+
+# The DTLTO backend emits the JSON jobs description and summary shards.
+OUTPUTS-DAG: a.{{[0-9]+}}.dist-file.json
+OUTPUTS-DAG: bar.{{[0-9]+}}.{{[0-9]+}}.native.o.thinlto.bc{{$}}
+OUTPUTS-DAG: dog.{{[0-9]+}}.{{[0-9]+}}.native.o.thinlto.bc{{$}}
+OUTPUTS-DAG: _start.{{[0-9]+}}.{{[0-9]+}}.native.o.thinlto.bc{{$}}
+# Native output object files.
+OUTPUTS-DAG: bar.{{[0-9]+}}.{{[0-9]+}}.native.o{{$}}
+OUTPUTS-DAG: dog.{{[0-9]+}}.{{[0-9]+}}.native.o{{$}}
+OUTPUTS-DAG: _start.{{[0-9]+}}.{{[0-9]+}}.native.o{{$}}
+
+# Check that bar.o and dog.o are not referenced using "..".
+RUN: not grep '\.\.\(/\|\\\\\)\(bar\|dog\)\.o' a.*.dist-file.json
+
+#--- foo.c
+__attribute__((retain)) void foo() {}
+
+#--- bar.c
+extern void foo();
+__attribute__((retain)) void bar() { foo(); }
+
+#--- dog.c
+__attribute__((retain)) void dog() {}
+
+#--- cat.c
+__attribute__((retain)) void cat() {}
+
+#--- _start.c
+extern void bar();
+__attribute__((retain)) void _start() {
+  bar();
+}
diff --git a/cross-project-tests/dtlto/dtlto.c b/cross-project-tests/dtlto/dtlto.c
new file mode 100644
index 0000000000000..191dedd801430
--- /dev/null
+++ b/cross-project-tests/dtlto/dtlto.c
@@ -0,0 +1,35 @@
+// REQUIRES: x86-registered-target,ld.lld
+
+/// Simple test that DTLTO works with a single input bitcode file and that
+/// --save-temps can be applied to the remote compilation.
+// RUN: rm -rf %t && mkdir %t && cd %t
+
+// RUN: %clang --target=x86_64-linux-gnu -c %s -o dtlto.bc -flto=thin
+
+// RUN: ld.lld %t/dtlto.bc \
+// RUN:   --thinlto-distributor=%python \
+// RUN:   --thinlto-distributor-arg=%llvm_src_root/utils/dtlto/local.py \
+// RUN:   --thinlto-remote-compiler=%clang \
+// RUN:   --thinlto-remote-compiler-arg=--save-temps
+
+/// Check that the required output files have been created.
+// RUN: ls | count 10
+// RUN: ls | FileCheck %s
+
+/// Produced by the bitcode compilation.
+// CHECK-DAG: {{^}}dtlto.bc{{$}}
+
+/// Linked ELF.
+// CHECK-DAG: {{^}}a.out{{$}}
+
+/// --save-temps output for the backend compilation.
+// CHECK-DAG: {{^}}dtlto.s{{$}}
+// CHECK-DAG: {{^}}dtlto.s.0.preopt.bc{{$}}
+// CHECK-DAG: {{^}}dtlto.s.1.promote.bc{{$}}
+// CHECK-DAG: {{^}}dtlto.s.2.internalize.bc{{$}}
+// CHECK-DAG: {{^}}dtlto.s.3.import.bc{{$}}
+// CHECK-DAG: {{^}}dtlto.s.4.opt.bc{{$}}
+// CHECK-DAG: {{^}}dtlto.s.5.precodegen.bc{{$}}
+// CHECK-DAG: {{^}}dtlto.s.resolution.txt{{$}}
+
+int _start() { return 0; }
diff --git a/cross-project-tests/dtlto/lit.local.cfg b/cross-project-tests/dtlto/lit.local.cfg
new file mode 100644
index 0000000000000..1b39734ad184a
--- /dev/null
+++ b/cross-project-tests/dtlto/lit.local.cfg
@@ -0,0 +1,2 @@
+if any(feature not in config.available_features for feature in ["clang"]):
+    config.unsupported = True
diff --git a/cross-project-tests/lit.cfg.py b/cross-project-tests/lit.cfg.py
index b35c643ac898c..ac27753472646 100644
--- a/cross-project-tests/lit.cfg.py
+++ b/cross-project-tests/lit.cfg.py
@@ -19,7 +19,7 @@
 config.test_format = lit.formats.ShTest(not llvm_config.use_lit_shell)
 
 # suffixes: A list of file extensions to treat as test files.
-config.suffixes = [".c", ".cl", ".cpp", ".m"]
+config.suffixes = [".c", ".cl", ".cpp", ".m", ".test"]
 
 # excludes: A list of directories to exclude from the testsuite. The 'Inputs'
 # subdirectories contain auxiliary inputs for various tests in their parent
@@ -107,6 +107,8 @@ def get_required_attr(config, attr_name):
 if lldb_path is not None:
     config.available_features.add("lldb")
 
+if llvm_config.use_llvm_tool("llvm-ar"):
+    config.available_features.add("llvm-ar")
 
 def configure_dexter_substitutions():
     """Configure substitutions for host platform and return list of dependencies"""
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index f0e9592d85dd6..5e0fb9c9b00ab 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -249,6 +249,10 @@ struct Config {
   llvm::SmallVector<llvm::StringRef, 0> searchPaths;
   llvm::SmallVector<llvm::StringRef, 0> symbolOrderingFile;
   llvm::SmallVector<llvm::StringRef, 0> thinLTOModulesToCompile;
+  llvm::StringRef dtltoDistributor;
+  llvm::SmallVector<llvm::StringRef, 0> dtltoDistributorArgs;
+  llvm::StringRef dtltoCompiler;
+  llvm::SmallVector<llvm::StringRef, 0> dtltoCompilerArgs;
   llvm::SmallVector<llvm::StringRef, 0> undefined;
   llvm::SmallVector<SymbolVersion, 0> dynamicList;
   llvm::SmallVector<uint8_t, 0> buildIdVector;
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 6150fe072156f..e89eaef27d6bd 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1341,6 +1341,11 @@ static void readConfigs(Ctx &ctx, opt::InputArgList &args) {
       args.hasFlag(OPT_dependent_libraries, OPT_no_dependent_libraries, true);
   ctx.arg.disableVerify = args.hasArg(OPT_disable_verify);
   ctx.arg.discard = getDiscard(args);
+  ctx.arg.dtltoDistributor = args.getLastArgValue(OPT_thinlto_distributor_eq);
+  ctx.arg.dtltoDistributorArgs =
+      args::getStrings(args, OPT_thinlto_distributor_arg);
+  ctx.arg.dtltoCompiler = args.getLastArgValue(OPT_thinlto_compiler_eq);
+  ctx.arg.dtltoCompilerArgs = args::getStrings(args, OPT_thinlto_compiler_arg);
   ctx.arg.dwoDir = args.getLastArgValue(OPT_plugin_opt_dwo_dir_eq);
   ctx.arg.dynamicLinker = getDynamicLinker(ctx, args);
   ctx.arg.ehFrameHdr =
diff --git a/lld/ELF/InputFiles.cpp b/lld/ELF/InputFiles.cpp
index 12a77736aba7f..124e2032be69e 100644
--- a/lld/ELF/InputFiles.cpp
+++ b/lld/ELF/InputFiles.cpp
@@ -20,6 +20,7 @@
 #include "llvm/ADT/CachedHashString.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/LTO/LTO.h"
+#include "llvm/Object/Archive.h"
 #include "llvm/Object/IRObjectFile.h"
 #include "llvm/Support/ARMAttributeParser.h"
 #include "llvm/Support/ARMBuildAttributes.h"
@@ -1739,6 +1740,38 @@ static uint8_t getOsAbi(const Triple &t) {
   }
 }
 
+// For DTLTO, bitcode member names must be a valid path to a bitcode file on
+// disk. For thin archives, adjust `memberPath` to the full file path of the
+// archive member. Returns true if an adjustment was made; false otherwise.
+// Non-thin archives are not yet supported.
+static bool dtltoAdjustMemberPathIfThinArchive(Ctx &ctx, StringRef archivePath,
+                                               std::string &memberPath) {
+  assert(!archivePath.empty());
+  assert(!ctx.arg.dtltoDistributor.empty());
+
+  // Check if the archive file is a thin archive by reading its header.
+  auto memBufferOrError =
+      MemoryBuffer::getFileSlice(archivePath, sizeof(ThinArchiveMagic) - 1, 0);
+  if (std::error_code ec = memBufferOrError.getError()) {
+    ErrAlways(ctx) << "cannot open " << archivePath << ": " << ec.message();
+    return false;
+  }
+  MemoryBufferRef memBufRef = *memBufferOrError.get();
+  if (!memBufRef.getBuffer().starts_with(ThinArchiveMagic))
+    return false;
+
+  SmallString<64> archiveMemberPath;
+  if (path::is_relative(memberPath)) {
+    archiveMemberPath = path::parent_path(archivePath);
+    path::append(archiveMemberPath, memberPath);
+  } else
+    archiveMemberPath = memberPath;
+
+  path::remove_dots(archiveMemberPath, /*remove_dot_dot=*/true);
+  memberPath = archiveMemberPath.str();
+  return true;
+}
+
 BitcodeFile::BitcodeFile(Ctx &ctx, MemoryBufferRef mb, StringRef archiveName,
                          uint64_t offsetInArchive, bool lazy)
     : InputFile(ctx, BitcodeKind, mb) {
@@ -1756,10 +1789,13 @@ BitcodeFile::BitcodeFile(Ctx &ctx, MemoryBufferRef mb, StringRef archiveName,
   // symbols later in the link stage). So we append file offset to make
   // filename unique.
   StringSaver &ss = ctx.saver;
-  StringRef name = archiveName.empty()
-                       ? ss.save(path)
-                       : ss.save(archiveName + "(" + path::filename(path) +
-                                 " at " + utostr(offsetInArchive) + ")");
+  StringRef name =
+      (archiveName.empty() ||
+       (!ctx.arg.dtltoDistributor.empty() &&
+        dtltoAdjustMemberPathIfThinArchive(ctx, archiveName, path)))
+          ? ss.save(path)
+          : ss.save(archiveName + "(" + path::filename(path) + " at " +
+                    utostr(offsetInArchive) + ")");
   MemoryBufferRef mbref(mb.getBuffer(), name);
 
   obj = CHECK2(lto::InputFile::create(mbref), this);
diff --git a/lld/ELF/LTO.cpp b/lld/ELF/LTO.cpp
index 82a7463446a94..eff3c44bc84e8 100644
--- a/lld/ELF/LTO.cpp
+++ b/lld/ELF/LTO.cpp
@@ -180,6 +180,14 @@ BitcodeCompiler::BitcodeCompiler(Ctx &ctx) : ctx(ctx) {
         std::string(ctx.arg.thinLTOPrefixReplaceNew),
         std::string(ctx.arg.thinLTOPrefixReplaceNativeObject),
         ctx.arg.thinLTOEmitImportsFiles, indexFile.get(), onIndexWrite);
+  } else if (!ctx.arg.dtltoDistributor.empty() && !ctx.bitcodeFiles.empty()) {
+    backend = lto::createOutOfProcessThinBackend(
+        llvm::heavyweight_hardware_concurrency(ctx.arg.thinLTOJobs),
+        onIndexWrite, ctx.arg.thinLTOEmitIndexFiles,
+        ctx.arg.thinLTOEmitImportsFiles, ctx.arg.outputFile,
+        ctx.arg.dtltoDistributor, ctx.arg.dtltoDistributorArgs,
+        ctx.arg.dtltoCompiler, ctx.arg.dtltoCompilerArgs,
+        !ctx.arg.saveTempsArgs.empty());
   } else {
     backend = lto::createInProcessThinBackend(
         llvm::heavyweight_hardware_concurrency(ctx.arg.thinLTOJobs),
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index c795147eb9662..f4fc24b9ca3ab 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -710,7 +710,12 @@ def thinlto_object_suffix_replace_eq: JJ<"thinlto-object-suffix-replace=">;
 def thinlto_prefix_replace_eq: JJ<"thinlto-prefix-replace=">;
 def thinlto_single_module_eq: JJ<"thinlto-single-module=">,
   HelpText<"Specify a single module to compile in ThinLTO mode, for debugging only">;
-
+def thinlto_distributor_eq: JJ<"thinlto-distributor=">,
+  HelpText<"Distributor to use for ThinLTO backend compilations. If specified, ThinLTO backend compilations will be distributed.">;
+defm thinlto_distributor_arg: EEq<"thinlto-distributor-arg", "Arguments to pass to the ThinLTO distributor">;
+def thinlto_compiler_eq: JJ<"thinlto-remote-compiler=">,
+  HelpText<"Compiler for the ThinLTO distributor to invoke for ThinLTO backend compilations">;
+defm thinlto_compiler_arg: EEq<"thinlto-remote-compiler-arg", "Compiler arguments for the ThinLTO distributor to pass for ThinLTO backend compilations">;
 defm fat_lto_objects: BB<"fat-lto-objects",
     "Use the .llvm.lto section, which contains LLVM bitcode, in fat LTO object files to perform LTO.",
     "Ignore the .llvm.lto section in relocatable object files (default).">;
diff --git a/lld/docs/DTLTO.rst b/lld/docs/DTLTO.rst
new file mode 100644
index 0000000000000..e9a090042d011
--- /dev/null
+++ b/lld/docs/DTLTO.rst
@@ -0,0 +1,40 @@
+Integrated Distributed ThinLTO (DTLTO)
+======================================
+
+Integrated Distributed ThinLTO (DTLTO) enables the distribution of backend
+ThinLTO compilations via external distribution systems, such as Incredibuild,
+during the traditional link step.
+
+The implementation is documented here: https://llvm.org/docs/DTLTO.html.
+
+Currently, DTLTO is only supported in ELF LLD. Support will be added to other
+LLD flavours in the future.
+
+ELF LLD
+-------
+
+The command-line interface is as follows:
+
+- ``--thinlto-distributor=<path>``  
+  Specifies the file to execute as the distributor process. If specified,
+  ThinLTO backend compilations will be distributed.
+
+- ``--thinlto-remote-compiler=<path>``  
+  Specifies the path to the compiler that the distributor process will use for
+  backend compilations. The compiler invoked must match the version of LLD.
+
+- ``--thinlto-distributor-arg=<arg>``  
+  Specifies ``<arg>`` on the command line when invoking the distributor.
+
+- ``--thinlto-remote-compiler-arg=<arg>``  
+  Appends ``<arg>`` to the remote compiler's command line.
+
+  Options that introduce extra input/output files may cause miscompilation if
+  the distribution system does not automatically handle pushing/fetching them to
+  remote nodes. In such cases, configure the distributor - possibly using
+  ``--thinlto-distributor-arg=`` - to manage these dependencies. See the
+  distributor documentation for details.
+
+Some LLD LTO options (e.g., ``--lto-sample-profile=<file>``) are supported.
+Currently, other options are silently accepted but do not have the intended
+effect. Support for such options will be expanded in the future.
diff --git a/lld/docs/index.rst b/lld/docs/index.rst
index 8260461c36905..69792e3b575be 100644
--- a/lld/docs/index.rst
+++ b/lld/docs/index.rst
@@ -147,3 +147,4 @@ document soon.
    ELF/start-stop-gc
    ELF/warn_backrefs
    MachO/index
+   DTLTO
diff --git a/lld/test/ELF/dtlto/archive-thin.text b/lld/test/ELF/dtlto/archive-thin.text
new file mode 100644
index 0000000000000..d5d7c11b53ebb
--- /dev/null
+++ b/lld/test/ELF/dtlto/archive-thin.text
@@ -0,0 +1,84 @@
+# REQUIRES: x86
+
+# Test that a DTLTO link succeeds and outputs the expected set of files
+# correctly when thin archives are present.
+
+RUN: rm -rf %t.dir && split-file %s %t.dir && cd %t.dir
+
+# Generate ThinLTO bitcode files.
+RUN: opt -thinlto-bc t1.ll -o t1.bc -O2
+RUN: opt -thinlto-bc t2.ll -o t2.bc -O2
+RUN: opt -thinlto-bc t3.ll -o t3.bc -O2
+
+# Generate object files for mock.py to return.
+RUN: llc t1.ll --filetype=obj -o t1.o --relocation-model=pic
+RUN: llc t2.ll --filetype=obj -o t2.o --relocation-model=pic
+RUN: llc t3.ll --filetype=obj -o t3.o --relocation-model=pic
+
+RUN: llvm-ar rcs t1.a t1.bc --thin
+# Create this bitcode thin archive in a subdirectory to test the expansion of
+# the path to a bitcode file which is referenced using "..", e.g., in this case
+# "../t2.bc". The ".." should be collapsed in any expansion to avoid
+# referencing an unknown directory on the remote side.
+RUN: mkdir lib
+RUN: llvm-ar rcs lib/t2.a t2.bc --thin
+# Create this bitcode thin archive with an absolute path entry containing "..".
+RUN: llvm-ar rcs t3.a %t.dir/lib/../t3.bc --thin
+RUN: llvm-ar rcs t4.a t1.bc --thin
+
+RUN: mkdir %t.dir/out && cd %t.dir/out
+
+# Note that mock.py does not do any compilation, instead it simply writes the
+# contents of the object files supplied on the command line into the output
+# object files in job order.
+RUN: ld.lld --whole-archive %t.dir/t1.a %t.dir/lib/t2.a ../t3.a \
+RUN:   --no-whole-archive %t.dir/t4.a \
+RUN:   --thinlto-distributor=%python \
+RUN:   --thinlto-distributor-arg=%llvm_src_root/utils/dtlto/mock.py \
+RUN:   --thinlto-distributor-arg=../t1.o \
+RUN:   --thinlto-distributor-arg=../t2.o \
+RUN:   --thinlto-distributor-arg=../t3.o \
+RUN:   --save-temps
+
+RUN: ls | FileCheck %s --implicit-check-not=4
+
+# JSON jobs description and summary shards.
+CHECK-DAG: a.{{[0-9]+}}.dist-file.json
+CHECK-DAG: t1.{{[0-9]+}}.{{[0-9]+}}.native.o.thinlto.bc{{$}}
+CHECK-DAG: t2.{{[0-9]+}}.{{[0-9]+}}.native.o.thinlto.bc{{$}}
+CHECK-DAG: t3.{{[0-9]+}}.{{[0-9]+}}.native.o.thinlto.bc{{$}}
+
+# Native output object files.
+CHECK-DAG: t1.{{[0-9]+}}.{{[0-9]+}}.native.o{{$}}
+CHECK-DAG: t2.{{[0-9]+}}.{{[0-9]+}}.native.o{{$}}
+CHECK-DAG: t3.{{[0-9]+}}.{{[0-9]+}}.native.o{{$}}
+
+# Check that t2.bc and t3.bc are not referenced using "..".
+# RUN: not grep '\.\.\(/\|\\\\\)\(t2\|t3\)\.bc' a.*.dist-file.json
+
+#--- t1.ll
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @t1() {
+entry:
+  ret void
+}
+
+#--- t2.ll
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @t2() {
+entry:
+  ret void
+}
+
+#--- t3.ll
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @t3() {
+entry:
+  ret void
+}
diff --git a/lld/test/ELF/dtlto/imports.test b/lld/test/ELF/dtlto/imports.test
new file mode 100644
index 0000000000000..74d5223f04e24
--- /dev/null
+++ b/lld/test/ELF/dtlto/imports.test
@@ -0,0 +1,53 @@
+# REQUIRES: x86
+
+# Check that DTLTO creates imports lists if requested.
+
+RUN: rm -rf %t && split-file %s %t && cd %t
+
+RUN: opt -thinlto-bc t1.ll -o t1.bc -O2
+RUN: opt -thinlto-bc t2.ll -o t2.bc -O2
+
+# Generate object files for mock.py to return.
+RUN: llc t1.ll --filetype=obj -o t1.o --relocation-model=pic
+RUN: llc t2.ll --filetype=obj -o t2.o --relocation-model=pic
+
+# Common command-line arguments. Note that mock.py does not do any compilation;
+# instead, it simply writes the contents of the object files supplied on the
+# command line into the output object files in job order.
+RUN: echo "t1.bc t2.bc \
+RUN:   --thinlto-distributor=%python \
+RUN:   --thinlto-distributor-arg=%llvm_src_root/utils/dtlto/mock.py \
+RUN:   --thinlto-distributor-arg=t1.o \
+RUN:   --thinlto-distributor-arg=t2.o" > l.rsp
+
+# Check that imports files are not created normally.
+RUN: ld.lld @l.rsp
+RUN: ls | FileCheck %s --check-prefix=NOIMPORTSFILES
+NOIMPORTSFILES-NOT: .imports
+
+# Check that imports files are created with --thinlto-emit-imports-files.
+RUN: ld.lld @l.rsp --thinlto-emit-imports-files
+RUN: ls | FileCheck %s --check-prefix=IMPORTSFILES
+IMPORTSFILES: t1.bc.imports
+IMPORTSFILES: t2.bc.imports
+
+#--- t1.ll
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-...
[truncated]
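
The path adjustment that `dtltoAdjustMemberPathIfThinArchive` performs for thin-archive members can be approximated in a few lines of Python. This is a sketch under assumptions, not a port of the patch: it uses POSIX path rules only, the function name is hypothetical, and `posixpath.normpath` stands in for LLVM's `path::remove_dots(..., /*remove_dot_dot=*/true)`, which may differ in edge cases. Like the C++ code, it collapses ".." lexically and does not resolve symlinks.

```python
import posixpath


def resolve_thin_member_path(archive_path: str, member_path: str) -> str:
    """Approximate the member path DTLTO would use for a thin-archive member."""
    # A relative member name is interpreted relative to the archive's directory.
    if not posixpath.isabs(member_path):
        member_path = posixpath.join(posixpath.dirname(archive_path), member_path)
    # Collapse "." and ".." so no unexpected directory is referenced remotely.
    return posixpath.normpath(member_path)
```

The two archive layouts exercised by the archive-thin tests correspond to a relative member such as `../bar.o` stored in `lib/bar.a`, and an absolute member path that contains `..`.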

@llvmbot
Member

llvmbot commented Jun 4, 2025

@llvm/pr-subscribers-lld-elf

+      args::getStrings(args, OPT_thinlto_distributor_arg);
+  ctx.arg.dtltoCompiler = args.getLastArgValue(OPT_thinlto_compiler_eq);
+  ctx.arg.dtltoCompilerArgs = args::getStrings(args, OPT_thinlto_compiler_arg);
   ctx.arg.dwoDir = args.getLastArgValue(OPT_plugin_opt_dwo_dir_eq);
   ctx.arg.dynamicLinker = getDynamicLinker(ctx, args);
   ctx.arg.ehFrameHdr =
diff --git a/lld/ELF/InputFiles.cpp b/lld/ELF/InputFiles.cpp
index 12a77736aba7f..124e2032be69e 100644
--- a/lld/ELF/InputFiles.cpp
+++ b/lld/ELF/InputFiles.cpp
@@ -20,6 +20,7 @@
 #include "llvm/ADT/CachedHashString.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/LTO/LTO.h"
+#include "llvm/Object/Archive.h"
 #include "llvm/Object/IRObjectFile.h"
 #include "llvm/Support/ARMAttributeParser.h"
 #include "llvm/Support/ARMBuildAttributes.h"
@@ -1739,6 +1740,38 @@ static uint8_t getOsAbi(const Triple &t) {
   }
 }
 
+// For DTLTO, bitcode member names must be a valid path to a bitcode file on
+// disk. For thin archives, adjust `memberPath` to the full file path of the
+// archive member. Returns true if an adjustment was made; false otherwise.
+// Non-thin archives are not yet supported.
+static bool dtltoAdjustMemberPathIfThinArchive(Ctx &ctx, StringRef archivePath,
+                                               std::string &memberPath) {
+  assert(!archivePath.empty());
+  assert(!ctx.arg.dtltoDistributor.empty());
+
+  // Check if the archive file is a thin archive by reading its header.
+  auto memBufferOrError =
+      MemoryBuffer::getFileSlice(archivePath, sizeof(ThinArchiveMagic) - 1, 0);
+  if (std::error_code ec = memBufferOrError.getError()) {
+    ErrAlways(ctx) << "cannot open " << archivePath << ": " << ec.message();
+    return false;
+  }
+  MemoryBufferRef memBufRef = *memBufferOrError.get();
+  if (!memBufRef.getBuffer().starts_with(ThinArchiveMagic))
+    return false;
+
+  SmallString<64> archiveMemberPath;
+  if (path::is_relative(memberPath)) {
+    archiveMemberPath = path::parent_path(archivePath);
+    path::append(archiveMemberPath, memberPath);
+  } else
+    archiveMemberPath = memberPath;
+
+  path::remove_dots(archiveMemberPath, /*remove_dot_dot=*/true);
+  memberPath = archiveMemberPath.str();
+  return true;
+}
+
 BitcodeFile::BitcodeFile(Ctx &ctx, MemoryBufferRef mb, StringRef archiveName,
                          uint64_t offsetInArchive, bool lazy)
     : InputFile(ctx, BitcodeKind, mb) {
@@ -1756,10 +1789,13 @@ BitcodeFile::BitcodeFile(Ctx &ctx, MemoryBufferRef mb, StringRef archiveName,
   // symbols later in the link stage). So we append file offset to make
   // filename unique.
   StringSaver &ss = ctx.saver;
-  StringRef name = archiveName.empty()
-                       ? ss.save(path)
-                       : ss.save(archiveName + "(" + path::filename(path) +
-                                 " at " + utostr(offsetInArchive) + ")");
+  StringRef name =
+      (archiveName.empty() ||
+       (!ctx.arg.dtltoDistributor.empty() &&
+        dtltoAdjustMemberPathIfThinArchive(ctx, archiveName, path)))
+          ? ss.save(path)
+          : ss.save(archiveName + "(" + path::filename(path) + " at " +
+                    utostr(offsetInArchive) + ")");
   MemoryBufferRef mbref(mb.getBuffer(), name);
 
   obj = CHECK2(lto::InputFile::create(mbref), this);
diff --git a/lld/ELF/LTO.cpp b/lld/ELF/LTO.cpp
index 82a7463446a94..eff3c44bc84e8 100644
--- a/lld/ELF/LTO.cpp
+++ b/lld/ELF/LTO.cpp
@@ -180,6 +180,14 @@ BitcodeCompiler::BitcodeCompiler(Ctx &ctx) : ctx(ctx) {
         std::string(ctx.arg.thinLTOPrefixReplaceNew),
         std::string(ctx.arg.thinLTOPrefixReplaceNativeObject),
         ctx.arg.thinLTOEmitImportsFiles, indexFile.get(), onIndexWrite);
+  } else if (!ctx.arg.dtltoDistributor.empty() && !ctx.bitcodeFiles.empty()) {
+    backend = lto::createOutOfProcessThinBackend(
+        llvm::heavyweight_hardware_concurrency(ctx.arg.thinLTOJobs),
+        onIndexWrite, ctx.arg.thinLTOEmitIndexFiles,
+        ctx.arg.thinLTOEmitImportsFiles, ctx.arg.outputFile,
+        ctx.arg.dtltoDistributor, ctx.arg.dtltoDistributorArgs,
+        ctx.arg.dtltoCompiler, ctx.arg.dtltoCompilerArgs,
+        !ctx.arg.saveTempsArgs.empty());
   } else {
     backend = lto::createInProcessThinBackend(
         llvm::heavyweight_hardware_concurrency(ctx.arg.thinLTOJobs),
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index c795147eb9662..f4fc24b9ca3ab 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -710,7 +710,12 @@ def thinlto_object_suffix_replace_eq: JJ<"thinlto-object-suffix-replace=">;
 def thinlto_prefix_replace_eq: JJ<"thinlto-prefix-replace=">;
 def thinlto_single_module_eq: JJ<"thinlto-single-module=">,
   HelpText<"Specify a single module to compile in ThinLTO mode, for debugging only">;
-
+def thinlto_distributor_eq: JJ<"thinlto-distributor=">,
+  HelpText<"Distributor to use for ThinLTO backend compilations. If specified, ThinLTO backend compilations will be distributed.">;
+defm thinlto_distributor_arg: EEq<"thinlto-distributor-arg", "Arguments to pass to the ThinLTO distributor">;
+def thinlto_compiler_eq: JJ<"thinlto-remote-compiler=">,
+  HelpText<"Compiler for the ThinLTO distributor to invoke for ThinLTO backend compilations">;
+defm thinlto_compiler_arg: EEq<"thinlto-remote-compiler-arg", "Compiler arguments for the ThinLTO distributor to pass for ThinLTO backend compilations">;
 defm fat_lto_objects: BB<"fat-lto-objects",
     "Use the .llvm.lto section, which contains LLVM bitcode, in fat LTO object files to perform LTO.",
     "Ignore the .llvm.lto section in relocatable object files (default).">;
diff --git a/lld/docs/DTLTO.rst b/lld/docs/DTLTO.rst
new file mode 100644
index 0000000000000..e9a090042d011
--- /dev/null
+++ b/lld/docs/DTLTO.rst
@@ -0,0 +1,40 @@
+Integrated Distributed ThinLTO (DTLTO)
+======================================
+
+Integrated Distributed ThinLTO (DTLTO) enables the distribution of backend
+ThinLTO compilations via external distribution systems, such as Incredibuild,
+during the traditional link step.
+
+The implementation is documented here: https://llvm.org/docs/DTLTO.html.
+
+Currently, DTLTO is only supported in ELF LLD. Support will be added to other
+LLD flavours in the future.
+
+ELF LLD
+-------
+
+The command-line interface is as follows:
+
+- ``--thinlto-distributor=<path>``  
+  Specifies the file to execute as the distributor process. If specified,
+  ThinLTO backend compilations will be distributed.
+
+- ``--thinlto-remote-compiler=<path>``  
+  Specifies the path to the compiler that the distributor process will use for
+  backend compilations. The compiler invoked must match the version of LLD.
+
+- ``--thinlto-distributor-arg=<arg>``  
+  Specifies ``<arg>`` on the command line when invoking the distributor.
+
+- ``--thinlto-remote-compiler-arg=<arg>``  
+  Appends ``<arg>`` to the remote compiler's command line.
+
+  Options that introduce extra input/output files may cause miscompilation if
+  the distribution system does not automatically handle pushing/fetching them to
+  remote nodes. In such cases, configure the distributor - possibly using
+  ``--thinlto-distributor-arg=`` - to manage these dependencies. See the
+  distributor documentation for details.
+
+Some LLD LTO options (e.g., ``--lto-sample-profile=<file>``) are supported.
+Currently, other options are silently accepted but do not have the intended
+effect. Support for such options will be expanded in the future.
diff --git a/lld/docs/index.rst b/lld/docs/index.rst
index 8260461c36905..69792e3b575be 100644
--- a/lld/docs/index.rst
+++ b/lld/docs/index.rst
@@ -147,3 +147,4 @@ document soon.
    ELF/start-stop-gc
    ELF/warn_backrefs
    MachO/index
+   DTLTO
diff --git a/lld/test/ELF/dtlto/archive-thin.test b/lld/test/ELF/dtlto/archive-thin.test
new file mode 100644
index 0000000000000..d5d7c11b53ebb
--- /dev/null
+++ b/lld/test/ELF/dtlto/archive-thin.test
@@ -0,0 +1,84 @@
+# REQUIRES: x86
+
+# Test that a DTLTO link succeeds and outputs the expected set of files
+# correctly when thin archives are present.
+
+RUN: rm -rf %t.dir && split-file %s %t.dir && cd %t.dir
+
+# Generate ThinLTO bitcode files.
+RUN: opt -thinlto-bc t1.ll -o t1.bc -O2
+RUN: opt -thinlto-bc t2.ll -o t2.bc -O2
+RUN: opt -thinlto-bc t3.ll -o t3.bc -O2
+
+# Generate object files for mock.py to return.
+RUN: llc t1.ll --filetype=obj -o t1.o --relocation-model=pic
+RUN: llc t2.ll --filetype=obj -o t2.o --relocation-model=pic
+RUN: llc t3.ll --filetype=obj -o t3.o --relocation-model=pic
+
+RUN: llvm-ar rcs t1.a t1.bc --thin
+# Create this bitcode thin archive in a subdirectory to test the expansion of
+# the path to a bitcode file which is referenced using "..", e.g., in this case
+# "../t2.bc". The ".." should be collapsed in any expansion to avoid
+# referencing an unknown directory on the remote side.
+RUN: mkdir lib
+RUN: llvm-ar rcs lib/t2.a t2.bc --thin
+# Create this bitcode thin archive with an absolute path entry containing "..".
+RUN: llvm-ar rcs t3.a %t.dir/lib/../t3.bc --thin
+RUN: llvm-ar rcs t4.a t1.bc --thin
+
+RUN: mkdir %t.dir/out && cd %t.dir/out
+
+# Note that mock.py does not do any compilation; instead, it simply writes the
+# contents of the object files supplied on the command line into the output
+# object files in job order.
+RUN: ld.lld --whole-archive %t.dir/t1.a %t.dir/lib/t2.a ../t3.a \
+RUN:   --no-whole-archive %t.dir/t4.a \
+RUN:   --thinlto-distributor=%python \
+RUN:   --thinlto-distributor-arg=%llvm_src_root/utils/dtlto/mock.py \
+RUN:   --thinlto-distributor-arg=../t1.o \
+RUN:   --thinlto-distributor-arg=../t2.o \
+RUN:   --thinlto-distributor-arg=../t3.o \
+RUN:   --save-temps
+
+RUN: ls | FileCheck %s --implicit-check-not=4
+
+# JSON jobs description and summary shards.
+CHECK-DAG: a.{{[0-9]+}}.dist-file.json
+CHECK-DAG: t1.{{[0-9]+}}.{{[0-9]+}}.native.o.thinlto.bc{{$}}
+CHECK-DAG: t2.{{[0-9]+}}.{{[0-9]+}}.native.o.thinlto.bc{{$}}
+CHECK-DAG: t3.{{[0-9]+}}.{{[0-9]+}}.native.o.thinlto.bc{{$}}
+
+# Native output object files.
+CHECK-DAG: t1.{{[0-9]+}}.{{[0-9]+}}.native.o{{$}}
+CHECK-DAG: t2.{{[0-9]+}}.{{[0-9]+}}.native.o{{$}}
+CHECK-DAG: t3.{{[0-9]+}}.{{[0-9]+}}.native.o{{$}}
+
+# Check that t2.bc and t3.bc are not referenced using "..".
+# RUN: not grep '\.\.\(/\|\\\\\)\(t2\|t3\)\.bc' a.*.dist-file.json
+
+#--- t1.ll
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @t1() {
+entry:
+  ret void
+}
+
+#--- t2.ll
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @t2() {
+entry:
+  ret void
+}
+
+#--- t3.ll
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @t3() {
+entry:
+  ret void
+}
diff --git a/lld/test/ELF/dtlto/imports.test b/lld/test/ELF/dtlto/imports.test
new file mode 100644
index 0000000000000..74d5223f04e24
--- /dev/null
+++ b/lld/test/ELF/dtlto/imports.test
@@ -0,0 +1,53 @@
+# REQUIRES: x86
+
+# Check that DTLTO creates imports lists if requested.
+
+RUN: rm -rf %t && split-file %s %t && cd %t
+
+RUN: opt -thinlto-bc t1.ll -o t1.bc -O2
+RUN: opt -thinlto-bc t2.ll -o t2.bc -O2
+
+# Generate object files for mock.py to return.
+RUN: llc t1.ll --filetype=obj -o t1.o --relocation-model=pic
+RUN: llc t2.ll --filetype=obj -o t2.o --relocation-model=pic
+
+# Common command-line arguments. Note that mock.py does not do any compilation;
+# instead, it simply writes the contents of the object files supplied on the
+# command line into the output object files in job order.
+RUN: echo "t1.bc t2.bc \
+RUN:   --thinlto-distributor=%python \
+RUN:   --thinlto-distributor-arg=%llvm_src_root/utils/dtlto/mock.py \
+RUN:   --thinlto-distributor-arg=t1.o \
+RUN:   --thinlto-distributor-arg=t2.o" > l.rsp
+
+# Check that imports files are not created normally.
+RUN: ld.lld @l.rsp
+RUN: ls | FileCheck %s --check-prefix=NOIMPORTSFILES
+NOIMPORTSFILES-NOT: .imports
+
+# Check that imports files are created with --thinlto-emit-imports-files.
+RUN: ld.lld @l.rsp --thinlto-emit-imports-files
+RUN: ls | FileCheck %s --check-prefix=IMPORTSFILES
+IMPORTSFILES: t1.bc.imports
+IMPORTSFILES: t2.bc.imports
+
+#--- t1.ll
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-...
[truncated]
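
For reviewers who want to reason about the thin-archive handling without building LLD, the logic of `dtltoAdjustMemberPathIfThinArchive` in the `InputFiles.cpp` hunk can be approximated in Python. This is only a sketch under stated assumptions, not the LLD implementation: the function name `adjust_member_path` is hypothetical, `os.path.normpath` stands in for `path::remove_dots(..., /*remove_dot_dot=*/true)`, and the magic string is assumed to be the usual `!<thin>\n` thin-archive header.

```python
import os

# Assumed to match LLVM's ThinArchiveMagic ("!<thin>\n").
THIN_ARCHIVE_MAGIC = b"!<thin>\n"


def adjust_member_path(archive_path, member_path):
    """Return the on-disk path of a thin-archive member, or None.

    None means the archive is not thin, so the caller would keep the
    "archive(member at offset)" style name instead.
    """
    # Read only the header bytes to decide whether the archive is thin.
    with open(archive_path, "rb") as f:
        if f.read(len(THIN_ARCHIVE_MAGIC)) != THIN_ARCHIVE_MAGIC:
            return None
    if not os.path.isabs(member_path):
        # Relative member names are resolved against the archive's directory.
        member_path = os.path.join(os.path.dirname(archive_path), member_path)
    # Collapse "." and ".." so the path does not reference a directory
    # that may be unknown on the remote side.
    return os.path.normpath(member_path)
```

With an archive at `lib/t2.a` whose member is stored as `../t2.bc`, the sketch yields `t2.bc`, which is the case the `archive-thin` test exercises with its `grep` over the JSON file.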

@bd1976bris
Collaborator Author

Force pushed some minor changes to ensure that all comments made previously on #126654 and #127749 have been addressed here.

This patch introduces support for Integrated Distributed ThinLTO
(DTLTO) in ELF LLD.

DTLTO enables the distribution of ThinLTO backend compilations via
external distribution systems, such as Incredibuild, during the
traditional link step: https://llvm.org/docs/DTLTO.html.

It is expected that users will invoke DTLTO through the compiler
driver (e.g., Clang) rather than calling LLD directly. A Clang-side
interface for DTLTO will be added in a follow-up patch.

Note: Bitcode members of non-thin archives are not currently
supported. This will be addressed in a future change.

Testing:
- ELF LLD `lit` test coverage has been added, using a mock distributor
  to avoid requiring Clang.
- Cross-project `lit` tests cover integration with Clang.

For the design discussion of the DTLTO feature, see:
llvm#126654
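
As a rough illustration of the mock distributor mentioned under Testing: the test comments describe it as doing no compilation and simply writing the supplied object files into the output object files in job order. A minimal sketch of that behaviour is shown below. It is not the contents of `utils/dtlto/mock.py`; the function name `run_mock_jobs` is hypothetical, and the JSON layout (a top-level `jobs` list whose entries each carry an `outputs` list with the native object first) is an assumption made for illustration. See https://llvm.org/docs/DTLTO.html for the actual job description format.

```python
import json
import shutil


def run_mock_jobs(json_path, object_files):
    """Copy the i-th supplied object file to the i-th job's first output.

    Returns the number of jobs that were "compiled" this way.
    """
    with open(json_path) as f:
        description = json.load(f)
    jobs = description["jobs"]  # assumed key, not taken from this PR
    if len(object_files) < len(jobs):
        raise ValueError("need one pre-built object file per job")
    for obj, job in zip(object_files, jobs):
        # Assumed: the first output of each job is the native object file.
        shutil.copyfile(obj, job["outputs"][0])
    return len(jobs)
```

A mock like this lets the LLD tests check the linker's side of the protocol (job ordering, file naming, `--save-temps` output) without depending on Clang being available.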