Skip to content

[clang][MBD] set up module build daemon infrastructure #67562

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 33 commits into
base: main
Choose a base branch
from

Conversation

cpsughrue
Copy link
Contributor

The module build daemon (mbd) will be a cc1 tool that helps convert implicit module command lines to explicit module command lines. The basic workflow is as follows: A clang invocation will check whether a mbd exists. If not, the invocation will spawn one. After the mbd is up and running and a handshake has successfully been carried out between the mbd and clang invocation, the clang invocation will share its command line with the mbd. The mbd will then scan the translation unit and build all modular dependencies before returning a modified cc1 command line such that all arguments related to modules are converted from the implicit option to their explicit option.

This commit sets up the basic mbd infrastructure. Including the ability to spawn a mbd and carry out a handshake between the clang invocation and mbd.

RFC: https://discourse.llvm.org/t/rfc-modules-build-daemon-build-system-agnostic-support-for-explicitly-built-modules/71524

@cpsughrue cpsughrue added the clang:modules C++20 modules and Clang Header Modules label Sep 27, 2023
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Sep 27, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 27, 2023

@llvm/pr-subscribers-llvm-support
@llvm/pr-subscribers-platform-windows
@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-modules

Changes

The module build daemon (mbd) will be a cc1 tool that helps convert implicit module command lines to explicit module command lines. The basic workflow is as follows: A clang invocation will check whether a mbd exists. If not, the invocation will spawn one. After the mbd is up and running and a handshake has successfully been carried out between the mbd and clang invocation, the clang invocation will share its command line with the mbd. The mbd will then scan the translation unit and build all modular dependencies before returning a modified cc1 command line such that all arguments related to modules are converted from the implicit option to their explicit option.

This commit sets up the basic mbd infrastructure. Including the ability to spawn a mbd and carry out a handshake between the clang invocation and mbd.

RFC: https://discourse.llvm.org/t/rfc-modules-build-daemon-build-system-agnostic-support-for-explicitly-built-modules/71524


Patch is 40.58 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67562.diff

19 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+12)
  • (modified) clang/include/clang/Frontend/FrontendOptions.h (+7)
  • (added) clang/include/clang/Tooling/ModuleBuildDaemon/Client.h (+44)
  • (added) clang/include/clang/Tooling/ModuleBuildDaemon/SocketMsgSupport.h (+134)
  • (added) clang/include/clang/Tooling/ModuleBuildDaemon/SocketSupport.h (+31)
  • (added) clang/include/clang/Tooling/ModuleBuildDaemon/Utils.h (+28)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+13-1)
  • (modified) clang/lib/Tooling/CMakeLists.txt (+1)
  • (added) clang/lib/Tooling/ModuleBuildDaemon/CMakeLists.txt (+9)
  • (added) clang/lib/Tooling/ModuleBuildDaemon/Client.cpp (+167)
  • (added) clang/lib/Tooling/ModuleBuildDaemon/SocketSupport.cpp (+128)
  • (added) clang/lib/Tooling/ModuleBuildDaemon/Utils.cpp (+32)
  • (modified) clang/test/Driver/unknown-arg.c (+1-1)
  • (added) clang/test/ModuleBuildDaemon/handshake.c (+18)
  • (added) clang/test/ModuleBuildDaemon/launch.c (+14)
  • (modified) clang/tools/driver/CMakeLists.txt (+3)
  • (modified) clang/tools/driver/cc1_main.cpp (+23-5)
  • (added) clang/tools/driver/cc1modbuildd_main.cpp (+273)
  • (modified) clang/tools/driver/driver.cpp (+15-2)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index f573f5a06ceb501..06575f4b73bd47a 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2836,6 +2836,18 @@ defm declspec : BoolOption<"f", "declspec",
   NegFlag<SetFalse, [], [ClangOption], "Disallow">,
   BothFlags<[], [ClangOption, CC1Option],
           " __declspec as a keyword">>, Group<f_clang_Group>;
+
+def fmodule_build_daemon : Flag<["-"], "fmodule-build-daemon">, Group<f_Group>,
+  Flags<[NoXarchOption]>, 
+  Visibility<[ClangOption, CC1Option]>,
+  HelpText<"Enables module build daemon functionality">,
+  MarshallingInfoFlag<FrontendOpts<"ModuleBuildDaemon">>;
+def fmodule_build_daemon_EQ : Joined<["-"], "fmodule-build-daemon=">, Group<f_Group>,
+  Flags<[NoXarchOption]>,
+  Visibility<[ClangOption, CC1Option]>,
+  HelpText<"Enables module build daemon functionality and defines location of output files">,
+  MarshallingInfoString<FrontendOpts<"ModuleBuildDaemonPath">>;
+
 def fmodules_cache_path : Joined<["-"], "fmodules-cache-path=">, Group<i_Group>,
   Flags<[NoXarchOption]>, Visibility<[ClangOption, CC1Option]>,
   MetaVarName<"<directory>">,
diff --git a/clang/include/clang/Frontend/FrontendOptions.h b/clang/include/clang/Frontend/FrontendOptions.h
index 117e35de6f76c4c..8ce97a57d413c0b 100644
--- a/clang/include/clang/Frontend/FrontendOptions.h
+++ b/clang/include/clang/Frontend/FrontendOptions.h
@@ -350,6 +350,9 @@ class FrontendOptions {
   /// Whether to share the FileManager when building modules.
   unsigned ModulesShareFileManager : 1;
 
+  /// Connect to module build daemon
+  unsigned ModuleBuildDaemon : 1;
+
   CodeCompleteOptions CodeCompleteOpts;
 
   /// Specifies the output format of the AST.
@@ -435,6 +438,10 @@ class FrontendOptions {
   /// The output file, if any.
   std::string OutputFile;
 
+  /// If given, the path to the module build daemon's output files and socket
+  /// address
+  std::string ModuleBuildDaemonPath;
+
   /// If given, the new suffix for fix-it rewritten files.
   std::string FixItSuffix;
 
diff --git a/clang/include/clang/Tooling/ModuleBuildDaemon/Client.h b/clang/include/clang/Tooling/ModuleBuildDaemon/Client.h
new file mode 100644
index 000000000000000..d7506fa3011cffb
--- /dev/null
+++ b/clang/include/clang/Tooling/ModuleBuildDaemon/Client.h
@@ -0,0 +1,44 @@
+//===----------------------------- Protocol.h -----------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_CLIENT_H
+#define LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_CLIENT_H
+
+#include "clang/Frontend/CompilerInstance.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Config/llvm-config.h"
+#include "llvm/Support/YAMLParser.h"
+#include "llvm/Support/YAMLTraits.h"
+
+#define MAX_BUFFER 4096
+#define SOCKET_FILE_NAME "mbd.sock"
+#define STDOUT_FILE_NAME "mbd.out"
+#define STDERR_FILE_NAME "mbd.err"
+
+using namespace clang;
+using namespace llvm;
+
+namespace cc1modbuildd {
+
+// Returns where to store log files and socket address. Of the format
+// /tmp/clang-<BLAKE3HashOfClagnFullVersion>/
+std::string getBasePath();
+
+llvm::Error attemptHandshake(int SocketFD);
+
+llvm::Error spawnModuleBuildDaemon(StringRef BasePath, const char *Argv0);
+
+Expected<int> getModuleBuildDaemon(const char *Argv0, StringRef BasePath);
+
+llvm::Error handshakeModuleBuildDaemon(const CompilerInvocation &Clang,
+                                       const char *Argv0);
+
+} // namespace cc1modbuildd
+
+#endif // LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_PROTOCAL_H
diff --git a/clang/include/clang/Tooling/ModuleBuildDaemon/SocketMsgSupport.h b/clang/include/clang/Tooling/ModuleBuildDaemon/SocketMsgSupport.h
new file mode 100644
index 000000000000000..16666c177eaa80c
--- /dev/null
+++ b/clang/include/clang/Tooling/ModuleBuildDaemon/SocketMsgSupport.h
@@ -0,0 +1,134 @@
+//===------------------------- SocketMsgSupport.h -------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_SOCKETMSGSUPPORT_H
+#define LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_SOCKETMSGSUPPORT_H
+
+#include "clang/Tooling/ModuleBuildDaemon/Client.h"
+#include "clang/Tooling/ModuleBuildDaemon/SocketSupport.h"
+
+using namespace clang;
+using namespace llvm;
+
+namespace cc1modbuildd {
+
+enum class ActionType { HANDSHAKE };
+enum class StatusType { REQUEST, SUCCESS, FAILURE };
+
+struct BaseMsg {
+  ActionType MsgAction;
+  StatusType MsgStatus;
+
+  BaseMsg() = default;
+  BaseMsg(ActionType Action, StatusType Status)
+      : MsgAction(Action), MsgStatus(Status) {}
+};
+
+struct HandshakeMsg : public BaseMsg {
+  HandshakeMsg() = default;
+  HandshakeMsg(ActionType Action, StatusType Status)
+      : BaseMsg(Action, Status) {}
+};
+
+template <typename T> std::string getBufferFromSocketMsg(T Msg) {
+  static_assert(std::is_base_of<cc1modbuildd::BaseMsg, T>::value,
+                "T must inherit from cc1modbuildd::BaseMsg");
+
+  std::string Buffer;
+  llvm::raw_string_ostream OS(Buffer);
+  llvm::yaml::Output YamlOut(OS);
+
+  YamlOut << Msg;
+  return Buffer;
+}
+
+template <typename T> Expected<T> getSocketMsgFromBuffer(const char *Buffer) {
+  static_assert(std::is_base_of<cc1modbuildd::BaseMsg, T>::value,
+                "T must inherit from cc1modbuildd::BaseMsg");
+
+  T ClientRequest;
+  llvm::yaml::Input YamlIn(Buffer);
+  YamlIn >> ClientRequest;
+
+  if (YamlIn.error()) {
+    std::string Msg = "Syntax or semantic error during YAML parsing";
+    return llvm::make_error<StringError>(Msg, inconvertibleErrorCode());
+  }
+
+  return ClientRequest;
+}
+
+template <typename T> Expected<T> readSocketMsgFromSocket(int FD) {
+  static_assert(std::is_base_of<cc1modbuildd::BaseMsg, T>::value,
+                "T must inherit from cc1modbuildd::BaseMsg");
+
+  Expected<std::string> MaybeResponseBuffer = readFromSocket(FD);
+  if (!MaybeResponseBuffer)
+    return std::move(MaybeResponseBuffer.takeError());
+
+  // Wait for response from module build daemon
+  Expected<T> MaybeResponse =
+      getSocketMsgFromBuffer<T>(std::move(*MaybeResponseBuffer).c_str());
+  if (!MaybeResponse)
+    return std::move(MaybeResponse.takeError());
+  return std::move(*MaybeResponse);
+}
+
+template <typename T> llvm::Error writeSocketMsgToSocket(T Msg, int FD) {
+  static_assert(std::is_base_of<cc1modbuildd::BaseMsg, T>::value,
+                "T must inherit from cc1modbuildd::BaseMsg");
+
+  std::string Buffer = getBufferFromSocketMsg(Msg);
+  if (llvm::Error Err = writeToSocket(Buffer, FD))
+    return std::move(Err);
+
+  return llvm::Error::success();
+}
+
+template <typename T>
+Expected<int> connectAndWriteSocketMsgToSocket(T Msg, StringRef SocketPath) {
+  static_assert(std::is_base_of<cc1modbuildd::BaseMsg, T>::value,
+                "T must inherit from cc1modbuildd::BaseMsg");
+
+  Expected<int> MaybeFD = connectToSocket(SocketPath);
+  if (!MaybeFD)
+    return std::move(MaybeFD.takeError());
+  int FD = std::move(*MaybeFD);
+
+  if (llvm::Error Err = writeSocketMsgToSocket(Msg, FD))
+    return std::move(Err);
+
+  return FD;
+}
+
+} // namespace cc1modbuildd
+
+template <>
+struct llvm::yaml::ScalarEnumerationTraits<cc1modbuildd::StatusType> {
+  static void enumeration(IO &Io, cc1modbuildd::StatusType &Value) {
+    Io.enumCase(Value, "REQUEST", cc1modbuildd::StatusType::REQUEST);
+    Io.enumCase(Value, "SUCCESS", cc1modbuildd::StatusType::SUCCESS);
+    Io.enumCase(Value, "FAILURE", cc1modbuildd::StatusType::FAILURE);
+  }
+};
+
+template <>
+struct llvm::yaml::ScalarEnumerationTraits<cc1modbuildd::ActionType> {
+  static void enumeration(IO &Io, cc1modbuildd::ActionType &Value) {
+    Io.enumCase(Value, "HANDSHAKE", cc1modbuildd::ActionType::HANDSHAKE);
+  }
+};
+
+template <> struct llvm::yaml::MappingTraits<cc1modbuildd::HandshakeMsg> {
+  static void mapping(IO &Io, cc1modbuildd::HandshakeMsg &Info) {
+    Io.mapRequired("Action", Info.MsgAction);
+    Io.mapRequired("Status", Info.MsgStatus);
+  }
+};
+
+#endif // LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_SOCKETMSGSUPPORT_H
diff --git a/clang/include/clang/Tooling/ModuleBuildDaemon/SocketSupport.h b/clang/include/clang/Tooling/ModuleBuildDaemon/SocketSupport.h
new file mode 100644
index 000000000000000..bc21084faab3966
--- /dev/null
+++ b/clang/include/clang/Tooling/ModuleBuildDaemon/SocketSupport.h
@@ -0,0 +1,31 @@
+//===-------------------------- SocketSupport.h ---------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_SOCKETSUPPORT_H
+#define LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_SOCKETSUPPORT_H
+
+#include "clang/Frontend/CompilerInstance.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/YAMLParser.h"
+#include "llvm/Support/YAMLTraits.h"
+
+using namespace clang;
+using namespace llvm;
+
+namespace cc1modbuildd {
+
+Expected<int> createSocket();
+Expected<int> connectToSocket(StringRef SocketPath);
+Expected<int> connectAndWriteToSocket(std::string Buffer, StringRef SocketPath);
+Expected<std::string> readFromSocket(int FD);
+llvm::Error writeToSocket(std::string Buffer, int WriteFD);
+
+} // namespace cc1modbuildd
+
+#endif // LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_SOCKETSUPPORT_H
diff --git a/clang/include/clang/Tooling/ModuleBuildDaemon/Utils.h b/clang/include/clang/Tooling/ModuleBuildDaemon/Utils.h
new file mode 100644
index 000000000000000..79a2ffc3c1804d8
--- /dev/null
+++ b/clang/include/clang/Tooling/ModuleBuildDaemon/Utils.h
@@ -0,0 +1,28 @@
+//===------------------------------ Utils.h -------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Functions required by both the module build daemon (server) and clang
+// invocation (client)
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_UTILS_H
+#define LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_UTILS_H
+
+#include "llvm/Support/Error.h"
+#include <string>
+
+namespace cc1modbuildd {
+
+void writeError(llvm::Error Err, std::string Msg);
+std::string getFullErrorMsg(llvm::Error Err, std::string Msg);
+llvm::Error makeStringError(llvm::Error Err, std::string Msg);
+
+} // namespace cc1modbuildd
+
+#endif // LLVM_CLANG_TOOLING_MODULEBUILDDAEMON_UTILS_H
\ No newline at end of file
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index dda6aef641904aa..58c574d88752f9d 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5,7 +5,6 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
-
 #include "Clang.h"
 #include "AMDGPU.h"
 #include "Arch/AArch64.h"
@@ -3738,6 +3737,19 @@ static bool RenderModulesOptions(Compilation &C, const Driver &D,
        Std->containsValue("c++latest") || Std->containsValue("gnu++latest"));
   bool HaveModules = HaveStdCXXModules;
 
+  // -fmodule-build-daemon enables module build daemon functionality
+  if (Args.hasArg(options::OPT_fmodule_build_daemon))
+    Args.AddLastArg(CmdArgs, options::OPT_fmodule_build_daemon);
+
+  // by default module build daemon socket address and output files are saved
+  // under /tmp/ but that can be overridden by providing the
+  // -fmodule-build-daemon=<path> flag
+  if (Arg *A = Args.getLastArg(options::OPT_fmodule_build_daemon_EQ)) {
+    CmdArgs.push_back(
+        Args.MakeArgString(Twine("-fmodule-build-daemon=") + A->getValue()));
+    CmdArgs.push_back("-fmodule-build-daemon");
+  }
+
   // -fmodules enables the use of precompiled modules (off by default).
   // Users can pass -fno-cxx-modules to turn off modules support for
   // C++/Objective-C++ programs.
diff --git a/clang/lib/Tooling/CMakeLists.txt b/clang/lib/Tooling/CMakeLists.txt
index aff39e4de13c0b2..85752e577332650 100644
--- a/clang/lib/Tooling/CMakeLists.txt
+++ b/clang/lib/Tooling/CMakeLists.txt
@@ -13,6 +13,7 @@ add_subdirectory(DumpTool)
 add_subdirectory(Syntax)
 add_subdirectory(DependencyScanning)
 add_subdirectory(Transformer)
+add_subdirectory(ModuleBuildDaemon)
 
 # Replace the last lib component of the current binary directory with include
 string(FIND ${CMAKE_CURRENT_BINARY_DIR} "/lib/" PATH_LIB_START REVERSE)
diff --git a/clang/lib/Tooling/ModuleBuildDaemon/CMakeLists.txt b/clang/lib/Tooling/ModuleBuildDaemon/CMakeLists.txt
new file mode 100644
index 000000000000000..9c1f5dc1aa2c0c0
--- /dev/null
+++ b/clang/lib/Tooling/ModuleBuildDaemon/CMakeLists.txt
@@ -0,0 +1,9 @@
+set(LLVM_LINK_COMPONENTS
+  Support
+  )
+
+add_clang_library(clangModuleBuildDaemon
+  Client.cpp
+  SocketSupport.cpp
+  Utils.cpp
+  )
diff --git a/clang/lib/Tooling/ModuleBuildDaemon/Client.cpp b/clang/lib/Tooling/ModuleBuildDaemon/Client.cpp
new file mode 100644
index 000000000000000..f247c4c64da957a
--- /dev/null
+++ b/clang/lib/Tooling/ModuleBuildDaemon/Client.cpp
@@ -0,0 +1,167 @@
+//===----------------------------- Client.cpp -----------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/Tooling/ModuleBuildDaemon/Client.h"
+#include "clang/Basic/Version.h"
+#include "clang/Frontend/CompilerInstance.h"
+#include "clang/Tooling/ModuleBuildDaemon/SocketMsgSupport.h"
+#include "clang/Tooling/ModuleBuildDaemon/SocketSupport.h"
+#include "clang/Tooling/ModuleBuildDaemon/Utils.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/ScopeExit.h"
+#include "llvm/ADT/SmallString.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/BLAKE3.h"
+
+// TODO: Make portable
+#if LLVM_ON_UNIX
+
+#include <cerrno>
+#include <filesystem>
+#include <fstream>
+#include <signal.h>
+#include <spawn.h>
+#include <string>
+#include <sys/socket.h>
+#include <sys/stat.h>
+#include <sys/un.h>
+#include <unistd.h>
+
+using namespace clang;
+using namespace llvm;
+using namespace cc1modbuildd;
+
+std::string cc1modbuildd::getBasePath() {
+  llvm::BLAKE3 Hash;
+  Hash.update(getClangFullVersion());
+  auto HashResult = Hash.final<sizeof(uint64_t)>();
+  uint64_t HashValue =
+      llvm::support::endian::read<uint64_t, llvm::support::native>(
+          HashResult.data());
+  std::string Key = toString(llvm::APInt(64, HashValue), 36, /*Signed*/ false);
+
+  // set paths
+  SmallString<128> BasePath;
+  llvm::sys::path::system_temp_directory(/*erasedOnReboot*/ true, BasePath);
+  llvm::sys::path::append(BasePath, "clang-" + Key);
+  return BasePath.c_str();
+}
+
+llvm::Error cc1modbuildd::attemptHandshake(int SocketFD) {
+
+  HandshakeMsg Request{ActionType::HANDSHAKE, StatusType::REQUEST};
+  std::string Buffer = getBufferFromSocketMsg(Request);
+
+  if (llvm::Error Err = writeToSocket(Buffer, SocketFD))
+    return std::move(Err);
+
+  Expected<HandshakeMsg> MaybeServerResponse =
+      readSocketMsgFromSocket<HandshakeMsg>(SocketFD);
+  if (!MaybeServerResponse)
+    return std::move(MaybeServerResponse.takeError());
+
+  HandshakeMsg ServerResponse = std::move(*MaybeServerResponse);
+
+  assert(ServerResponse.MsgAction == ActionType::HANDSHAKE &&
+         "At this point response ActionType should only ever be HANDSHAKE");
+
+  if (ServerResponse.MsgStatus == StatusType::SUCCESS)
+    return llvm::Error::success();
+
+  return llvm::make_error<StringError>(
+      "Received failed handshake response from module build daemon",
+      inconvertibleErrorCode());
+}
+
+llvm::Error cc1modbuildd::spawnModuleBuildDaemon(StringRef BasePath,
+                                                 const char *Argv0) {
+  std::string BasePathStr = BasePath.str();
+  const char *Args[] = {Argv0, "-cc1modbuildd", BasePathStr.c_str(), nullptr};
+  pid_t pid;
+  int EC = posix_spawn(&pid, Args[0],
+                       /*file_actions*/ nullptr,
+                       /*spawnattr*/ nullptr, const_cast<char **>(Args),
+                       /*envp*/ nullptr);
+  if (EC)
+    return createStringError(std::error_code(EC, std::generic_category()),
+                             "failed to spawn module build daemon");
+
+  return llvm::Error::success();
+}
+
+Expected<int> cc1modbuildd::getModuleBuildDaemon(const char *Argv0,
+                                                 StringRef BasePath) {
+
+  SmallString<128> SocketPath = BasePath;
+  llvm::sys::path::append(SocketPath, SOCKET_FILE_NAME);
+
+  if (llvm::sys::fs::exists(SocketPath)) {
+    Expected<int> MaybeFD = connectToSocket(SocketPath);
+    if (MaybeFD)
+      return std::move(*MaybeFD);
+    consumeError(MaybeFD.takeError());
+  }
+
+  if (llvm::Error Err = cc1modbuildd::spawnModuleBuildDaemon(BasePath, Argv0))
+    return std::move(Err);
+
+  const unsigned int MICROSEC_IN_SEC = 1000000;
+  constexpr unsigned int MAX_TIME = 30 * MICROSEC_IN_SEC;
+  const unsigned short INTERVAL = 100;
+
+  unsigned int CumulativeTime = 0;
+  unsigned int WaitTime = 0;
+
+  while (CumulativeTime <= MAX_TIME) {
+    // Wait a bit then check to see if the module build daemon has initialized
+    usleep(WaitTime);
+
+    if (llvm::sys::fs::exists(SocketPath)) {
+      Expected<int> MaybeFD = connectToSocket(SocketPath);
+      if (MaybeFD)
+        return std::move(*MaybeFD);
+      consumeError(MaybeFD.takeError());
+    }
+
+    CumulativeTime += INTERVAL;
+  }
+
+  // After waiting 30 seconds give up
+  return llvm::make_error<StringError>(
+      "Module build daemon could not be spawned", inconvertibleErrorCode());
+}
+
+llvm::Error
+cc1modbuildd::handshakeModuleBuildDaemon(const CompilerInvocation &Clang,
+                                         const char *Argv0) {
+
+  // The module build daemon stores all output files and its socket address
+  // under BasePath. Either set BasePath to a user provided option or create an
+  // appropriate BasePath based on the hash of the clang version
+  std::string BasePath;
+  if (!Clang.getFrontendOpts().ModuleBuildDaemonPath.empty())
+    BasePath = Clang.getFrontendOpts().ModuleBuildDaemonPath;
+  else
+    BasePath = cc1modbuildd::getBasePath();
+
+  // If module build daemon does not exist spawn module build daemon
+  Expected<int> MaybeDaemonFD =
+      cc1modbuildd::getModuleBuildDaemon(Argv0, BasePath);
+  if (!MaybeDaemonFD)
+    return makeStringError(MaybeDaemonFD.takeError(),
+                           "Attempt to connect to daemon has failed: ");
+  int DaemonFD = std::move(*MaybeDaemonFD);
+
+  if (llvm::Error HandshakeErr = attemptHandshake(DaemonFD))
+    return makeStringError(std::move(HandshakeErr),
+                           "Attempted hadshake with daemon has failed: ");
+
+  return llvm::Error::success();
+}
+
+#endif // LLVM_ON_UNIX
diff --git a/clang/lib/Tooling/ModuleBuildDaemon/SocketSupport.cpp b/clang/lib/Tooling/ModuleBuildDaemon/SocketSupport.cpp
new file mode 100644
index 000000000000000..58526e4422f457b
--- /dev/null
+++ b/clang/lib/Tooling/ModuleBuildDaemon/SocketSupport.cpp
@@ -0,0 +1,128 @@
+//===------------------------- SocketSupport.cpp --------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/L...
[truncated]

@cpsughrue cpsughrue force-pushed the MBD_handshake branch 3 times, most recently from ecea06a to 9c8e6c7 Compare September 27, 2023 18:53
@github-actions
Copy link

github-actions bot commented Sep 29, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@cpsughrue cpsughrue force-pushed the MBD_handshake branch 2 times, most recently from 7f57561 to ca3d6f9 Compare September 30, 2023 14:50
@llvmbot llvmbot added the clang:frontend Language frontend issues, e.g. anything involving "Sema" label Sep 30, 2023
Copy link
Collaborator

@cachemeifyoucan cachemeifyoucan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't read the PR in details but one thing we found difficult to do is how to write a test for daemon spawning logic, because if anything went wrong, you left an unbounded daemon process on CI node.

The strategy in this PR (try to kill it after the test finishes) definitely doesn't work because 1. if anything failed, you will not hit kill. 2. you are going to kill the daemon from a different test that runs in parallel.

Other things we tried and know didn't work:

  • explicitly start/stop daemon with command: you might not hit stop due to failure
  • have a timeout and let the daemon terminate after that: you can have some infinite loop that prevents daemon cleanup

What we ended up is to have a server process that spawns clang job that needs to talk to it. We eliminated the run off daemons but we don't really have coverage for how clang spawns a daemon when needed.

@cpsughrue
Copy link
Contributor Author

I haven't read the PR in details but one thing we found difficult to do is how to write a test for daemon spawning logic, because if anything went wrong, you left an unbounded daemon process on CI node.

The strategy in this PR (try to kill it after the test finishes) definitely doesn't work because 1. if anything failed, you will not hit kill. 2. you are going to kill the daemon from a different test that runs in parallel.

  1. Great point, I did not think of that.
  2. The tests are set up so that each one spawns, interacts with, and kills its own daemon, so parallel tests shouldn't kill one another's daemon.

Other things we tried and know didn't work:

  • explicitly start/stop daemon with command: you might not hit stop due to failure
  • have a timeout and let the daemon terminate after that: you can have some infinite loop that prevents daemon cleanup

What we ended up is to have a server process that spawns clang job that needs to talk to it. We eliminated the run off daemons but we don't really have coverage for how clang spawns a daemon when needed.

Implementing a timeout as part of this patch as a safety measure seems worth while too but I'm not sure I understand your final solution. You all spawned a clang job that had to communicate with the daemon to make sure one did not exist?

@iains
Copy link
Contributor

iains commented Oct 8, 2023

A possible mechanism might be to have a watchdog process, with which instances of the server are registered with a time-to-live. I guess this might be easier to construct as a scheme for test-suites than for end-user.

@cachemeifyoucan
Copy link
Collaborator

Implementing a timeout as part of this patch as a safety measure seems worth while too but I'm not sure I understand your final solution. You all spawned a clang job that had to communicate with the daemon to make sure one did not exist?

It is something like this: https://github.com/apple/llvm-project/blob/next/clang/test/CAS/fdepscan-daemon.c

The "daemon" binary can be launched as a normal process and it takes -- option, which has the arguments of the clang process it needs to run that talk to itself. The lifetime of the "daemon" is now bounded to that normal process and will not be left behind after testing is done.

Copy link
Contributor

@iains iains left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a couple of questions. I guess my main concern here is that the TODO for portability might turn out to be quite heavy lifting. Perhaps we should be looking to abstract the socket-like interface at a lower level.

@Bigcheese
Copy link
Contributor

Implementing a timeout as part of this patch as a safety measure seems worth while too but I'm not sure I understand your final solution. You all spawned a clang job that had to communicate with the daemon to make sure one did not exist?

It is something like this: https://github.com/apple/llvm-project/blob/next/clang/test/CAS/fdepscan-daemon.c

The "daemon" binary can be launched as a normal process and it takes -- option, which has the arguments of the clang process it needs to run that talk to itself. The lifetime of the "daemon" is now bounded to that normal process and will not be left behind after testing is done.

I agree with doing this for all the non daemon launch tests. For those you can just run clang in a way that lit doesn't care if it fails, kill the daemon, then check the output of clang.

It doesn't really need to do a fully clean shutdown as no other process should be connecting, it can just call shutdownDaemon.

@Bigcheese
Copy link
Contributor

I had a couple of questions. I guess my main concern here is that the TODO for portability might turn out to be quite heavy lifting. Perhaps we should be looking to abstract the socket-like interface at a lower level.

We have someone working on a simple socket library for LLVM support that can handle the portability issues, plan is to land that first.

cpsughrue added 28 commits July 24, 2024 10:58
…efault state after ServerListener is destructed
…ext daemon can remove the file before trying to bind
…indows and improve remove socket file capabilities
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category llvm:support platform:windows
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants