Skip to content

[clang] Enable making the module build stack thread-safe #137059

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 24, 2025

Conversation

jansvoboda11
Copy link
Contributor

This PR makes another piece of the CompilerInstance::cloneForModuleCompile() result thread-safe: the module build stack. This data structure is used to detect cyclic dependencies between modules. The problem is that it uses FullSourceLoc which refers to the SourceManager of the parent CompilerInstance: if two threads happen to execute CompilerInstances cloned from the same parent concurrently, and both discover a dependency cycle, they may concurrently access the parent SourceManager when emitting the diagnostic, creating a data race.

In this PR, we prevent this by keeping the stack empty and moving the responsibility of cycle detection to the client. The client can recreate the same module build stack externally and ensure thread-safety by enforcing mutual exclusion.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Apr 23, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 23, 2025

@llvm/pr-subscribers-clang

Author: Jan Svoboda (jansvoboda11)

Changes

This PR makes another piece of the CompilerInstance::cloneForModuleCompile() result thread-safe: the module build stack. This data structure is used to detect cyclic dependencies between modules. The problem is that it uses FullSourceLoc which refers to the SourceManager of the parent CompilerInstance: if two threads happen to execute CompilerInstances cloned from the same parent concurrently, and both discover a dependency cycle, they may concurrently access the parent SourceManager when emitting the diagnostic, creating a data race.

In this PR, we prevent this by keeping the stack empty and moving the responsibility of cycle detection to the client. The client can recreate the same module build stack externally and ensure thread-safety by enforcing mutual exclusion.


Full diff: https://github.com/llvm/llvm-project/pull/137059.diff

1 Files Affected:

  • (modified) clang/lib/Frontend/CompilerInstance.cpp (+9-5)
diff --git a/clang/lib/Frontend/CompilerInstance.cpp b/clang/lib/Frontend/CompilerInstance.cpp
index 8596dd03148e8..1526ea53add7d 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -1240,11 +1240,15 @@ std::unique_ptr<CompilerInstance> CompilerInstance::cloneForModuleCompileImpl(
   Instance.createSourceManager(Instance.getFileManager());
   SourceManager &SourceMgr = Instance.getSourceManager();
 
-  // Note that this module is part of the module build stack, so that we
-  // can detect cycles in the module graph.
-  SourceMgr.setModuleBuildStack(getSourceManager().getModuleBuildStack());
-  SourceMgr.pushModuleBuildStack(ModuleName,
-                                 FullSourceLoc(ImportLoc, getSourceManager()));
+  if (ThreadSafeConfig) {
+    // Detecting cycles in the module graph is responsibility of the client.
+  } else {
+    // Note that this module is part of the module build stack, so that we
+    // can detect cycles in the module graph.
+    SourceMgr.setModuleBuildStack(getSourceManager().getModuleBuildStack());
+    SourceMgr.pushModuleBuildStack(
+        ModuleName, FullSourceLoc(ImportLoc, getSourceManager()));
+  }
 
   // Make a copy for the new instance.
   Instance.FailedModules = FailedModules;

@jansvoboda11 jansvoboda11 merged commit 8832a59 into llvm:main Apr 24, 2025
13 checks passed
@jansvoboda11 jansvoboda11 deleted the thread-safe-module-build-stack branch April 24, 2025 17:18
@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 24, 2025

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime-2 running on rocm-worker-hw-02 while building clang at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/10/builds/4080

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libarcher :: races/lock-unrelated.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 13
/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang -fopenmp  -gdwarf-4 -O1 -fsanitize=thread  -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src   /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/deflake.bash /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp 2>&1 | tee /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp.log | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c
# executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang -fopenmp -gdwarf-4 -O1 -fsanitize=thread -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp -latomic
# note: command had no output on stdout or stderr
# executed command: env TSAN_OPTIONS=ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1 /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/deflake.bash /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp
# note: command had no output on stdout or stderr
# executed command: tee /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp.log
# note: command had no output on stdout or stderr
# executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c
# .---command stderr------------
# | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c:47:11: error: CHECK: expected string not found in input
# | // CHECK: ThreadSanitizer: reported {{[1-7]}} warnings
# |           ^
# | <stdin>:26:5: note: scanning from here
# | DONE
# |     ^
# | <stdin>:27:1: note: possible intended match here
# | ThreadSanitizer: thread T4 finished with ignores enabled, created at:
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             .
# |             .
# |             .
# |            21:  #0 pthread_create /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1045:3 (lock-unrelated.c.tmp+0xa2c1a) 
# |            22:  #1 __kmp_create_worker z_Linux_util.cpp (libomp.so+0xcac42) 
# |            23:  
# |            24: SUMMARY: ThreadSanitizer: data race /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c:31:8 in main.omp_outlined_debug__ 
# |            25: ================== 
# |            26: DONE 
# | check:47'0         X error: no match found
# |            27: ThreadSanitizer: thread T4 finished with ignores enabled, created at: 
# | check:47'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | check:47'1     ?                                                                      possible intended match
# |            28:  #0 pthread_create /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1045:3 (lock-unrelated.c.tmp+0xa2c1a) 
# | check:47'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            29:  #1 __kmp_create_worker z_Linux_util.cpp (libomp.so+0xcac42) 
# | check:47'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            30:  
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 24, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-5 while building clang at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/18922

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli -jit-kind=orc-lazy -compile-threads=2 -thread-entry hello /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll # RUN: at line 1
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli -jit-kind=orc-lazy -compile-threads=2 -thread-entry hello /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
 #0 0x0000000103491600 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100ee1600)
 #1 0x000000010348f684 llvm::sys::RunSignalHandlers() (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100edf684)
 #2 0x0000000103491cbc SignalHandler(int, __siginfo*, void*) (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100ee1cbc)
 #3 0x000000018db57584 (/usr/lib/system/libsystem_platform.dylib+0x18047b584)
 #4 0x000000018db2621c (/usr/lib/system/libsystem_pthread.dylib+0x18044a21c)
 #5 0x000000018da4cad0 (/usr/lib/libc++.1.dylib+0x180370ad0)
 #6 0x0000000103031d68 void llvm::detail::UniqueFunctionBase<void, llvm::Expected<llvm::DenseMap<llvm::orc::SymbolStringPtr, llvm::orc::ExecutorSymbolDef, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>, llvm::detail::DenseMapPair<llvm::orc::SymbolStringPtr, llvm::orc::ExecutorSymbolDef>>>>::CallImpl<llvm::orc::Platform::lookupInitSymbols(llvm::orc::ExecutionSession&, llvm::DenseMap<llvm::orc::JITDylib*, llvm::orc::SymbolLookupSet, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::orc::SymbolLookupSet>> const&)::$_45>(void*, llvm::Expected<llvm::DenseMap<llvm::orc::SymbolStringPtr, llvm::orc::ExecutorSymbolDef, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>, llvm::detail::DenseMapPair<llvm::orc::SymbolStringPtr, llvm::orc::ExecutorSymbolDef>>>&) (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100a81d68)
 #7 0x000000010302d99c llvm::orc::AsynchronousSymbolQuery::handleComplete(llvm::orc::ExecutionSession&)::RunQueryCompleteTask::run() (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100a7d99c)
 #8 0x00000001030e9bfc void* std::__1::__thread_proxy[abi:un170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, llvm::orc::DynamicThreadPoolTaskDispatcher::dispatch(std::__1::unique_ptr<llvm::orc::Task, std::__1::default_delete<llvm::orc::Task>>)::$_0>>(void*) (/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/lli+0x100b39bfc)
 #9 0x000000018db26f94 (/usr/lib/system/libsystem_pthread.dylib+0x18044af94)
#10 0x000000018db21d34 (/usr/lib/system/libsystem_pthread.dylib+0x180445d34)
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll

--

********************


jansvoboda11 added a commit to jansvoboda11/llvm-project that referenced this pull request Apr 24, 2025
jansvoboda11 added a commit that referenced this pull request Apr 29, 2025
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
This PR makes another piece of the
`CompilerInstance::cloneForModuleCompile()` result thread-safe: the
module build stack. This data structure is used to detect cyclic
dependencies between modules. The problem is that it uses
`FullSourceLoc` which refers to the `SourceManager` of the parent
`CompilerInstance`: if two threads happen to execute `CompilerInstance`s
cloned from the same parent concurrently, and both discover a dependency
cycle, they may concurrently access the parent `SourceManager` when
emitting the diagnostic, creating a data race.

In this PR, we prevent this by keeping the stack empty and moving the
responsibility of cycle detection to the client. The client can recreate
the same module build stack externally and ensure thread-safety by
enforcing mutual exclusion.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
This PR makes another piece of the
`CompilerInstance::cloneForModuleCompile()` result thread-safe: the
module build stack. This data structure is used to detect cyclic
dependencies between modules. The problem is that it uses
`FullSourceLoc` which refers to the `SourceManager` of the parent
`CompilerInstance`: if two threads happen to execute `CompilerInstance`s
cloned from the same parent concurrently, and both discover a dependency
cycle, they may concurrently access the parent `SourceManager` when
emitting the diagnostic, creating a data race.

In this PR, we prevent this by keeping the stack empty and moving the
responsibility of cycle detection to the client. The client can recreate
the same module build stack externally and ensure thread-safety by
enforcing mutual exclusion.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
This PR makes another piece of the
`CompilerInstance::cloneForModuleCompile()` result thread-safe: the
module build stack. This data structure is used to detect cyclic
dependencies between modules. The problem is that it uses
`FullSourceLoc` which refers to the `SourceManager` of the parent
`CompilerInstance`: if two threads happen to execute `CompilerInstance`s
cloned from the same parent concurrently, and both discover a dependency
cycle, they may concurrently access the parent `SourceManager` when
emitting the diagnostic, creating a data race.

In this PR, we prevent this by keeping the stack empty and moving the
responsibility of cycle detection to the client. The client can recreate
the same module build stack externally and ensure thread-safety by
enforcing mutual exclusion.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
This PR makes another piece of the
`CompilerInstance::cloneForModuleCompile()` result thread-safe: the
module build stack. This data structure is used to detect cyclic
dependencies between modules. The problem is that it uses
`FullSourceLoc` which refers to the `SourceManager` of the parent
`CompilerInstance`: if two threads happen to execute `CompilerInstance`s
cloned from the same parent concurrently, and both discover a dependency
cycle, they may concurrently access the parent `SourceManager` when
emitting the diagnostic, creating a data race.

In this PR, we prevent this by keeping the stack empty and moving the
responsibility of cycle detection to the client. The client can recreate
the same module build stack externally and ensure thread-safety by
enforcing mutual exclusion.
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants