Update ROOT's llvm to llvm13. #10294

Merged on Dec 9, 2022 (154 commits)

vgvassilev
Member

@vgvassilev vgvassilev commented Apr 1, 2022

Things we need to do before merging this PR; they can probably be done by various people in parallel:

Cling standalone:

  • Fix cling CUDA tests
  • Fix the remaining test failures (6, see below)
  • Revert the commit 'FIXME: Undo this change and debug why we have PendingInstances.'
Cling test failures

Failures in master on my system:

    Cling :: CodeUnloading/PCH/VTables.C
    Cling :: DynamicLibraryManager/callable_lib_L_AB_order1.C

Remaining failures (excluding the ones above):

  Cling :: CodeGeneration/Symbols.C
  Cling :: CodeUnloading/AtExit.C
  Cling :: CodeUnloading/PCH/VTablesClingPCH.C
  Cling :: CodeUnloading/RereadFile.C
  Cling :: ErrorRecovery/StoredState.C
  Cling :: MultipleInterpreters/MultipleInterpreters.C

ROOT:

  • Compare the build size against master
  • Compare the .pcm file size against master
  • Add flags to ignore compilation warnings coming from llvm
  • Remove the FIXME from commit 'Add another symbol generator to resolve the generated lazy symbol' - the explanation is in the commit
  • Fix the llvm::StringRef conversion failures on OSX
Binary size: this PR needs 13% more space (2.3 vs 2.0 GB)
du -hs root-release-llvm13
2.3G	.
(base) vvassilev@vv-nuc /build/vvassilev/root-release-llvm13 $ du -hs ../root-release-master/
2.0G	../root-release-master/
Module files need ~5% more space on disk (215 vs 206 MB)
diff -y llvm13 master 
424K	lib/ASImageGui.pcm				      |	444K	lib/ASImageGui.pcm
468K	lib/ASImage.pcm					      |	484K	lib/ASImage.pcm
4.2M	lib/_Builtin_intrinsics.pcm			      |	4.0M	lib/_Builtin_intrinsics.pcm
48K	lib/_Builtin_stddef_max_align_t.pcm		      |	44K	lib/_Builtin_stddef_max_align_t.pcm
200K	lib/Cling_Runtime_Extra.pcm			      |	132K	lib/Cling_Runtime_Extra.pcm
100K	lib/Cling_Runtime.pcm					100K	lib/Cling_Runtime.pcm
11M	lib/Core.pcm					      |	9.6M	lib/Core.pcm
564K	lib/EG.pcm					      |	584K	lib/EG.pcm
5.7M	lib/Eve.pcm					      |	5.4M	lib/Eve.pcm
652K	lib/FitPanel.pcm				      |	656K	lib/FitPanel.pcm
504K	lib/Foam.pcm					      |	520K	lib/Foam.pcm
440K	lib/Fumili.pcm					      |	460K	lib/Fumili.pcm
1.2M	lib/Gdml.pcm						1.2M	lib/Gdml.pcm
960K	lib/Ged.pcm					      |	968K	lib/Ged.pcm
432K	lib/Genetic.pcm					      |	456K	lib/Genetic.pcm
2.9M	lib/GenVector.pcm				      |	2.8M	lib/GenVector.pcm
868K	lib/GeomBuilder.pcm				      |	876K	lib/GeomBuilder.pcm
500K	lib/GeomPainter.pcm				      |	520K	lib/GeomPainter.pcm
3.4M	lib/Geom.pcm					      |	3.3M	lib/Geom.pcm
860K	lib/Gpad.pcm						860K	lib/Gpad.pcm
836K	lib/Graf3d.pcm					      |	844K	lib/Graf3d.pcm
1.0M	lib/Graf.pcm						1.0M	lib/Graf.pcm
540K	lib/GuiBld.pcm					      |	556K	lib/GuiBld.pcm
588K	lib/GuiHtml.pcm					      |	604K	lib/GuiHtml.pcm
3.5M	lib/Gui.pcm					      |	3.4M	lib/Gui.pcm
496K	lib/Gviz3d.pcm					      |	516K	lib/Gviz3d.pcm
468K	lib/GX11.pcm					      |	484K	lib/GX11.pcm
412K	lib/GX11TTF.pcm					      |	432K	lib/GX11TTF.pcm
3.6M	lib/HistFactory.pcm				      |	3.4M	lib/HistFactory.pcm
484K	lib/HistPainter.pcm				      |	500K	lib/HistPainter.pcm
5.9M	lib/Hist.pcm					      |	5.7M	lib/Hist.pcm
1.5M	lib/Html.pcm						1.5M	lib/Html.pcm
1.8M	lib/Imt.pcm					      |	1.7M	lib/Imt.pcm
1.9M	lib/libc.pcm						1.9M	lib/libc.pcm
12M	lib/MathCore.pcm				      |	11M	lib/MathCore.pcm
1.6M	lib/Matrix.pcm						1.6M	lib/Matrix.pcm
3.1M	lib/Minuit2.pcm					      |	3.0M	lib/Minuit2.pcm
544K	lib/Minuit.pcm					      |	560K	lib/Minuit.pcm
476K	lib/MLP.pcm					      |	496K	lib/MLP.pcm
1.2M	lib/MultiProc.pcm					1.2M	lib/MultiProc.pcm
1.1M	lib/Net.pcm						1.1M	lib/Net.pcm
712K	lib/NetxNG.pcm						712K	lib/NetxNG.pcm
728K	lib/Physics.pcm					      |	736K	lib/Physics.pcm
492K	lib/Postscript.pcm				      |	508K	lib/Postscript.pcm
564K	lib/ProofBench.pcm				      |	584K	lib/ProofBench.pcm
948K	lib/ProofDraw.pcm				      |	940K	lib/ProofDraw.pcm
1.6M	lib/Proof.pcm						1.6M	lib/Proof.pcm
732K	lib/ProofPlayer.pcm				      |	744K	lib/ProofPlayer.pcm
596K	lib/Quadp.pcm					      |	608K	lib/Quadp.pcm
392K	lib/RCsg.pcm					      |	412K	lib/RCsg.pcm
536K	lib/Recorder.pcm				      |	556K	lib/Recorder.pcm
5.4M	lib/RGL.pcm					      |	5.1M	lib/RGL.pcm
1.6M	lib/RHTTP.pcm					      |	1.5M	lib/RHTTP.pcm
412K	lib/RHTTPSniff.pcm				      |	436K	lib/RHTTPSniff.pcm
400K	lib/Rint.pcm					      |	420K	lib/Rint.pcm
2.6M	lib/RIO.pcm					      |	2.5M	lib/RIO.pcm
23M	lib/RooFitCore.pcm				      |	22M	lib/RooFitCore.pcm
1.1M	lib/RooFitHS3.pcm				      |	1008K	lib/RooFitHS3.pcm
16M	lib/RooFit.pcm					      |	15M	lib/RooFit.pcm
424K	lib/RooFitRDataFrameHelpers.pcm			      |	448K	lib/RooFitRDataFrameHelpers.pcm
4.3M	lib/RooStats.pcm				      |	4.1M	lib/RooStats.pcm
468K	lib/RootAuth.pcm				      |	484K	lib/RootAuth.pcm
120K	lib/ROOT_Config.pcm					120K	lib/ROOT_Config.pcm
15M	lib/ROOTDataFrame.pcm				      |	14M	lib/ROOTDataFrame.pcm
332K	lib/ROOT_Foundation_C.pcm				332K	lib/ROOT_Foundation_C.pcm
620K	lib/ROOT_Foundation_Stage1_NoRTTI.pcm		      |	600K	lib/ROOT_Foundation_Stage1_NoRTTI.pcm
140K	lib/ROOT_Rtypes.pcm					140K	lib/ROOT_Rtypes.pcm
4.1M	lib/ROOTTMVASofie.pcm					4.1M	lib/ROOTTMVASofie.pcm
412K	lib/ROOTTPython.pcm				      |	432K	lib/ROOTTPython.pcm
2.6M	lib/ROOTVecOps.pcm				      |	2.5M	lib/ROOTVecOps.pcm
652K	lib/SessionViewer.pcm				      |	668K	lib/SessionViewer.pcm
3.0M	lib/Smatrix.pcm					      |	2.9M	lib/Smatrix.pcm
436K	lib/SpectrumPainter.pcm				      |	456K	lib/SpectrumPainter.pcm
572K	lib/Spectrum.pcm				      |	584K	lib/Spectrum.pcm
424K	lib/SPlot.pcm					      |	440K	lib/SPlot.pcm
624K	lib/SQLIO.pcm					      |	640K	lib/SQLIO.pcm
18M	lib/std.pcm					      |	17M	lib/std.pcm
1.6M	lib/Thread.pcm					      |	1.5M	lib/Thread.pcm
568K	lib/TMVAGui.pcm					      |	588K	lib/TMVAGui.pcm
18M	lib/TMVA.pcm					      |	17M	lib/TMVA.pcm
2.6M	lib/Tree.pcm					      |	2.5M	lib/Tree.pcm
4.5M	lib/TreePlayer.pcm				      |	4.3M	lib/TreePlayer.pcm
668K	lib/TreeViewer.pcm				      |	684K	lib/TreeViewer.pcm
536K	lib/Unfold.pcm					      |	552K	lib/Unfold.pcm
424K	lib/X3d.pcm					      |	448K	lib/X3d.pcm
1.1M	lib/XMLIO.pcm					      |	1.0M	lib/XMLIO.pcm
444K	lib/XMLParser.pcm				      |	464K	lib/XMLParser.pcm

cc: @hahnjo, @Axel-Naumann

@hahnjo
Member

hahnjo commented Apr 5, 2022

I tested a bit on my end; I guess the llvm::StringRef conversion errors are the following:

/home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.cxx: In function ‘std::__cxx11::string GetSharedLibImmediateDepsSlow(std::__cxx11::string, cling::Interpreter*, bool)’:
/home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.cxx:7119:25: error: ambiguous overload for ‘operator==’ (operand types are ‘llvm::StringRef’ and ‘const char [20]’)
             if (SymName == "_Jv_RegisterClasses" ||
                 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/jhahnfel/ROOT/llvm13/src/core/base/inc/TNamed.h:26,
                 from /home/jhahnfel/ROOT/llvm13/src/core/meta/inc/TDictionary.h:44,
                 from /home/jhahnfel/ROOT/llvm13/src/core/meta/inc/TDataType.h:25,
                 from /home/jhahnfel/ROOT/llvm13/src/core/meta/inc/TInterpreter.h:25,
                 from /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.h:27,
                 from /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.cxx:20:
/home/jhahnfel/ROOT/llvm13/src/core/base/inc/TString.h:844:15: note: candidate: ‘Bool_t operator==(const string_view&, const char*)’
 inline Bool_t operator==(const std::string_view &s1, const char *s2)
               ^~~~~~~~
In file included from /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/tools/clang/include/clang/Basic/DiagnosticIDs.h:19,
                 from /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/tools/clang/include/clang/Basic/Diagnostic.h:17,
                 from /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/tools/clang/include/clang/AST/NestedNameSpecifier.h:18,
                 from /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/tools/clang/include/clang/AST/Type.h:21,
                 from /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TClingDeclInfo.h:20,
                 from /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TClingClassInfo.h:28,
                 from /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TClingBaseClassInfo.h:29,
                 from /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.cxx:24:
/home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/include/llvm/ADT/StringRef.h:919:15: note: candidate: ‘bool llvm::operator==(llvm::StringRef, llvm::StringRef)’
   inline bool operator==(StringRef LHS, StringRef RHS) {
               ^~~~~~~~

This happens with a C++17 build in general; C++14 is fine.

On the performance side, the current state seems to be very slow: ctest -j12 -R dataframe . used to take on the order of 2m30s; now I aborted it after 13 minutes. A lot of time seems to be spent in sys. Are you aware of changes that could explain this?

@vgvassilev
Member Author

I tested a bit on my end; I guess the llvm::StringRef conversion errors are the following:

[...]

I do not understand why. Calling .str() is often too expensive but we may need to do it here.

This happens with a C++17 build in general; C++14 is fine.

On the performance side, the current state seems to be very slow: ctest -j12 -R dataframe . used to take on the order of 2m30s; now I aborted it after 13 minutes. A lot of time seems to be spent in sys. Are you aware of changes that could explain this?

Can you paste the stack trace? I fear that the DynamicLibraryManagerSymbol.cpp stopped inlining functions due to some recent calls of ::StringRef::str()...

@hahnjo
Member

hahnjo commented Apr 5, 2022

I tested a bit on my end, I guess the llvm::StringRef conversion errors are the following:

[...]

I do not understand why. Calling .str() is often too expensive but we may need to do it here.

I think the problem is that const char* is convertible to both std::string_view and llvm::StringRef, while the latter two are convertible to each other. So the compiler can in principle do either conversion, and it's ambiguous which one should be preferred. Using llvm::StringRef explicitly in the three cases fixes the build for me:

diff --git a/core/metacling/src/TCling.cxx b/core/metacling/src/TCling.cxx
index 0900c4d62a..b288aef228 100644
--- a/core/metacling/src/TCling.cxx
+++ b/core/metacling/src/TCling.cxx
@@ -3164,10 +3164,11 @@ Bool_t TCling::IsLoaded(const char* filename) const
    llvm::StringRef(filesStr).split(files, "\n");

    std::set<std::string> fileMap;
+   llvm::StringRef file_name_ref(file_name);
    // Fill fileMap; return early on exact match.
    for (llvm::SmallVector<llvm::StringRef, 100>::const_iterator
            iF = files.begin(), iE = files.end(); iF != iE; ++iF) {
-      if ((*iF) == file_name.c_str()) return kTRUE; // exact match
+      if ((*iF) == file_name_ref) return kTRUE; // exact match
       fileMap.insert(iF->str());
    }

@@ -7116,9 +7117,12 @@ static std::string GetSharedLibImmediateDepsSlow(std::string lib,
             // FIXME: It is unclear whether we can ignore all weak undefined
             // symbols:
             // http://lists.llvm.org/pipermail/llvm-dev/2017-October/118177.html
-            if (SymName == "_Jv_RegisterClasses" ||
-               SymName == "_ITM_deregisterTMCloneTable" ||
-               SymName == "_ITM_registerTMCloneTable")
+            static constexpr llvm::StringRef RegisterClasses("_Jv_RegisterClasses");
+            static constexpr llvm::StringRef RegisterCloneTable("_ITM_registerTMCloneTable");
+            static constexpr llvm::StringRef DeregisterCloneTable("_ITM_deregisterTMCloneTable");
+            if (SymName == RegisterClasses ||
+               SymName == RegisterCloneTable ||
+               SymName == DeregisterCloneTable)
                continue;
          }

Do you want me to submit this separately, outside this PR?
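
For illustration, the ambiguity described above can be reproduced without LLVM or ROOT. This is only a sketch under stated assumptions: mock::Ref is a hypothetical stand-in for llvm::StringRef, the global operator== mimics the TString.h overload from the error message, and is_eq_comparable and isIgnoredSymbol are made-up helper names. The trait reports whether `a == b` compiles; an ambiguous overload counts as ill-formed.

```cpp
#include <cassert>
#include <string_view>
#include <type_traits>
#include <utility>

namespace mock {
// Hypothetical stand-in for llvm::StringRef: implicitly constructible from
// const char* and std::string_view, and implicitly convertible back.
class Ref {
  std::string_view sv_;

public:
  Ref(const char *s) : sv_(s) {}
  Ref(std::string_view sv) : sv_(sv) {}
  operator std::string_view() const { return sv_; }
  std::string_view view() const { return sv_; }
};

// Stand-in for llvm::operator==(StringRef, StringRef).
inline bool operator==(Ref lhs, Ref rhs) {
  return lhs.view().compare(rhs.view()) == 0;
}
} // namespace mock

// Stand-in for the global TString.h overload quoted in the error message.
inline bool operator==(const std::string_view &s1, const char *s2) {
  return s1.compare(s2) == 0;
}

// Detects whether `L == R` is well-formed. Overload resolution that ends in
// an ambiguity makes the expression ill-formed, so the primary template wins.
template <class L, class R, class = void>
struct is_eq_comparable : std::false_type {};
template <class L, class R>
struct is_eq_comparable<
    L, R, std::void_t<decltype(std::declval<L>() == std::declval<R>())>>
    : std::true_type {};

// Ref == "literal": each candidate needs one user-defined conversion on a
// different argument, so neither is better and the call is ambiguous.
static_assert(!is_eq_comparable<mock::Ref, const char (&)[20]>::value,
              "Ref == string literal is expected to be ambiguous");
// Ref == Ref: only the (Ref, Ref) overload is viable.
static_assert(is_eq_comparable<mock::Ref, mock::Ref>::value,
              "Ref == Ref is expected to compile");

// Same shape as the fix in the diff: name the right-hand side as a Ref first.
inline bool isIgnoredSymbol(mock::Ref symName) {
  const mock::Ref registerClasses("_Jv_RegisterClasses");
  return symName == registerClasses;
}
```

In this mock setup, dropping the std::string_view constructor and conversion operator from Ref makes the literal comparison compile again, which is consistent with the problem only showing up in C++17 builds.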

On the performance side, the current state seems to be very slow: ctest -j12 -R dataframe . used to take on the order of 2m30s; now I aborted it after 13 minutes. A lot of time seems to be spent in sys. Are you aware of changes that could explain this?

Can you paste the stack trace? I fear that the DynamicLibraryManagerSymbol.cpp stopped inlining functions due to some recent calls of ::StringRef::str()...

perf says a number of kernel functions and indeed cling::Dyld::ContainsSymbol are the largest contenders. I don't understand why, though: the annotations inside the functions make no sense to me (showing more than 50% on a mov %ebx,%r8d without anything obvious around it).

@vgvassilev
Member Author

I tested a bit on my end, I guess the llvm::StringRef conversion errors are the following:
[...]

I do not understand why. Calling .str() is often too expensive but we may need to do it here.

I think the problem is that const char* is convertible to both std::string_view and llvm::StringRef, while the latter two are convertible to each other. So the compiler can in principle do either conversion, and it's ambiguous which one should be preferred. Using llvm::StringRef explicitly in the three cases fixes the build for me:

[...]

Do you want me to submit this separately, outside this PR?

I think you should be able to submit as part of this PR.

On the performance side, the current state seems to be very slow: ctest -j12 -R dataframe . used to take on the order of 2m30s; now I aborted it after 13 minutes. A lot of time seems to be spent in sys. Are you aware of changes that could explain this?

Can you paste the stack trace? I fear that the DynamicLibraryManagerSymbol.cpp stopped inlining functions due to some recent calls of ::StringRef::str()...

perf says a number of kernel functions and indeed cling::Dyld::ContainsSymbol are the largest contenders. I don't understand why, though: the annotations inside the functions make no sense to me (showing more than 50% on a mov %ebx,%r8d without anything obvious around it).

Yeah, this is a bit tricky: the profiler is useful if you built in debug mode. What usually works is removing DynamicLibraryManagerSymbol.cpp.o, copying its build command, changing the compiler to clang, and adding [a variation of] -Rpass=inline -Rpass-missed=.* (https://godbolt.org/z/xGEsGhf97). Then you can compare the old and the new version for missed optimization opportunities.

My feeling is that I mechanically added .str() everywhere, whereas in some places we could have just used string_view, and here this matters...
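
As a rough sketch of why that can matter in a hot lookup path: the two hypothetical helpers below (containsSymbolCopying and containsSymbolView are made-up names, not Cling functions) return the same answer, but the first materializes an owning std::string per candidate, the way a mechanically added .str() would, while the second compares the views directly. It uses std::string_view in place of llvm::StringRef so that it builds without LLVM.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical lookup that copies every candidate into an owning std::string
// before comparing, mimicking a mechanically added .str() call.
inline bool containsSymbolCopying(const std::vector<std::string_view> &symbols,
                                  std::string_view name) {
  for (std::string_view candidate : symbols) {
    const std::string owned(candidate); // copies the characters, may allocate
    if (std::string_view(owned) == name)
      return true;
  }
  return false;
}

// Same lookup without the copy: the views are compared in place.
inline bool containsSymbolView(const std::vector<std::string_view> &symbols,
                               std::string_view name) {
  for (std::string_view candidate : symbols) {
    if (candidate == name)
      return true;
  }
  return false;
}
```

Whether the copies are what actually slows down the dataframe tests is not established here; the inlining remarks mentioned above would be the way to check.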

@hahnjo
Member

hahnjo commented Apr 8, 2022

The last three commits are for Cling's CUDA support. It still doesn't fully work on my machine, but the errors are the same as master with LLVM 9 (complains about not finding symbols from libcudart.so even though that has been loaded; could be related to the CUDA version?) without assertions that I see tripping on master. Do we know which setup used to work for these tests? Maybe I'll have to install older versions of CUDA...

@vgvassilev
Member Author

vgvassilev commented Apr 8, 2022

The last three commits are for Cling's CUDA support. It still doesn't fully work on my machine, but the errors are the same as master with LLVM 9 (complains about not finding symbols from libcudart.so even though that has been loaded; could be related to the CUDA version?) without assertions that I see tripping on master. Do we know which setup used to work for these tests? Maybe I'll have to install older versions of CUDA...

That sounds pretty good! I remember @SimeonEhrig mentioning some issues when loading the cuda library.

PS: If the CUDA test state is the same as in master, maybe we can go hunting the ROOT test failures first and come back to CUDA afterwards?

@hahnjo
Member

hahnjo commented Apr 13, 2022

Yes, I agree that we should now focus on the remaining test failures, both in Cling and ROOT. For the "file name too long" error when building with GCC, I've posted #10387 (we'll have to rebase this PR afterwards and change a number of the new .str() calls).

I also started looking into the slow JIT for RDF, and I noticed that it hangs completely when ROOT is built with C++17. The stack trace of a stuck ./tree/dataframe/test/dataframe_interface --gtest_filter=RDataFrameInterface.GetFilterNamesFromNode is:

#0  0x00007ffff697681d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007ffff696fac9 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00007fffefd8e953 in __gthread_mutex_lock (__mutex=0x8eb600) at /usr/bin/../lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux/bits/gthr-default.h:748
#3  std::mutex::lock (this=0x8eb600) at /usr/bin/../lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/bits/std_mutex.h:103
#4  std::unique_lock<std::mutex>::lock (this=<optimized out>) at /usr/bin/../lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/bits/std_mutex.h:267
#5  std::unique_lock<std::mutex>::unique_lock (__m=..., this=<optimized out>) at /usr/bin/../lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/bits/std_mutex.h:197
#6  llvm::orc::ExecutionSession::OL_applyQueryPhase1 (this=this@entry=0xbcb8b0, IPLS=std::unique_ptr<llvm::orc::InProgressLookupState> = {...}, Err=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/Core.cpp:2295
#7  0x00007fffefd8c8ec in llvm::orc::ExecutionSession::lookup(llvm::orc::LookupKind, std::vector<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags>, std::allocator<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags> > > const&, llvm::orc::SymbolLookupSet, llvm::orc::SymbolState, llvm::unique_function<void (llvm::Expected<llvm::DenseMap<llvm::orc::SymbolStringPtr, llvm::JITEvaluatedSymbol, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr>, llvm::detail::DenseMapPair<llvm::orc::SymbolStringPtr, llvm::JITEvaluatedSymbol> > >)>, std::function<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr> >, llvm::DenseMapInfo<llvm::orc::JITDylib*>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr> > > > const&)>) (this=0x8eb600, this@entry=0xbcb8b0, K=K@entry=llvm::orc::LookupKind::Static, SearchOrder=std::vector of length 1, capacity 1 = {...}, Symbols=..., RequiredState=RequiredState@entry=llvm::orc::SymbolState::Ready, NotifyComplete=..., 
    RegisterDependencies=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/Core.cpp:1974
#8  0x00007fffefd9a9e6 in llvm::orc::Platform::lookupInitSymbols (ES=..., InitSyms=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/Core.cpp:1723
#9  0x00007fffefdc25a2 in (anonymous namespace)::GenericLLVMIRPlatformSupport::issueInitLookups (this=0x8ea890, JD=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/LLJIT.cpp:384
#10 (anonymous namespace)::GenericLLVMIRPlatformSupport::getInitializers (this=0x8ea890, JD=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/LLJIT.cpp:262
#11 (anonymous namespace)::GenericLLVMIRPlatformSupport::initialize (this=0x8ea890, JD=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/LLJIT.cpp:215
#12 0x00007fffeeace5f9 in llvm::orc::LLJIT::initialize (this=0x8eb360, JD=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/include/llvm/ExecutionEngine/Orc/LLJIT.h:155
#13 0x00007fffeeaccc6c in cling::IncrementalJIT::runCtors (this=0xfffffffffffffe00) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/IncrementalJIT.h:74
#14 cling::IncrementalExecutor::runStaticInitializersOnce (this=0x698d80, T=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/IncrementalExecutor.cpp:260
#15 0x00007fffeea63d43 in cling::Interpreter::executeTransaction (this=<optimized out>, T=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/Interpreter.cpp:1714
#16 0x00007fffeea6da53 in cling::IncrementalParser::commitTransaction (this=0x4f6670, PRT=..., ClearDiagClient=<optimized out>) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/IncrementalParser.cpp:675
#17 0x00007fffeea5dd71 in cling::Interpreter::PushTransactionRAII::pop (this=0x7fffffffb0c0) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/Interpreter.cpp:116
#18 cling::Interpreter::PushTransactionRAII::~PushTransactionRAII (this=0x7fffffffb0c0) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/Interpreter.cpp:106
#19 0x00007fffee9f9d90 in ClingMemberIterInternal::DCIter::DCIter (this=0x7fffffffb140, DC=<optimized out>, interp=<optimized out>) at /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TClingMemberIter.cxx:33
#20 0x00007fffee9f649b in TClingMemberIter::TClingMemberIter (this=0x7fffffffb128, interp=0x0, DC=0x80) at /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TClingMemberIter.h:145
#21 TClingDataMemberIter::TClingDataMemberIter (this=0x7fffffffb128, interp=0x0, DC=0x80, selection=TDictionary::EMemberSelection::kAlsoUsingDecls) at /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TClingDataMemberInfo.h:66
#22 TClingDataMemberInfo::TClingDataMemberInfo (this=0xa1b5350, interp=0x4f17a0, ci=0xaacabd0, selection=TDictionary::EMemberSelection::kAlsoUsingDecls) at /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TClingDataMemberInfo.cxx:115
#23 0x00007fffee96cf5b in TCling::DataMemberInfo_Factory (this=0x4f0e00, clinfo=0xaacabd0, selection=TDictionary::EMemberSelection::kAlsoUsingDecls) at /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.cxx:8508
#24 0x00007ffff685869f in TListOfDataMembers::Load (this=0x16a61c0) at /home/jhahnfel/ROOT/llvm13/src/core/meta/src/TListOfDataMembers.cxx:469
#25 0x00007ffff675c6f7 in TROOT::GetListOfGlobals (this=0x7ffff6932358 <ROOT::Internal::GetROOT1()::alloc>, load=true) at /home/jhahnfel/ROOT/llvm13/src/core/base/src/TROOT.cxx:1760
#26 0x00007fffebd32f70 in _GLOBAL__sub_I_clingwrapper.cxx () at /home/jhahnfel/ROOT/llvm13/src/bindings/pyroot/cppyy/cppyy-backend/clingwrapper/src/clingwrapper.cxx:289
#27 0x00007ffff7de3e0a in call_init (l=<optimized out>, argc=argc@entry=2, argv=argv@entry=0x7fffffffddf8, env=env@entry=0x41d000) at dl-init.c:72
#28 0x00007ffff7de3f0a in call_init (env=0x41d000, argv=0x7fffffffddf8, argc=2, l=<optimized out>) at dl-init.c:30
#29 _dl_init (main_map=0xaaed240, argc=2, argv=0x7fffffffddf8, env=0x41d000) at dl-init.c:119
#30 0x00007ffff59a61bc in _dl_catch_exception () from /lib64/libc.so.6
#31 0x00007ffff7de7b2e in dl_open_worker (a=0x7fffffffbc10) at dl-open.c:819
#32 dl_open_worker (a=0x7fffffffbc10) at dl-open.c:782
#33 0x00007ffff59a6164 in _dl_catch_exception () from /lib64/libc.so.6
#34 0x00007ffff7de7d11 in _dl_open (file=0xaaee820 "/home/jhahnfel/ROOT/llvm13/build-cling-clang/lib/libcppyy3_6.so", mode=<optimized out>, caller_dlopen=0x7fffeeb69fe8 <cling::utils::platform::DLOpen(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)+24>, 
    nsid=-2, argc=2, argv=<optimized out>, env=0x41d000) at dl-open.c:900
#35 0x00007ffff48011ea in dlopen_doit () from /lib64/libdl.so.2
#36 0x00007ffff59a6164 in _dl_catch_exception () from /lib64/libc.so.6
#37 0x00007ffff59a6223 in _dl_catch_error () from /lib64/libc.so.6
#38 0x00007ffff4801969 in _dlerror_run () from /lib64/libdl.so.2
#39 0x00007ffff480128a in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#40 0x00007fffeeb69fe8 in cling::utils::platform::DLOpen (Path="/home/jhahnfel/ROOT/llvm13/build-cling-clang/lib/libcppyy3_6.so", Err=Err@entry=0x7fffffffbf40) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Utils/PlatformPosix.cpp:118
#41 0x00007fffeea4cc41 in cling::DynamicLibraryManager::loadLibrary (this=0x698e90, libStem=..., permanent=<optimized out>, resolved=true) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/DynamicLibraryManager.cpp:373
#42 0x00007fffee9496fa in TCling::LibraryLoadingFailed (this=0x4f0e00, errmessage=..., libStem="/home/jhahnfel/ROOT/llvm13/build-cling-clang/lib/libcppyy3_6.so", permanent=false, resolved=false) at /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.cxx:6492
#43 0x00007fffeea69296 in cling::MultiplexInterpreterCallbacks::LibraryLoadingFailed (this=<optimized out>, errmessage="/home/jhahnfel/ROOT/llvm13/build-cling-clang/lib/libcppyy3_6.so: undefined symbol: _Py_NoneStruct", libStem="/home/jhahnfel/ROOT/llvm13/build-cling-clang/lib/libcppyy3_6.so", permanent=<optimized out>, resolved=<optimized out>)
    at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/MultiplexInterpreterCallbacks.h:102
#44 0x00007fffeea4ce05 in cling::DynamicLibraryManager::loadLibrary (this=<optimized out>, libStem=..., permanent=<optimized out>, resolved=true) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/DynamicLibraryManager.cpp:377
#45 0x00007fffee95d74a in TCling::Load (this=0x4f0e00, filename=0xa95fab0 "/home/jhahnfel/ROOT/llvm13/build-cling-clang/lib/libcppyy3_6.so", system=true) at /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.cxx:3477
#46 0x00007ffff67c18a2 in TSystem::Load (this=0x41b800, module=0xa95f920 "/home/jhahnfel/ROOT/llvm13/build-cling-clang/lib/libcppyy3_6.so", entry=0x7fffedaa9527 "", system=true) at /home/jhahnfel/ROOT/llvm13/src/core/base/src/TSystem.cxx:1942
#47 0x00007fffee94ca44 in TCling::LazyFunctionCreatorAutoload(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)::$_3::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const (LibName="/home/jhahnfel/ROOT/llvm13/build-cling-clang/lib/libcppyy3_6.so", this=<optimized out>)
    at /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.cxx:6513
#48 TCling::LazyFunctionCreatorAutoload (this=<optimized out>, mangled_name="_ZNSt11char_traitsIcE6lengthEPKc") at /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.cxx:6539
#49 0x00007fffeeacca15 in cling::IncrementalExecutor::NotifyLazyFunctionCreators (this=0x698d80, mangled_name="_ZNSt11char_traitsIcE6lengthEPKc") at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/IncrementalExecutor.cpp:195
#50 0x00007fffeead4997 in cling::HostLookupLazyFallbackGenerator::tryToGenerate (this=0x7fbc70, LS=..., K=<optimized out>, JD=..., JDLookupFlags=<optimized out>, Symbols=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/IncrementalJIT.cpp:78
#51 0x00007fffefd8f246 in llvm::orc::ExecutionSession::OL_applyQueryPhase1 (this=this@entry=0xbcb8b0, IPLS=std::unique_ptr<llvm::orc::InProgressLookupState> = {...}, Err=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/Core.cpp:2367
#52 0x00007fffefd8c8ec in llvm::orc::ExecutionSession::lookup(llvm::orc::LookupKind, std::vector<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags>, std::allocator<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags> > > const&, llvm::orc::SymbolLookupSet, llvm::orc::SymbolState, llvm::unique_function<void (llvm::Expected<llvm::DenseMap<llvm::orc::SymbolStringPtr, llvm::JITEvaluatedSymbol, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr>, llvm::detail::DenseMapPair<llvm::orc::SymbolStringPtr, llvm::JITEvaluatedSymbol> > >)>, std::function<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr> >, llvm::DenseMapInfo<llvm::orc::JITDylib*>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr> > > > const&)>) (this=0x8eb600, this@entry=0xbcb8b0, K=K@entry=llvm::orc::LookupKind::Static, SearchOrder=std::vector of length 1, capacity 1 = {...}, Symbols=..., RequiredState=llvm::orc::SymbolState::Ready, NotifyComplete=..., RegisterDependencies=...)
    at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/Core.cpp:1974
#53 0x00007fffefd9cdaa in llvm::orc::ExecutionSession::lookup(std::vector<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags>, std::allocator<std::pair<llvm::orc::JITDylib*, llvm::orc::JITDylibLookupFlags> > > const&, llvm::orc::SymbolLookupSet const&, llvm::orc::LookupKind, llvm::orc::SymbolState, std::function<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr> >, llvm::DenseMapInfo<llvm::orc::JITDylib*>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr> > > > const&)>) (this=this@entry=0xbcb8b0, SearchOrder=<error reading variable: Cannot access memory at address 0x8>, 
    Symbols=..., K=K@entry=llvm::orc::LookupKind::Static, RequiredState=llvm::orc::SymbolState::Ready, RegisterDependencies=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/Core.cpp:2011
#54 0x00007fffefd9d0c5 in llvm::orc::ExecutionSession::lookup (this=0xbcb8b0, SearchOrder=std::vector of length 1, capacity 1 = {...}, Name=..., RequiredState=llvm::orc::SymbolState::Ready) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/Core.cpp:2036
#55 0x00007fffefdbe05c in llvm::orc::LLJIT::lookupLinkerMangled (this=<optimized out>, JD=..., Name=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/ExecutionEngine/Orc/LLJIT.cpp:651
#56 0x00007fffeead3ec6 in llvm::orc::LLJIT::lookupLinkerMangled (this=this@entry=0x8eb360, JD=..., Name=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/include/llvm/ExecutionEngine/Orc/LLJIT.h:120
#57 0x00007fffeead14f8 in llvm::orc::LLJIT::lookup (this=0x8eb360, JD=..., UnmangledName=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/include/llvm/ExecutionEngine/Orc/LLJIT.h:132
#58 llvm::orc::LLJIT::lookup (this=0x8eb360, UnmangledName=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/include/llvm/ExecutionEngine/Orc/LLJIT.h:137
#59 cling::IncrementalJIT::getSymbolAddress (this=0x500650, Name=..., IncludeHostSymbols=true) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/IncrementalJIT.cpp:230
#60 0x00007fffeead65bd in (anonymous namespace)::ReuseExistingWeakSymbols::runOnGlobal (this=this@entry=0xffe5d0, GV=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/BackendPasses.cpp:150
#61 0x00007fffeead6523 in (anonymous namespace)::ReuseExistingWeakSymbols::runOnModule (this=0xffe5d0, M=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/BackendPasses.cpp:173
#62 0x00007ffff1be31f8 in (anonymous namespace)::MPPassManager::runOnModule (this=0xf6a4b0, M=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/IR/LegacyPassManager.cpp:1554
#63 llvm::legacy::PassManagerImpl::run (this=<optimized out>, M=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/llvm/src/lib/IR/LegacyPassManager.cpp:542
#64 0x00007fffeeaccbee in cling::IncrementalExecutor::emitModule (this=0x698d80, T=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/IncrementalExecutor.h:253
#65 cling::IncrementalExecutor::runStaticInitializersOnce (this=0x698d80, T=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/IncrementalExecutor.cpp:251
#66 0x00007fffeea63d43 in cling::Interpreter::executeTransaction (this=<optimized out>, T=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/Interpreter.cpp:1714
#67 0x00007fffeea6da53 in cling::IncrementalParser::commitTransaction (this=0x4f6670, PRT=..., ClearDiagClient=<optimized out>) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/IncrementalParser.cpp:675
#68 0x00007fffeea6e410 in cling::IncrementalParser::Compile (this=0x4f6670, input=..., Opts=...) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/IncrementalParser.cpp:846
#69 0x00007fffeea62615 in cling::Interpreter::EvaluateInternal (this=0x4f17a0, input=..., CO=..., V=0x7fffffffcc70, wrapPoint=<optimized out>) at /home/jhahnfel/ROOT/llvm13/src/interpreter/cling/lib/Interpreter/Interpreter.cpp:1379
#70 0x00007fffee95d98f in TCling::Calc (this=0x4f0e00, line=0x1c224b0 "ROOT::Internal::RDF::JitFilterHelper(R_rdf::lambda0, new const char*[1]{\"a\"}, 1, \"\", reinterpret_cast<std::weak_ptr<ROOT::Detail::RDF::RJittedFilter>*>(0x21269a0), reinterpret_cast<std::shared_ptr<ROO"..., error=0x7fffffffcd0c) at /home/jhahnfel/ROOT/llvm13/src/core/metacling/src/TCling.cxx:3556
#71 0x00007ffff7fc97b3 in ROOT::Internal::RDF::InterpreterCalc(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)::$_0::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const (
    codeSlice="ROOT::Internal::RDF::JitFilterHelper(R_rdf::lambda0, new const char*[1]{\"a\"}, 1, \"\", reinterpret_cast<std::weak_ptr<ROOT::Detail::RDF::RJittedFilter>*>(0x21269a0), reinterpret_cast<std::shared_ptr<ROO"..., this=<optimized out>) at /home/jhahnfel/ROOT/llvm13/src/tree/dataframe/src/RDFUtils.cxx:339
#72 ROOT::Internal::RDF::InterpreterCalc (code="ROOT::Internal::RDF::JitFilterHelper(R_rdf::lambda0, new const char*[1]{\"a\"}, 1, \"\", reinterpret_cast<std::weak_ptr<ROOT::Detail::RDF::RJittedFilter>*>(0x21269a0), reinterpret_cast<std::shared_ptr<ROO"..., context="RLoopManager::Run") at /home/jhahnfel/ROOT/llvm13/src/tree/dataframe/src/RDFUtils.cxx:362
#73 0x00007ffff7fd2f04 in ROOT::Detail::RDF::RLoopManager::Jit (this=<optimized out>) at /home/jhahnfel/ROOT/llvm13/src/tree/dataframe/src/RLoopManager.cxx:720

It looks like we have a problem with re-entrant JITing, but I have absolutely no idea if that's supposed to work or points to a problem somewhere else @vgvassilev

vgvassilev and others added 22 commits December 9, 2022 07:31
Original commit message:"

[ORC] Fix weak hidden symbols failure on PPC with runtimedyld

Fix "JIT session error: Symbols not found: [ DW.ref.__gxx_personality_v0 ] error" which happens when trying to use exceptions on ppc linux. To do this, it expands AutoClaimSymbols option in RTDyldObjectLinkingLayer to also claim weak symbols before they are tried to be resolved. In ppc linux, DW.ref symbols are emitted as weak hidden symbols in the later stage of MC pipeline. This means when using IRLayer (i.e. LLJIT), IRLayer will not claim responsibility for such symbols and RuntimeDyld will skip defining this symbol even though it couldn't resolve corresponding external symbol.

Reviewed By: sgraenitz

Differential Revision: https://reviews.llvm.org/D129175
"

This patch fixes the same issue for ROOT on ppc64le.

Original commit message: "
Teach RuntimeDyld to handle COFF weak references and to consider comdat symbols as weak.

Patch by Lang Hames and Sunho Kim!
"

https://reviews.llvm.org/D138264

The relocation needs to allow for long offsets, as for the JIT, __dso_handle
might be outside the shared library. Fixes
```
cling JIT session error: In graph cling-module-10-jitted-objectbuffer, section __TEXT,__StaticInit: relocation target "___dso_handle" at address 0x7fe1ee5052e0 is out of range of Delta32 fixup at 0x108c410bd (___cxx_global_var_initcling_module_10_, 0x108c41090 + 0x2d)
[runStaticInitializersOnce]: Failed to materialize symbols: { (main, { $.cling-module-10.__inits.0, __ZN12IncidentTypeL2m1E, __ZN6MarkerD2Ev, __ZN6MarkerD1Ev, ___cxx_global_var_initcling_module_10_.1, __GLOBAL__sub_I_cling_module_10, __ZN6MarkerC2EPKc, ___cxx_global_var_initcling_module_10_.3, __ZN12IncidentTypeL2m3E, __ZN6MarkerC1EPKc, __ZN12IncidentTypeL2m2E, ____orc_init_func.cling-module-10, ___cxx_global_var_initcling_module_10_ }) }
```
as seen on RISC-V and macOS, i.e. with the JITLinker.

Previously, one needed to pass linker-mangled names, which exposes details that
clients of IncrementalExecutor should not have to deal with. Instead, use the IR
name and do the linker-mangling in IncrementalJIT::addOrReplaceDefinition().

This fixes the lack of static destruction on macOS, visible e.g. in the test
failure of roottest/cling/staticinit/ROOT-7775.
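The difference between an IR name and a linker-mangled name can be sketched in plain C++. This is only an illustration, not the code in IncrementalJIT::addOrReplaceDefinition(): the helper `linkerMangle` and its explicit `globalPrefix` parameter are hypothetical, and the assumption is that the only platform detail that matters here is the global prefix (`_` on Mach-O, none on ELF).

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: turn an IR-level symbol name into the name the linker sees.
// globalPrefix is '\0' when the object format adds no prefix (assumed for ELF)
// and '_' when it does (assumed for Mach-O, i.e. macOS).
std::string linkerMangle(std::string_view irName, char globalPrefix) {
  std::string mangled;
  if (globalPrefix != '\0')
    mangled.push_back(globalPrefix);  // prepend the platform prefix, if any
  mangled.append(irName);             // the IR name itself is kept unchanged
  return mangled;
}
```

With this, a client passes `_ZN6MarkerD1Ev` on every platform, and only the JIT side produces `__ZN6MarkerD1Ev` for macOS, which matches the doubled underscore in the macOS error log quoted earlier.
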
This helps with debugging why a certain header does not end up in a
module that is currently generated.

It is used by core/base and causes layering violations with modules, see
module ROOT_Foundation_C which needs to contain ThreadLocalStorage.h.

This is still required for proper exception handling support. See
commits 3f74182 and a7b0b3e by Philippe for details on the
problem and the original code, which I was able to significantly
simplify with the new JIT infrastructure.

For the moment, this disables JITLink even on platforms that had it
active by default (notably macOS). This will be reintroduced at a
later point, which will require a memory manager implementation for
the JITLink interface.

This is required until CallFunc is informed about unloading, and can
re-generate the wrapper (if the decl is still available).

If the Decl pointers are not identical, declaresSameEntity will check
the canonical Decls. This fixes the df021_createTGraph tutorial on
CentOS 8.

Upstream discussion in https://reviews.llvm.org/D137787
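The comparison described in this commit message can be sketched with a toy model. `ToyDecl`, `getCanonical` and `toyDeclaresSameEntity` are hypothetical stand-ins and not Clang types; the sketch only assumes that every redeclaration points at one canonical declaration of its entity.

```cpp
#include <cassert>

// Toy stand-in for a declaration; NOT clang::Decl. Each redeclaration points at
// the canonical declaration of its entity; the canonical one points at itself.
struct ToyDecl {
  const ToyDecl *canonical;
};

const ToyDecl *getCanonical(const ToyDecl *d) {
  return d ? d->canonical : nullptr;
}

// Identical pointers match immediately; otherwise the comparison falls back
// to the canonical declarations, so two redeclarations of one entity match.
bool toyDeclaresSameEntity(const ToyDecl *a, const ToyDecl *b) {
  if (!a || !b)
    return false;
  if (a == b)
    return true;
  return getCanonical(a) == getCanonical(b);
}
```

A plain pointer comparison would treat a redeclaration and its first declaration as different entities, which is the kind of mismatch the canonical fallback avoids.
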
It finds a false positive in llvm::DataLayout::reset accessing
DefaultAlignments while loading the Net library. At this point,
Cling and LLVM have definitely been initialized. This is a known
limitation of the order check, which is why it was disabled by
default. Remove our config to enable it for now to allow checking
the LLVM upgrade with AddressSanitizer instrumentation.

This is only used to determine the architecture for OpenMP offloading,
which we are not interested in.

As https://johannst.github.io/notes/development/symbolver.html points out,
"@@" just means "default version". We want to skip any versioned GCC/glibc/
libstdc++ symbol. Significantly reduces appetite for searching symbols.

Then also skip weak symbols: even if they are unresolved we will happily
not have them if they have not been loaded so far. Typical case is
`__gmon_start__`.
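A name-based filter for versioned symbols might look like the sketch below. It is not Cling's actual code: the function name `isVersionedRuntimeSymbol` and the list of version-node prefixes are assumptions made for illustration, and the weak-symbol check is not modelled because it needs the symbol table entry rather than the name.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical filter: returns true for names that carry an ELF symbol version
// ("name@VERSION" or "name@@VERSION") whose version node looks like it comes
// from GCC, glibc or libstdc++.
bool isVersionedRuntimeSymbol(std::string_view name) {
  // Assumed version-node prefixes; the real list may differ.
  static constexpr std::string_view kPrefixes[] = {"GLIBC_", "GLIBCXX_",
                                                   "CXXABI_", "GCC_"};
  const auto at = name.find('@');
  if (at == std::string_view::npos)
    return false;  // unversioned symbol
  std::string_view version = name.substr(at);
  // "@" and "@@" are treated alike: "@@" only marks the default version.
  while (!version.empty() && version.front() == '@')
    version.remove_prefix(1);
  for (std::string_view prefix : kPrefixes)
    if (version.substr(0, prefix.size()) == prefix)
      return true;
  return false;
}
```

Symbols for which this returns true would be dropped from the search, which is where the reduction in lookups comes from.
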
When creating orc-Symbols for dylib symbols, reloading the
dylib might mean a change in symbol (address). So unloading a
dylib means we need to unload the orc-Symbol.

This is implemented through resource-tracking the symbols as
provided by DynamicLibrarySearchGenerator. Actually, as
DynamicLibrarySearchGenerator does not support resource tracking,
it is implemented in a near-copy of DynamicLibrarySearchGenerator,
RTDynamicLibrarySearchGenerator, which uses the transaction of the
most recent module for the ResourceTracker.

The splitting code requires the full string to be null-terminated.
Before, random bytes were attached to the last option.
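The failure mode can be illustrated with a small sketch. Both functions are hypothetical and not the code touched by this commit; the assumption is a C-style splitter that scans until `'\0'` and an input that arrives as a pointer plus a length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical splitter that, like many C-style parsers, scans until '\0'.
std::vector<std::string> splitOnSpaces(const char *s) {
  std::vector<std::string> out;
  while (*s != '\0') {
    while (*s == ' ')
      ++s;  // skip separators
    const char *begin = s;
    while (*s != '\0' && *s != ' ')
      ++s;  // advance to the end of the current option
    if (s != begin)
      out.emplace_back(begin, s);
  }
  return out;
}

// Copying (data, size) into a std::string guarantees that c_str() ends in '\0',
// so the scanner cannot run past the last option into unrelated bytes.
std::vector<std::string> splitOptions(const char *data, std::size_t size) {
  const std::string terminated(data, size);
  return splitOnSpaces(terminated.c_str());
}
```

Passing the raw pointer straight to `splitOnSpaces` would read whatever bytes follow the buffer, which is how stray characters can end up attached to the last option.
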
On macOS 11, stressInterpreter fails because _ZL6strstrUa9enable_ifILb1EEPKcS0_ is not re-emitted.
It has internal linkage: originally deferred, it needs a chance to be re-emitted.
Simplify this to include anything that was deferred and is either weak-for-linker or
internal, to hopefully match all cases.

@Axel-Naumann Axel-Naumann left a comment

Congrats everyone, with special thanks to @vgvassilev for driving this for so many months, and @weliveindetail @lhames @SimeonEhrig and @hahnjo for invaluable help!

@phsft-bot

Starting build on ROOT-debian10-i386/soversion, ROOT-performance-centos8-multicore/cxx17, ROOT-ubuntu18.04/nortcxxmod, ROOT-ubuntu2004/python3, mac12/noimt, mac11/cxx14, windows10/cxx14
How to customize builds
