Description
Bugzilla Link | 51930 |
Version | unspecified |
OS | Windows NT |
CC | @JDevlieghere |
Extended Description
A variety of configuration and design decisions have resulted in a situation where, if you build for debug (i.e., -DCMAKE_BUILD_TYPE=Debug) and try to run the tests on Windows, most of them will fail.
Steps to Reproduce
cmake -GNinja -DCMAKE_BUILD_TYPE=Debug -DLLVM_ENABLE_PROJECTS="clang;lld;lldb" -DLLVM_TARGETS_TO_BUILD=X86 ..\..\llvm-project\llvm
ninja check-lldb
Result
Over 900 of the tests will fail.
Many tests will give a Python stack trace with an access violation or this error:
ImportError: cannot import name 'lldb' from partially initialized module 'lldb' (most likely due to a circular import) (D:\src\llvm\build\ninja_dbg\Lib\site-packages\lldb\__init__.py)
(These are actually the best clue as to what's going on.)
For many more tests, the generated output is wrong, so the apparent symptom is a FileCheck miscompare. (If you try to reproduce these outside of Lit, the output will likely be correct and the test will pass. See Workaround for details.)
A few tests hang.
Cause
Two different versions of Python end up in the same process, each built against a different version of the C run-time library DLLs.
Here's how this happens:
- Ninja starts a process running Lit in the regular Python interpreter.
- Lit starts a process to run dotest in a Python interpreter.
- The dotest.py script imports the lldb module.
- The lldb module's SWIG-generated __init__ in turn tries to import _lldb.
- In release builds _lldb is a Windows DLL called _lldb.pyd produced from the SWIG bindings. In debug builds, the DLL is called _lldb_d.pyd.
After SWIG 3.0.9, the template that generates the lldb module's __init__ had to change a bit (because newer versions of SWIG required changes). As a result, the __init__ method no longer distinguishes between _lldb and _lldb_d.
Our CMake builds originally adapted by creating a filesystem link from _lldb.pyd to the actual _lldb.pyd or _lldb_d.pyd as appropriate. This didn't work reliably (possibly because of differences in the implementations of symlink from GnuWin32 and git).
Nowadays the correct DLL is copied to _lldb.pyd. Note, however, that the copy can silently fail. See Notes.
- Using the now-loaded lldb module to get the SBAPI, dotest.py creates an instance of LLDB (the actual debugger), which runs in the same process as dotest.py.
- That LLDB instance has its own statically-linked Python interpreter embedded. Thus the process now has two instances of Python: one running dotest.py and one inside the LLDB instance.
If those two instances don't match, e.g., if one is "release" and the other is "debug", or one is 3.7 and the other 3.8, misery ensues.
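One way to see which flavor of Python a given process is running is to have the interpreter report its own version and build type. The following is a minimal sketch and not part of the LLDB test suite; the helper names `describe_interpreter` and `interpreters_match` are hypothetical. It assumes the usual CPython convention that `sys.gettotalrefcount` exists only in debug builds.

```python
import sys
from importlib.machinery import EXTENSION_SUFFIXES


def describe_interpreter():
    """Summarize the properties of the running Python that must agree
    between the interpreter hosting dotest.py and the one embedded in LLDB."""
    return {
        "version": tuple(sys.version_info[:2]),
        # Assumption: sys.gettotalrefcount is present only in debug builds.
        "debug": hasattr(sys, "gettotalrefcount"),
        # Suffixes this interpreter accepts for extension modules
        # (on Windows, debug builds look for names ending in _d.pyd).
        "extension_suffixes": list(EXTENSION_SUFFIXES),
    }


def interpreters_match(a, b):
    """Return True when two descriptions agree on version and build flavor."""
    return a["version"] == b["version"] and a["debug"] == b["debug"]


if __name__ == "__main__":
    print(describe_interpreter())
```

Running this script under both python.exe and python_d.exe and comparing the two reports should show whether the interpreter that Lit launches matches the one LLDB was built against.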
Workaround
You can exercise the tests with a "release" build, but you will miss some bugs because release builds disable assertions in core llvm libraries.
For individual dotest.py tests, you can bypass Ninja and Lit and explicitly launch dotest.py in the debug version of Python (i.e., python_d.exe instead of python.exe). For example:
"C:/Program Files/Python38/python_d.exe" \
D:/src/llvm/llvm-project/lldb/test/API/dotest.py \
[options elided] \
-p TestDynamicValue.py
Solution
None found. I recommend we modify our CMake scripts to warn when CMAKE_BUILD_TYPE is Debug, the target platform is Windows, and LLVM_ENABLE_PROJECTS includes lldb.
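The proposed warning would live in the CMake scripts themselves. Purely to illustrate the condition, here is a Python sketch that evaluates it against the contents of a CMakeCache.txt file, whose entries have the form KEY:TYPE=VALUE. The function names `parse_cmake_cache` and `should_warn` are hypothetical, and the target platform is passed in as a flag rather than detected.

```python
def parse_cmake_cache(text):
    """Parse CMakeCache.txt-style lines of the form KEY:TYPE=VALUE."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the two comment styles used in the cache file.
        if not line or line.startswith(("#", "//")):
            continue
        key_and_type, sep, value = line.partition("=")
        if not sep:
            continue
        key = key_and_type.split(":", 1)[0]
        entries[key] = value
    return entries


def should_warn(entries, is_windows):
    """True for the combination described above: Debug + Windows + lldb."""
    projects = entries.get("LLVM_ENABLE_PROJECTS", "").split(";")
    return (
        is_windows
        and entries.get("CMAKE_BUILD_TYPE", "") == "Debug"
        and "lldb" in projects
    )
```

For the configuration in Steps to Reproduce, `should_warn` would return True when `is_windows` is True.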
Notes
Failure to Copy
The copy of either _lldb.pyd or _lldb_d.pyd to _lldb.pyd can fail. In particular, I've seen this happen when a zombie process from a previous test run holds the older file locked. For reasons I haven't discovered, failure of the copy doesn't fail the build. You're left with a previous build of _lldb.pyd, which can make for difficult-to-debug problems.
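As an illustration of a copy step that cannot fail silently, the sketch below copies a file and then verifies the destination byte for byte. This is not how the CMake build performs the copy; `copy_and_verify` is a hypothetical helper, and the file names in the usage note are only examples.

```python
import filecmp
import shutil
from pathlib import Path


def copy_and_verify(src, dst):
    """Copy src to dst and raise if dst does not end up identical to src."""
    src, dst = Path(src), Path(dst)
    # shutil.copyfile raises OSError (e.g. PermissionError) if the copy fails,
    # for instance when another process holds dst locked.
    shutil.copyfile(src, dst)
    # Byte-for-byte comparison guards against a stale dst left in place.
    if not filecmp.cmp(src, dst, shallow=False):
        raise RuntimeError(f"{dst} does not match {src} after copy")
    return dst
```

A call such as `copy_and_verify("_lldb_d.pyd", "_lldb.pyd")` would then either succeed or stop with an exception, instead of leaving an older _lldb.pyd behind.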
Python Detection Churn
In the past year or two, we've had a lot of churn in how CMake finds Python for LLVM generally and for LLDB specifically. In hindsight, I think a lot of the problems I experienced with those changes were because the test process ended up with two different versions of Python, each linked against a different version of the CRT, even when they were both release builds.