Description
I found an interesting issue in LLD that I was chasing down for more than a year. The problem revolves around XInput1_4.dll
. I wanted to share how I got there, since others might see this problem before we land a fix.
First signs
At first, we wanted to use rust-lld.exe
along with our Rust monorepo. This worked locally...
(in .cargo/config.toml)
[target.x86_64-pc-windows-msvc]
linker = "rust-lld"
...but not remotely. Our local machines were on Windows 10 while the remote build system was on Windows Server 2019 cloud VMs. We got super-weird errors, and only on certain Rust crates. For example this one:
Another time, it was on the sqlx-macros
crate. Note its dependencies:
sqlx-macros v0.6.1 (proc-macro) (C:\src\github.com\launchbadge\sqlx\sqlx-macros)
├── dotenvy v0.15.1
│ └── dirs v4.0.0
│ └── dirs-sys v0.3.7
│ └── winapi v0.3.9
Since our monorepo uses workspaces, feature unification would kick in. We also happened to be experimenting with Bevy, so the the winapi
crate would be compiled with all the features used in the workspace, such as "xinput"
even though sqlx-macros
didn't really need it here.
Strangely the build only failed when running headless, through a SSH session. When a user was connected on the VM, the build would succeed. We managed to capture minidumps of the failure, and inspection was showing an error deep down some core Microsoft libraries. At first we suspected an ABI incompatibility on Windows Server 2019. However it seemed that it had to do with XINPUT1_4.dll
. Essentially the DLL load graph was rust.exe
-> sqlx_macros-XXX.dll
-> XINPUT1_4.DLL
-> inputhost.dll
-> CoreMessaging.dll
then a call to RaiseFailFastException()
.
I left this aside for a while, since build times weren't that much of an issue. We were planning on upgrading the VMs to Windows Server 2022 later that year, so I postponed the rust-lld
update.
Again
At some point I wanted to upgrade our C++ codebase to using clang-cl
and lld-link
. Things worked locally, although our unit tests failed on the preflight remotely on the VMs. They were failing on startup with code exit status 0xc0000409
. At first I thought this was a crash in our code caused by UB with clang-cl
(this happened to me at Ubisoft during our first clang-cl
integration). Strangely I wasn't getting this error locally. I got taken with other things, so I decided to look at this later on.
We ended up upgrading our VMs to Windows Server 2022, and during my spare time between other things, I tried preflighting my clang-cl
upgrade again. Same thing, same error. I was quite puzzled, and I was starting to rule out the ABI incompatibility I though I had with Rust at first. It was inconceivable that the same failure would be carried on several CRT updates and Windows OS upgrades.
Yet, again.
Months later (now) I got a new Windows 11 install. I wanted to test LLVM 18.0 and un-shelved again my clang-cl
upgrade for our C++ codebase. After fixing a number of compilation errors, on the first boot our program complained about not finding XInput1_3.dll
. Installing that in C:\Windows\System
fixed that particular issue. Now it was crashing on XInput1_4.dll
:

I did not immediately make the connection with the past issues I had with Rust compilation. There was something strange about this callstack but I couldn't put my finger on it. While looking for clues on the Internet, I stomped on this thread. I immediately made the connection with the Rust build issue I had a year ago! However there wasn't really a resolution, just a suggestion to replace XInput1_4.dll
with XInput1_3.dll
.
What.
Do what??
So yeah. Duplicating XInput1_3.dll
and renaming it to XInput1_4.dll
fixed the issue. 😒 No more crashes.
I just couldn't leave it to that, I decided to dig a bit more to understand the problem. Putting back XInput1_4.dll
as it was, restarting and stepping through the asm code manually reveled that it had something to do with calls that XInput was doing to the Windows Event Tracing API. It failed because one of the HANDLEs in the structure used to initialize the evntprov.h
API was already initialized (see the highlighted value in the Memory window).

How could that be? Was it a memory corruption? But the HANDLE looked legit, so that structure must have been already initialized somewhere else. Maybe the DLL was loaded elsewhere? Let's put a breakpoint in that function, see if it was called before.
Indeed:

Looking at the memory window, the location where the HANDLE is hasn't been written yet. Also the callstack shows a different codepath when entering DllMain
. I was able to confirm that this was happening after the call to LoadLibrary
, but before the entry point of my DLL was executed. Essentially, at this point the Windows OS loader was initializing the DLL dependency graph.
Now if I hit Run again, the program will stop at the same location a second time. That's where the bug happens. Looking at the callstack again, something fishy appears to me: why is the entry point of my DLL calling XInput1_4.DLL!DllMain()
? Looking at the assembly:

Looks legit. But then, stepping inside:

Wait a second. Why is that an import symbol??
That location is an IAT entry that points to XInput1_4.DLL!DllMain()
. At that point I was pretty sure that I was experiencing the side-effect of a bug in LLD. I recompiled LLD from source, quickly stepped through it while it was linking my DLL, and then it become clear what the problem was.
The bug and a repro
Essentially, XInput1_4.dll
exports its DllMain
symbol (why?). This seems to be a mistake on Microsoft's end. But luckily that triggers our bug in LLD 😄 XInput.lib
appears first on the command-line, then /DEFAULTLIB:msvcrt.lib
where the default (weak) DllMain
function is. Due to the lazy nature of how the inputs are parsed in LLD, the XInput1_4.dll!DllMain
symbol is placed first in the symbol table, while scanning all the inputs. Then comes the msvcrt.lib
DllMain
symbol which does not overwrite the symbol table entry, since it's already there. Later, when another function pulls on DllMain
, LLD pulls on the OBJ inside XInput.lib
, and replaces that symbol table entry with its (XInput import lib) import symbol.
The repro is quite straightforward:
// b1.cpp
extern "C" int f();
int b() {
return f();
}
// b2.cpp
extern "C" int f() {
return 0;
}
// a.cpp
extern "C" __declspec(dllexport) int f() {
return 1;
}
// main.cpp
extern int b();
int main() {
return b();
}
rem --- build.bat ---
cl /c /Z7 b1.cpp b2.cpp
lib /out:b.lib b1.obj b2.obj
cl /LD /Z7 a.cpp
cl /c /Z7 main.cpp
rem set LINKER=c:\git\llvm-project\stage1_msvc_debug\bin\lld-link
set LINKER=link
%LINKER% /debug /nodefaultlib /out:main.exe main.obj a.lib b.lib /entry:main
rem --- end of build.bat ---
// After building, with MSVC link.exe the output is:
> main.exe
> echo %ERRORLEVEL%
0
// With LLD the output is:
> main.exe
> echo %ERRORLEVEL%
1
MSVC somehow seems to prefer a fully defined symbol rather than an imported symbol. Another possibility is that it tends to favor a symbol inside the same .LIB rather than one defined elsewhere. At this point, if someone from the MSVC team could comment, that would be very helpful. I have a tentative fix, but I'm not sure what the "good" behavior should be.
We could categorize this as the "annoying" class of bugs with possibly a workaround. However given how annoying it was to find, it might be that other folks will see different inceptions of it, so ideally I'd prefer fixing it in LLD.