Windows GNU toolchain with -Cinstrument-coverage generates invalid data #111098
Description
The profraw output of a binary compiled with the x86_64-pc-windows-gnu toolchain is rejected by llvm-profdata as "malformed instrumentation profile data". In addition, the relevant coverage mapping sections (namely .lprfn, .lcovfun) within the EXE do not match the expectations of llvm-cov. Both problems are presumably caused by linker issues concerning COFF file handling in the GNU toolchain.
Given the simple demo program in the attachment, the following works just fine with the MSVC toolchain:
rustc +nightly-x86_64-pc-windows-msvc -Cinstrument-coverage main.rs -o main_msvc.exe
$Env:LLVM_PROFILE_FILE="msvc.profraw"
.\main_msvc.exe
$msvc_sysroot=$(rustc +nightly-x86_64-pc-windows-msvc --print sysroot)
$msvc_tools=$msvc_sysroot + "\lib\rustlib\x86_64-pc-windows-msvc\bin\"
&$msvc_tools\llvm-profdata.exe merge msvc.profraw -o msvc.profdata
&$msvc_tools\llvm-cov show main_msvc.exe --instr-profile msvc.profdata
This outputs correct coverage information. I expected the same behavior from the GNU toolchain:
rustc +nightly-x86_64-pc-windows-gnu -Cinstrument-coverage main.rs -o main_gnu.exe
$Env:LLVM_PROFILE_FILE="gnu.profraw"
.\main_gnu.exe
$gnu_sysroot=$(rustc +nightly-x86_64-pc-windows-gnu --print sysroot)
$gnu_tools=$gnu_sysroot + "\lib\rustlib\x86_64-pc-windows-gnu\bin\"
&$gnu_tools\llvm-profdata.exe merge gnu.profraw -o gnu.profdata
But llvm-profdata failed:
warning: gnu.profraw: malformed instrumentation profile data
error: no profile can be merged
(Side note: I had to copy libgcc_s_seh-1.dll and libwinpthread-1.dll into the same folder as llvm-profdata; otherwise the GNU version of the tool fails to launch when called directly, which resulted in a confusing error when it was called indirectly by grcov.)
Investigation
The generated gnu.profraw is shorter and lacks the function names:
msvc.profraw:
00000000 81 72 66 6F 72 70 6C FF 08 00 00 00 00 00 00 00 �rforpl.........
00000010 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 ................
00000020 00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00 ................
00000030 00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00 ........�.......
00000040 79 FF FF FF 00 00 00 00 01 40 50 1B F6 7F 00 00 y........@P.ö⌂..
00000050 01 00 00 00 00 00 00 00 F0 0D 72 91 B9 C4 E0 D1 ........ð.r�¹ÄàÑ
00000060 AD C7 B1 32 70 9C 78 B2 80 FF FF FF 00 00 00 00 DZ2p�x²�.......
00000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000080 02 00 00 00 00 00 00 00 01 43 74 2F CF 00 F7 D8 .........Ct/Ï.÷Ø
00000090 99 39 35 49 69 5E 03 A1 60 FF FF FF 00 00 00 00 �95Ii^.¡`.......
000000A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000B0 02 00 00 00 00 00 00 00 4B 91 E1 4B B9 7A D0 85 ........K�áK¹zÐ�
000000C0 93 0D 62 81 11 C9 A3 F1 40 FF FF FF 00 00 00 00 �.b�.É£ñ@.......
000000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000E0 02 00 00 00 00 00 00 00 5F 86 D0 72 69 5A D5 B4 ........_�ÐriZÕ´
000000F0 3D 7D A8 8E 0F CC D2 0D 20 FF FF FF 00 00 00 00 =}¨�.ÌÒ. .......
00000100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000110 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 ................
00000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 ................
00000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 ................
00000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000160 00 00 00 00 00 00 00 00 54 00 5F 52 4E 76 43 73 ........T._RNvCs
00000170 64 53 61 6B 56 33 65 42 31 79 58 5F 34 6D 61 69 dSakV3eB1yX_4mai
00000180 6E 33 66 6F 6F 01 5F 52 4E 76 43 73 64 53 61 6B n3foo._RNvCsdSak
00000190 56 33 65 42 31 79 58 5F 34 6D 61 69 6E 33 62 61 V3eB1yX_4main3ba
000001A0 72 01 5F 52 4E 76 43 73 64 53 61 6B 56 33 65 42 r._RNvCsdSakV3eB
000001B0 31 79 58 5F 34 6D 61 69 6E 34 6D 61 69 6E 28 00 1yX_4main4main(.
000001C0 5F 52 4E 76 4E 74 43 73 64 53 61 6B 56 33 65 42 _RNvNtCsdSakV3eB
000001D0 31 79 58 5F 34 6D 61 69 6E 35 73 74 75 66 66 38 1yX_4main5stuff8
000001E0 64 6F 5F 73 74 75 66 66 do_stuff
gnu.profraw:
00000000 81 72 66 6F 72 70 6C FF 08 00 00 00 00 00 00 00 �rforpl.........
00000010 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
00000020 00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00 ................
00000030 00 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 ................
00000040 D1 CF FF FF 00 00 00 00 01 B0 29 39 F6 7F 00 00 ÑÏ.......°)9ö⌂..
00000050 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 ................
000000A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 ................
000000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 ................
000000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
The strings appear if -Clink-dead-code is passed:
rustc +nightly-x86_64-pc-windows-gnu -Cinstrument-coverage -Clink-dead-code main.rs -o main_gnu_ldc.exe
$Env:LLVM_PROFILE_FILE="gnu_ldc.profraw"
.\main_gnu_ldc.exe
gnu_ldc.profraw:
00000000 81 72 66 6F 72 70 6C FF 08 00 00 00 00 00 00 00 �rforpl.........
00000010 00 00 00 00 00 00 00 00 05 00 00 00 00 00 00 00 ................
00000020 00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00 ................
00000030 00 00 00 00 00 00 00 00 83 00 00 00 00 00 00 00 ........�.......
00000040 D1 EF FF FF 00 00 00 00 01 50 19 50 F6 7F 00 00 Ñï.......P.Pö⌂..
00000050 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000060 00 00 00 00 00 00 00 00 F0 0D 72 91 B9 C4 E0 D1 ........ð.r�¹ÄàÑ
00000070 33 7C 1A DC 65 39 44 59 C8 EF FF FF 00 00 00 00 3|.Üe9DYÈï......
00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000090 02 00 00 00 00 00 00 00 01 43 74 2F CF 00 F7 D8 .........Ct/Ï.÷Ø
000000A0 E1 31 81 8A 97 17 AA BA A8 EF FF FF 00 00 00 00 á1���.ªº¨ï......
000000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000C0 02 00 00 00 00 00 00 00 4B 91 E1 4B B9 7A D0 85 ........K�áK¹zÐ�
000000D0 D9 EE E3 C6 DB 92 8F 50 88 EF FF FF 00 00 00 00 ÙîãÆÛ��P�ï......
000000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000F0 02 00 00 00 00 00 00 00 5F 86 D0 72 69 5A D5 B4 ........_�ÐriZÕ´
00000100 3D 7D A8 8E 0F CC D2 0D 68 EF FF FF 00 00 00 00 =}¨�.ÌÒ.hï......
00000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000120 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 ................
00000160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 ................
00000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 ................
00000180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000190 00 00 00 00 00 00 00 00 00 00 00 54 00 5F 52 4E ...........T._RN
000001A0 76 43 73 64 53 61 6B 56 33 65 42 31 79 58 5F 34 vCsdSakV3eB1yX_4
000001B0 6D 61 69 6E 33 66 6F 6F 01 5F 52 4E 76 43 73 64 main3foo._RNvCsd
000001C0 53 61 6B 56 33 65 42 31 79 58 5F 34 6D 61 69 6E SakV3eB1yX_4main
000001D0 33 62 61 72 01 5F 52 4E 76 43 73 64 53 61 6B 56 3bar._RNvCsdSakV
000001E0 33 65 42 31 79 58 5F 34 6D 61 69 6E 34 6D 61 69 3eB1yX_4main4mai
000001F0 6E 28 00 5F 52 4E 76 4E 74 43 73 64 53 61 6B 56 n(._RNvNtCsdSakV
00000200 33 65 42 31 79 58 5F 34 6D 61 69 6E 35 73 74 75 3eB1yX_4main5stu
00000210 66 66 38 64 6F 5F 73 74 75 66 66 00 00 00 00 00 ff8do_stuff.....
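The difference between the three files is already visible in their headers. The following Python sketch decodes the first 0x40 bytes of each dump as eight little-endian 64-bit fields. The field names are an assumption based on the version 8 raw profile header layout and are not taken from the dumps themselves; `parse_header` is a made-up helper name.

```python
import struct

# First 0x40 bytes of each profraw dump shown above.
HEADERS = {
    "msvc.profraw": """
        81 72 66 6F 72 70 6C FF 08 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00
    """,
    "gnu.profraw": """
        81 72 66 6F 72 70 6C FF 08 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00
    """,
    "gnu_ldc.profraw": """
        81 72 66 6F 72 70 6C FF 08 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 05 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00
        00 00 00 00 00 00 00 00 83 00 00 00 00 00 00 00
    """,
}

# Assumed field order (raw profile format version 8); names are illustrative.
FIELDS = (
    "magic", "version", "binary_ids_size", "num_data",
    "padding_before_counters", "num_counters",
    "padding_after_counters", "names_size",
)


def parse_header(hex_text):
    """Decode eight little-endian u64 values from a hex dump fragment."""
    raw = bytes.fromhex(hex_text)  # whitespace between byte pairs is ignored
    return dict(zip(FIELDS, struct.unpack("<8Q", raw)))


for name, hex_text in HEADERS.items():
    h = parse_header(hex_text)
    print(f"{name}: version={h['version']} num_data={h['num_data']} "
          f"num_counters={h['num_counters']} names_size={h['names_size']}")
```

Under that reading, the MSVC file announces 4 data records and 128 bytes of names, the plain GNU file 1 record and 3 bytes of names, and the -Clink-dead-code GNU file 5 records and 131 bytes of names. The GNU values are each one record and three bytes larger than what the program actually contains.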
But this file is still considered malformed by llvm-profdata. Looking at the .lprfn section in the binary:
&$msvc_tools\llvm-objdump.exe -s -j '.lprfn' main_msvc.exe
main_msvc.exe: file format coff-x86-64
Contents of section .lprfn:
140034000 0054005f 524e7643 73645361 6b563365 .T._RNvCsdSakV3e
140034010 42317958 5f346d61 696e3366 6f6f015f B1yX_4main3foo._
140034020 524e7643 73645361 6b563365 42317958 RNvCsdSakV3eB1yX
140034030 5f346d61 696e3362 6172015f 524e7643 _4main3bar._RNvC
140034040 73645361 6b563365 42317958 5f346d61 sdSakV3eB1yX_4ma
140034050 696e346d 61696e28 005f524e 764e7443 in4main(._RNvNtC
140034060 73645361 6b563365 42317958 5f346d61 sdSakV3eB1yX_4ma
140034070 696e3573 74756666 38646f5f 73747566 in5stuff8do_stuf
140034080 6600 f.
&$gnu_tools\llvm-objdump.exe -s -j '.lprfn' main_gnu_ldc.exe
main_gnu_ldc.exe: file format coff-x86-64
Contents of section .lprfn:
140105000 00000000 54005f52 4e764373 6453616b ....T._RNvCsdSak
140105010 56336542 3179585f 346d6169 6e33666f V3eB1yX_4main3fo
140105020 6f015f52 4e764373 6453616b 56336542 o._RNvCsdSakV3eB
140105030 3179585f 346d6169 6e336261 72015f52 1yX_4main3bar._R
140105040 4e764373 6453616b 56336542 3179585f NvCsdSakV3eB1yX_
140105050 346d6169 6e346d61 696e2800 5f524e76 4main4main(._RNv
140105060 4e744373 6453616b 56336542 3179585f NtCsdSakV3eB1yX_
140105070 346d6169 6e357374 75666638 646f5f73 4main5stuff8do_s
140105080 74756666 00000000 tuff....
Compared to the MSVC version, the GNU version has three extra zero bytes at both the front and the end.
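The padding can be measured directly from the two objdump listings. The Python sketch below copies the hex columns of both dumps and strips the zero bytes at either end; `split_padding` is a made-up helper, and stripping zeros only works here because the name data itself starts and ends with non-zero bytes.

```python
# Hex columns of the .lprfn dumps above (main_msvc.exe and main_gnu_ldc.exe).
MSVC_LPRFN = bytes.fromhex("""
    0054005f 524e7643 73645361 6b563365
    42317958 5f346d61 696e3366 6f6f015f
    524e7643 73645361 6b563365 42317958
    5f346d61 696e3362 6172015f 524e7643
    73645361 6b563365 42317958 5f346d61
    696e346d 61696e28 005f524e 764e7443
    73645361 6b563365 42317958 5f346d61
    696e3573 74756666 38646f5f 73747566
    6600
""")

GNU_LPRFN = bytes.fromhex("""
    00000000 54005f52 4e764373 6453616b
    56336542 3179585f 346d6169 6e33666f
    6f015f52 4e764373 6453616b 56336542
    3179585f 346d6169 6e336261 72015f52
    4e764373 6453616b 56336542 3179585f
    346d6169 6e346d61 696e2800 5f524e76
    4e744373 6453616b 56336542 3179585f
    346d6169 6e357374 75666638 646f5f73
    74756666 00000000
""")


def split_padding(section):
    """Return (leading zero count, payload, trailing zero count)."""
    lead = len(section) - len(section.lstrip(b"\x00"))
    trail = len(section) - len(section.rstrip(b"\x00"))
    return lead, section.strip(b"\x00"), trail


for label, section in (("msvc", MSVC_LPRFN), ("gnu", GNU_LPRFN)):
    lead, payload, trail = split_padding(section)
    print(f"{label}: {lead} leading, {len(payload)} payload, {trail} trailing")

# The name data itself is byte-for-byte identical in both binaries.
print(split_padding(MSVC_LPRFN)[1] == split_padding(GNU_LPRFN)[1])
```

So the name data is the same 128 bytes in both executables; only the bracketing bytes differ (1 + 1 for MSVC, 4 + 4 for GNU).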
Both the computation in InstrProfilingPlatformWindows.c
const char COMPILER_RT_SECTION(".lprfn$A") NamesStart = '\0';
const char COMPILER_RT_SECTION(".lprfn$Z") NamesEnd = '\0';
const char *__llvm_profile_begin_names(void) { return &NamesStart + 1; }
const char *__llvm_profile_end_names(void) { return &NamesEnd; }
and the llvm-cov tool (CoverageMappingReader.cpp) expect exactly one padding byte at the front.
// If this is a linked PE/COFF file, then we have to skip over the null byte
// that is allocated in the .lprfn$A section in the LLVM profiling runtime.
const ObjectFile *Obj = Section.getObject();
if (isa<COFFObjectFile>(Obj) && !Obj->isRelocatableObject())
Data = Data.drop_front(1);
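The effect of a wider bracketing section on the runtime computation can be modelled in a few lines of Python. This is only a sketch: it assumes the linker lays out .lprfn$A, the name data and .lprfn$Z back to back with zero-filled padding, and `names_blob` is a hypothetical name used for illustration.

```python
def names_blob(bracket_size, names):
    """Model the begin/end computation quoted from InstrProfilingPlatformWindows.c.

    bracket_size is the size in bytes of the .lprfn$A and .lprfn$Z sections.
    """
    # Assumed layout: [.lprfn$A][name data][.lprfn$Z], padding bytes are zero.
    image = bytes(bracket_size) + names + bytes(bracket_size)
    begin = 1                        # &NamesStart + 1
    end = bracket_size + len(names)  # &NamesEnd
    return image[begin:end]


names = b"N" * 128  # stand-in for the 128 bytes of name data in the dumps

print(len(names_blob(1, names)))  # 1-byte bracketing sections
print(len(names_blob(4, names)))  # 4-byte bracketing sections
print(len(names_blob(4, b"")))    # 4-byte bracketing sections, no name data
```

With one-byte sections the result is exactly the name data. With four-byte sections the result starts with three stray zero bytes, giving lengths of 131 and 3, which are the same values as the 0x83 and 0x03 found at offset 0x38 of gnu_ldc.profraw and gnu.profraw.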
Apparently, the extra padding bytes are generated because GCC outputs 4-byte sections for a single char.
&$gnu_tools\llvm-objdump.exe --all-headers $libprofiler_gnu | Select-String "\.lprf"
35 .lprfnd$Z 00000020 0000000000000000 DATA
36 .lprfnd$A 00000020 0000000000000000 DATA
38 .lprfc$Z 00000004 0000000000000000 DATA
39 .lprfc$A 00000004 0000000000000000 DATA
40 .lprfn$Z 00000004 0000000000000000 DATA
41 .lprfn$A 00000004 0000000000000000 DATA
42 .lprfd$Z 00000040 0000000000000000 DATA
43 .lprfd$A 00000040 0000000000000000 DATA
while MSVC generates a one-byte section:
&$msvc_tools\llvm-objdump.exe --all-headers $libprofiler_msvc | Select-String "\.lprf"
13 .lprfd$A 00000030 0000000000000000 DATA
14 .lprfd$Z 00000030 0000000000000000 DATA
15 .lprfn$A 00000001 0000000000000000 DATA
16 .lprfn$Z 00000001 0000000000000000 DATA
17 .lprfc$A 00000001 0000000000000000 DATA
18 .lprfc$Z 00000001 0000000000000000 DATA
20 .lprfnd$A 00000018 0000000000000000 DATA
21 .lprfnd$Z 00000018 0000000000000000 DATA
Also note that there is a comparable issue with .lprfd, which should be 48 bytes (sizeof(__llvm_profile_data)) but is allocated in a 64-byte section by GCC.
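The 16 surplus bytes would explain the record counts seen in the profraw headers (the 64-bit value at offset 0x18 is 4 for MSVC, 1 for GNU and 5 for GNU with -Clink-dead-code). The Python sketch below rests on two assumptions that are not shown above: that data records are linked back to back between .lprfd$A and .lprfd$Z, and that the runtime rounds the byte distance between &DataStart + 1 and &DataEnd up to whole records. `num_data_records` is a made-up name.

```python
RECORD_SIZE = 48  # sizeof(__llvm_profile_data), as stated above


def num_data_records(bracket_size, real_records):
    """Estimate the record count the runtime would report.

    bracket_size is the size of the .lprfd$A section in bytes,
    real_records the number of records actually linked in.
    """
    begin = RECORD_SIZE                               # &DataStart + 1
    end = bracket_size + real_records * RECORD_SIZE   # &DataEnd
    # Assumption: byte distance rounded up to a whole number of records.
    return (end - begin + RECORD_SIZE - 1) // RECORD_SIZE


print(num_data_records(48, 4))  # MSVC, four functions
print(num_data_records(64, 0))  # GNU, records dropped by the linker
print(num_data_records(64, 4))  # GNU with -Clink-dead-code
```

Under those assumptions the model yields 4, 1 and 5, so each GNU profile would carry one bogus, partially zero-filled record in front of the real ones.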
I was able to work around this by building profiler_builtins out-of-tree (similar to minicov) and providing the $A and $Z sections/symbols from Rust code. I did not find a switch in GCC to generate properly sized sections, but I may have missed it.
With this workaround, llvm-profdata accepted the generated profraw and llvm-cov accepted the EXE file. The resulting coverage was still unsatisfactory, however, because it contained coverage information for only a single section.
The MSVC-generated .lcovfun section describes all four functions:
&$msvc_tools\llvm-objdump.exe -s -j '.lcovfun' main_msvc.exe
main_msvc.exe: file format coff-x86-64
Contents of section .lcovfun:
140032000 f00d7291 b9c4e0d1 09000000 adc7b132 ..r............2
140032010 709c78b2 7a2f4ca3 0f7973eb 01010001 p.x.z/L..ys.....
140032020 01030102 02000000 4b91e14b b97ad085 ........K..K.z..
140032030 09000000 930d6281 11c9a3f1 7a2f4ca3 ......b.....z/L.
140032040 0f7973eb 01010001 010a0102 02000000 .ys.............
140032050 0143742f cf00f7d8 09000000 99393549 .Ct/.........95I
140032060 695e03a1 7a2f4ca3 0f7973eb 01010001 i^..z/L..ys.....
140032070 01060102 02000000 5f86d072 695ad5b4 ........_..riZ..
140032080 1c000000 3d7da88e 0fccd20d 538568a3 ....=}......S.h.
140032090 a07b3068 01010201 05050204 01030101 .{0h............
1400320a0 09050209 000e0202 09000e07 02010002 ................
and the four functions are also present in the GNU generated object file:
rustc +nightly-x86_64-pc-windows-gnu -Cinstrument-coverage -Clink-dead-code --emit=obj main.rs -o main_gnu_ldc.o
&$gnu_tools\llvm-objdump.exe -s -j '.lcovfun$M' main_gnu_ldc.o
main_gnu_ldc.o: file format coff-x86-64
Contents of section .lcovfun$M:
0000 4b91e14b b97ad085 09000000 d9eee3c6 K..K.z..........
0010 db928f50 73a65812 005f97d4 01010001 ...Ps.X.._......
0020 010a0102 02 .....
Contents of section .lcovfun$M:
0000 5f86d072 695ad5b4 1c000000 3d7da88e _..riZ......=}..
0010 0fccd20d 73a65812 005f97d4 01020201 ....s.X.._......
0020 05050204 01030101 09050209 000e0202 ................
0030 09000e07 02010002 ........
Contents of section .lcovfun$M:
0000 f00d7291 b9c4e0d1 09000000 337c1adc ..r.........3|..
0010 65394459 73a65812 005f97d4 01010001 e9DYs.X.._......
0020 01030102 02 .....
Contents of section .lcovfun$M:
0000 0143742f cf00f7d8 09000000 e131818a .Ct/.........1..
0010 9717aaba 73a65812 005f97d4 01010001 ....s.X.._......
0020 01060102 02 .....
and each of the sections has a symbol with a unique name (__covrec_B4D55A6972D0865Fu = linkonce_odr hidden constant, ...) but the linker still only keeps a single symbol in the generated binary:
&$gnu_tools\llvm-objdump.exe -s -j '.lcovfun' main_gnu_ldc.exe
main_gnu_ldc.exe: file format coff-x86-64
Contents of section .lcovfun:
140103000 f00d7291 b9c4e0d1 09000000 337c1adc ..r.........3|..
140103010 65394459 7a2f4ca3 0f7973eb 01010001 e9DYz/L..ys.....
140103020 01030102 02
I did not find a solution for the linker dropping the other covrecs.
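The number of surviving records can be counted mechanically. The Python sketch below walks the two linked .lcovfun dumps, assuming the function record layout used by recent coverage mapping versions (64-bit name hash, 32-bit mapping size, 64-bit function hash, 64-bit filenames reference, then the mapping data, with each record aligned to 8 bytes). The layout is an assumption and `walk_records` is a made-up helper.

```python
import struct

# Hex columns of the .lcovfun dumps of main_msvc.exe and main_gnu_ldc.exe.
MSVC_LCOVFUN = bytes.fromhex("""
    f00d7291 b9c4e0d1 09000000 adc7b132
    709c78b2 7a2f4ca3 0f7973eb 01010001
    01030102 02000000 4b91e14b b97ad085
    09000000 930d6281 11c9a3f1 7a2f4ca3
    0f7973eb 01010001 010a0102 02000000
    0143742f cf00f7d8 09000000 99393549
    695e03a1 7a2f4ca3 0f7973eb 01010001
    01060102 02000000 5f86d072 695ad5b4
    1c000000 3d7da88e 0fccd20d 538568a3
    a07b3068 01010201 05050204 01030101
    09050209 000e0202 09000e07 02010002
""")

GNU_LCOVFUN = bytes.fromhex("""
    f00d7291 b9c4e0d1 09000000 337c1adc
    65394459 7a2f4ca3 0f7973eb 01010001
    01030102 02
""")

# Assumed record header: name hash, mapping size, function hash, filenames ref.
RECORD_HEADER = struct.Struct("<QIQQ")  # 28 bytes, little-endian, unpadded


def walk_records(section):
    """Return a list of (name_hash, mapping_size) for each function record."""
    records = []
    offset = 0
    while offset + RECORD_HEADER.size <= len(section):
        name_hash, size, _func_hash, _files = RECORD_HEADER.unpack_from(section, offset)
        records.append((name_hash, size))
        offset += RECORD_HEADER.size + size
        offset = (offset + 7) & ~7  # next record starts on an 8-byte boundary
    return records


for label, section in (("msvc", MSVC_LCOVFUN), ("gnu", GNU_LCOVFUN)):
    for name_hash, size in walk_records(section):
        print(f"{label}: __covrec_{name_hash:016X} mapping bytes={size}")
```

Read this way, the MSVC section holds four records and the GNU section one. The name hash of the fourth MSVC record, printed as B4D55A6972D0865F, matches the __covrec_ symbol name quoted above, which suggests the assumed layout is the right one.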
Meta
rustc +nightly-x86_64-pc-windows-msvc --version --verbose:
rustc 1.71.0-nightly (b628260df 2023-04-22)
binary: rustc
commit-hash: b628260df0587ae559253d8640ecb8738d3de613
commit-date: 2023-04-22
host: x86_64-pc-windows-msvc
release: 1.71.0-nightly
LLVM version: 16.0.2
rustc +nightly-x86_64-pc-windows-gnu --version --verbose:
rustc 1.71.0-nightly (b628260df 2023-04-22)
binary: rustc
commit-hash: b628260df0587ae559253d8640ecb8738d3de613
commit-date: 2023-04-22
host: x86_64-pc-windows-gnu
release: 1.71.0-nightly
LLVM version: 16.0.2