Windows GNU toolchain with -Cinstrument-coverage generates invalid data #111098

Open
@jmetter

Description

The profraw output of a binary compiled with the x86_64-pc-windows-gnu toolchain is rejected by llvm-profdata as "malformed instrumentation profile data". In addition, the relevant coverage mapping sections (namely .lprfn, .lcovfun) within the EXE do not match the expectations of llvm-cov. Both problems are presumably caused by linker issues concerning COFF file handling in the GNU toolchain.

Given the simple demo program in the attachment, the following works just fine with the MSVC toolchain:

rustc +nightly-x86_64-pc-windows-msvc -Cinstrument-coverage main.rs -o main_msvc.exe
$Env:LLVM_PROFILE_FILE="msvc.profraw"
.\main_msvc.exe
$msvc_sysroot=$(rustc +nightly-x86_64-pc-windows-msvc --print sysroot)
$msvc_tools=$msvc_sysroot + "\lib\rustlib\x86_64-pc-windows-msvc\bin\"
&$msvc_tools\llvm-profdata.exe merge msvc.profraw -o msvc.profdata
&$msvc_tools\llvm-cov show main_msvc.exe --instr-profile msvc.profdata

This outputs correct coverage information. I expected the same behavior from the GNU toolchain:

rustc +nightly-x86_64-pc-windows-gnu -Cinstrument-coverage main.rs -o main_gnu.exe
$Env:LLVM_PROFILE_FILE="gnu.profraw"
.\main_gnu.exe
$gnu_sysroot=$(rustc +nightly-x86_64-pc-windows-gnu --print sysroot)
$gnu_tools=$gnu_sysroot + "\lib\rustlib\x86_64-pc-windows-gnu\bin\"
&$gnu_tools\llvm-profdata.exe merge gnu.profraw -o gnu.profdata

But llvm-profdata failed:

warning: gnu.profraw: malformed instrumentation profile data
error: no profile can be merged

(Sidenote: I had to copy libgcc_s_seh-1.dll and libwinpthread-1.dll into the same folder as llvm-profdata; otherwise the GNU version of the tool fails to launch when called directly, which resulted in a confusing error when it was called indirectly by grcov.)

Investigation

The generated gnu.profraw is shorter and lacks the function names:

msvc.profraw:

00000000   81 72 66 6F 72 70 6C FF 08 00 00 00 00 00 00 00  .rforpl.........
00000010   00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00  ................
00000020   00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00  ................
00000030   00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00  ................
00000040   79 FF FF FF 00 00 00 00 01 40 50 1B F6 7F 00 00  y........@P.ö⌂..
00000050   01 00 00 00 00 00 00 00 F0 0D 72 91 B9 C4 E0 D1  ........ð.r.¹ÄàÑ
00000060   AD C7 B1 32 70 9C 78 B2 80 FF FF FF 00 00 00 00  .Ç±2p.x²........
00000070   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000080   02 00 00 00 00 00 00 00 01 43 74 2F CF 00 F7 D8  .........Ct/Ï.÷Ø
00000090   99 39 35 49 69 5E 03 A1 60 FF FF FF 00 00 00 00  .95Ii^.¡`.......
000000A0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000B0   02 00 00 00 00 00 00 00 4B 91 E1 4B B9 7A D0 85  ........K.áK¹zÐ.
000000C0   93 0D 62 81 11 C9 A3 F1 40 FF FF FF 00 00 00 00  ..b..É£ñ@.......
000000D0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000E0   02 00 00 00 00 00 00 00 5F 86 D0 72 69 5A D5 B4  ........_.ÐriZÕ´
000000F0   3D 7D A8 8E 0F CC D2 0D 20 FF FF FF 00 00 00 00  =}¨..ÌÒ. .......
00000100   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000110   03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000120   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
00000130   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
00000140   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
00000150   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000160   00 00 00 00 00 00 00 00 54 00 5F 52 4E 76 43 73  ........T._RNvCs
00000170   64 53 61 6B 56 33 65 42 31 79 58 5F 34 6D 61 69  dSakV3eB1yX_4mai
00000180   6E 33 66 6F 6F 01 5F 52 4E 76 43 73 64 53 61 6B  n3foo._RNvCsdSak
00000190   56 33 65 42 31 79 58 5F 34 6D 61 69 6E 33 62 61  V3eB1yX_4main3ba
000001A0   72 01 5F 52 4E 76 43 73 64 53 61 6B 56 33 65 42  r._RNvCsdSakV3eB
000001B0   31 79 58 5F 34 6D 61 69 6E 34 6D 61 69 6E 28 00  1yX_4main4main(.
000001C0   5F 52 4E 76 4E 74 43 73 64 53 61 6B 56 33 65 42  _RNvNtCsdSakV3eB
000001D0   31 79 58 5F 34 6D 61 69 6E 35 73 74 75 66 66 38  1yX_4main5stuff8
000001E0   64 6F 5F 73 74 75 66 66                          do_stuff

gnu.profraw:

00000000   81 72 66 6F 72 70 6C FF 08 00 00 00 00 00 00 00  .rforpl.........
00000010   00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
00000020   00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00  ................
00000030   00 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00  ................
00000040   D1 CF FF FF 00 00 00 00 01 B0 29 39 F6 7F 00 00  ÑÏ.......°)9ö⌂..
00000050   01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000060   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000070   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000080   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000090   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
000000A0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
000000B0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
000000C0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000D0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
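
The header fields of the two dumps above can be compared with a small parser. This is only a sketch: it assumes the version 8 raw profile layout of eleven little-endian 64-bit header fields and a 48-byte data record, and the struct and field names are illustrative rather than taken from LLVM.

```rust
/// Fixed-size header at the start of a version 8 `.profraw` file.
/// Assumed layout: eleven little-endian u64 fields (88 bytes in total).
/// The field names are illustrative, not LLVM's own identifiers.
#[derive(Debug, PartialEq)]
pub struct RawHeader {
    pub magic: u64,
    pub version: u64,
    pub binary_ids_size: u64,
    pub num_data: u64,
    pub padding_before_counters: u64,
    pub num_counters: u64,
    pub padding_after_counters: u64,
    pub names_size: u64,
    pub counters_delta: u64,
    pub names_delta: u64,
    pub value_kind_last: u64,
}

pub const HEADER_LEN: usize = 11 * 8;
/// Size of one data record in bytes (assumed, 64-bit target).
pub const RECORD_SIZE: u64 = 48;

/// Parse the header from the first 88 bytes of a raw profile.
pub fn parse_header(bytes: &[u8]) -> Option<RawHeader> {
    if bytes.len() < HEADER_LEN {
        return None;
    }
    let mut f = [0u64; 11];
    for (i, chunk) in bytes[..HEADER_LEN].chunks_exact(8).enumerate() {
        let mut word = [0u8; 8];
        word.copy_from_slice(chunk);
        f[i] = u64::from_le_bytes(word);
    }
    Some(RawHeader {
        magic: f[0],
        version: f[1],
        binary_ids_size: f[2],
        num_data: f[3],
        padding_before_counters: f[4],
        num_counters: f[5],
        padding_after_counters: f[6],
        names_size: f[7],
        counters_delta: f[8],
        names_delta: f[9],
        value_kind_last: f[10],
    })
}

/// File length implied by the header, rounded up to a multiple of 8.
pub fn expected_file_len(h: &RawHeader) -> u64 {
    let unpadded = HEADER_LEN as u64
        + h.binary_ids_size
        + RECORD_SIZE * h.num_data
        + h.padding_before_counters
        + 8 * h.num_counters
        + h.padding_after_counters
        + h.names_size;
    (unpadded + 7) / 8 * 8
}
```

Read this way, msvc.profraw declares 4 data records and a names size of 0x80, while gnu.profraw declares 1 data record and a names size of 3. The implied lengths (0x1E8 and 0xE0) match the sizes of the two dumps.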

The strings appear if -Clink-dead-code is passed:

rustc +nightly-x86_64-pc-windows-gnu -Cinstrument-coverage -Clink-dead-code main.rs -o main_gnu_ldc.exe
$Env:LLVM_PROFILE_FILE="gnu_ldc.profraw"
.\main_gnu_ldc.exe

gnu_ldc.profraw:

00000000   81 72 66 6F 72 70 6C FF 08 00 00 00 00 00 00 00  .rforpl.........
00000010   00 00 00 00 00 00 00 00 05 00 00 00 00 00 00 00  ................
00000020   00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00  ................
00000030   00 00 00 00 00 00 00 00 83 00 00 00 00 00 00 00  ................
00000040   D1 EF FF FF 00 00 00 00 01 50 19 50 F6 7F 00 00  Ñï.......P.Pö⌂..
00000050   01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000060   00 00 00 00 00 00 00 00 F0 0D 72 91 B9 C4 E0 D1  ........ð.r.¹ÄàÑ
00000070   33 7C 1A DC 65 39 44 59 C8 EF FF FF 00 00 00 00  3|.Üe9DYÈï......
00000080   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000090   02 00 00 00 00 00 00 00 01 43 74 2F CF 00 F7 D8  .........Ct/Ï.÷Ø
000000A0   E1 31 81 8A 97 17 AA BA A8 EF FF FF 00 00 00 00  á1....ªº¨ï......
000000B0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000C0   02 00 00 00 00 00 00 00 4B 91 E1 4B B9 7A D0 85  ........K.áK¹zÐ.
000000D0   D9 EE E3 C6 DB 92 8F 50 88 EF FF FF 00 00 00 00  ÙîãÆÛ..P.ï......
000000E0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000F0   02 00 00 00 00 00 00 00 5F 86 D0 72 69 5A D5 B4  ........_.ÐriZÕ´
00000100   3D 7D A8 8E 0F CC D2 0D 68 EF FF FF 00 00 00 00  =}¨..ÌÒ.hï......
00000110   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000120   03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000130   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000140   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000150   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
00000160   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
00000170   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
00000180   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000190   00 00 00 00 00 00 00 00 00 00 00 54 00 5F 52 4E  ...........T._RN
000001A0   76 43 73 64 53 61 6B 56 33 65 42 31 79 58 5F 34  vCsdSakV3eB1yX_4
000001B0   6D 61 69 6E 33 66 6F 6F 01 5F 52 4E 76 43 73 64  main3foo._RNvCsd
000001C0   53 61 6B 56 33 65 42 31 79 58 5F 34 6D 61 69 6E  SakV3eB1yX_4main
000001D0   33 62 61 72 01 5F 52 4E 76 43 73 64 53 61 6B 56  3bar._RNvCsdSakV
000001E0   33 65 42 31 79 58 5F 34 6D 61 69 6E 34 6D 61 69  3eB1yX_4main4mai
000001F0   6E 28 00 5F 52 4E 76 4E 74 43 73 64 53 61 6B 56  n(._RNvNtCsdSakV
00000200   33 65 42 31 79 58 5F 34 6D 61 69 6E 35 73 74 75  3eB1yX_4main5stu
00000210   66 66 38 64 6F 5F 73 74 75 66 66 00 00 00 00 00  ff8do_stuff.....

But this file is still considered malformed by llvm-profdata. Looking at the .lprfn section in the binary:

&$msvc_tools\llvm-objdump.exe -s -j '.lprfn' main_msvc.exe
main_msvc.exe:  file format coff-x86-64
Contents of section .lprfn:
 140034000 0054005f 524e7643 73645361 6b563365  .T._RNvCsdSakV3e
 140034010 42317958 5f346d61 696e3366 6f6f015f  B1yX_4main3foo._
 140034020 524e7643 73645361 6b563365 42317958  RNvCsdSakV3eB1yX
 140034030 5f346d61 696e3362 6172015f 524e7643  _4main3bar._RNvC
 140034040 73645361 6b563365 42317958 5f346d61  sdSakV3eB1yX_4ma
 140034050 696e346d 61696e28 005f524e 764e7443  in4main(._RNvNtC
 140034060 73645361 6b563365 42317958 5f346d61  sdSakV3eB1yX_4ma
 140034070 696e3573 74756666 38646f5f 73747566  in5stuff8do_stuf
 140034080 6600                                 f.
&$gnu_tools\llvm-objdump.exe -s -j '.lprfn' main_gnu_ldc.exe
main_gnu_ldc.exe:       file format coff-x86-64
Contents of section .lprfn:
 140105000 00000000 54005f52 4e764373 6453616b  ....T._RNvCsdSak
 140105010 56336542 3179585f 346d6169 6e33666f  V3eB1yX_4main3fo
 140105020 6f015f52 4e764373 6453616b 56336542  o._RNvCsdSakV3eB
 140105030 3179585f 346d6169 6e336261 72015f52  1yX_4main3bar._R
 140105040 4e764373 6453616b 56336542 3179585f  NvCsdSakV3eB1yX_
 140105050 346d6169 6e346d61 696e2800 5f524e76  4main4main(._RNv
 140105060 4e744373 6453616b 56336542 3179585f  NtCsdSakV3eB1yX_
 140105070 346d6169 6e357374 75666638 646f5f73  4main5stuff8do_s
 140105080 74756666 00000000                    tuff....

Compared to the MSVC version, the GNU version has three extra zero bytes at the front and three at the end.

Both the computation in InstrProfilingPlatformWindows.c

const char COMPILER_RT_SECTION(".lprfn$A") NamesStart = '\0';
const char COMPILER_RT_SECTION(".lprfn$Z") NamesEnd = '\0';

const char *__llvm_profile_begin_names(void) { return &NamesStart + 1; }
const char *__llvm_profile_end_names(void) { return &NamesEnd; }

and the llvm-cov tool (CoverageMappingReader.cpp) expect exactly one padding byte at the front.

  // If this is a linked PE/COFF file, then we have to skip over the null byte
  // that is allocated in the .lprfn$A section in the LLVM profiling runtime.
  const ObjectFile *Obj = Section.getObject();
  if (isa<COFFObjectFile>(Obj) && !Obj->isRelocatableObject())
    Data = Data.drop_front(1);
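
To illustrate why skipping exactly one byte matters, here is a minimal sketch that reads the start of a names blob. It assumes the blob begins with two ULEB128 values (uncompressed size, then compressed size, where 0 means "not compressed"), which is what the `54 00` prefix in the .lprfn dumps suggests. The function names are hypothetical and this is not the LLVM reader.

```rust
/// Decode one unsigned LEB128 value; returns (value, bytes consumed).
pub fn read_uleb128(data: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in data.iter().enumerate() {
        let shift = 7 * i as u32;
        if shift >= 64 {
            return None; // too long for a u64
        }
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // ran out of input before the terminating byte
}

/// Skip `skip` leading padding bytes of a linked .lprfn section and read
/// the assumed (uncompressed size, compressed size) pair that follows.
pub fn names_header(section: &[u8], skip: usize) -> Option<(u64, u64)> {
    let data = section.get(skip..)?;
    let (uncompressed, used) = read_uleb128(data)?;
    let (compressed, _) = read_uleb128(&data[used..])?;
    Some((uncompressed, compressed))
}
```

With the first bytes of the MSVC section (`00 54 00 5f ...`), skipping one byte yields sizes (84, 0), and 84 is the combined length of the first three mangled names including their separators. With the GNU section (`00 00 00 00 54 00 5f ...`), skipping one byte yields (0, 0), so a reader sees an empty name block; only skipping four bytes recovers (84, 0).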

Apparently, the extra padding bytes are generated because GCC outputs 4-byte sections for a single char:

&$gnu_tools\llvm-objdump.exe --all-headers $libprofiler_gnu | Select-String "\.lprf"
 35 .lprfnd$Z                             00000020 0000000000000000 DATA
 36 .lprfnd$A                             00000020 0000000000000000 DATA
 38 .lprfc$Z                              00000004 0000000000000000 DATA
 39 .lprfc$A                              00000004 0000000000000000 DATA
 40 .lprfn$Z                              00000004 0000000000000000 DATA
 41 .lprfn$A                              00000004 0000000000000000 DATA
 42 .lprfd$Z                              00000040 0000000000000000 DATA
 43 .lprfd$A                              00000040 0000000000000000 DATA

whereas MSVC generates 1-byte sections:

&$msvc_tools\llvm-objdump.exe --all-headers $libprofiler_msvc | Select-String "\.lprf"
 13 .lprfd$A      00000030 0000000000000000 DATA
 14 .lprfd$Z      00000030 0000000000000000 DATA
 15 .lprfn$A      00000001 0000000000000000 DATA
 16 .lprfn$Z      00000001 0000000000000000 DATA
 17 .lprfc$A      00000001 0000000000000000 DATA
 18 .lprfc$Z      00000001 0000000000000000 DATA
 20 .lprfnd$A     00000018 0000000000000000 DATA
 21 .lprfnd$Z     00000018 0000000000000000 DATA

Also note that there is a comparable issue with .lprfd: the marker should occupy 48 bytes (sizeof(__llvm_profile_data)), but GCC allocates it in a 64-byte section.
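
The effect of the oversized marker sections on the record count can be checked with a little arithmetic. The sketch below assumes the runtime takes the data start as "start marker + one record" and rounds the byte range up to whole 48-byte records; both the rounding rule and the function names are assumptions made for illustration.

```rust
/// Size of one data record in bytes, as stated in the text.
pub const RECORD_SIZE: u64 = 48;

/// Padding left between "start marker + one element" and the first real
/// payload byte when the marker lives in a larger section than it needs.
pub fn leading_padding(marker_section_size: u64, marker_size: u64) -> u64 {
    marker_section_size.saturating_sub(marker_size)
}

/// Number of records reported for `real_records` records preceded by
/// `padding` stray bytes, assuming the byte range is rounded up to whole
/// records.
pub fn reported_records(padding: u64, real_records: u64) -> u64 {
    (padding + real_records * RECORD_SIZE + RECORD_SIZE - 1) / RECORD_SIZE
}
```

Under these assumptions a 64-byte marker section leaves 16 bytes of padding, so four real records are reported as five and zero real records as one. That is consistent with the record counts 5 and 1 in the gnu_ldc.profraw and gnu.profraw headers, and with the 16 zero bytes that precede the first record in gnu_ldc.profraw. The same calculation for the 4-byte .lprfn marker gives the three stray name bytes.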

I was able to work around this by building profiler_builtins out-of-tree (similar to minicov) and providing the $A and $Z sections/symbols from Rust code. I did not find a switch in GCC to generate properly sized sections, but I may have missed it.

This way, llvm-profdata accepted the generated profraw and llvm-cov accepted the EXE file. The resulting coverage was still unsatisfactory, as it contained coverage information for only a single section.

The MSVC-generated .lcovfun section describes all four functions:

&$msvc_tools\llvm-objdump.exe -s -j '.lcovfun' main_msvc.exe
main_msvc.exe:  file format coff-x86-64
Contents of section .lcovfun:
 140032000 f00d7291 b9c4e0d1 09000000 adc7b132  ..r............2
 140032010 709c78b2 7a2f4ca3 0f7973eb 01010001  p.x.z/L..ys.....
 140032020 01030102 02000000 4b91e14b b97ad085  ........K..K.z..
 140032030 09000000 930d6281 11c9a3f1 7a2f4ca3  ......b.....z/L.
 140032040 0f7973eb 01010001 010a0102 02000000  .ys.............
 140032050 0143742f cf00f7d8 09000000 99393549  .Ct/.........95I
 140032060 695e03a1 7a2f4ca3 0f7973eb 01010001  i^..z/L..ys.....
 140032070 01060102 02000000 5f86d072 695ad5b4  ........_..riZ..
 140032080 1c000000 3d7da88e 0fccd20d 538568a3  ....=}......S.h.
 140032090 a07b3068 01010201 05050204 01030101  .{0h............
 1400320a0 09050209 000e0202 09000e07 02010002  ................

The four functions are also present in the GNU-generated object file:

rustc +nightly-x86_64-pc-windows-gnu -Cinstrument-coverage -Clink-dead-code --emit=obj main.rs -o main_gnu_ldc.o
&$gnu_tools\llvm-objdump.exe -s -j '.lcovfun$M' main_gnu_ldc.o
main_gnu_ldc.o: file format coff-x86-64
Contents of section .lcovfun$M:
 0000 4b91e14b b97ad085 09000000 d9eee3c6  K..K.z..........
 0010 db928f50 73a65812 005f97d4 01010001  ...Ps.X.._......
 0020 010a0102 02                          .....
Contents of section .lcovfun$M:
 0000 5f86d072 695ad5b4 1c000000 3d7da88e  _..riZ......=}..
 0010 0fccd20d 73a65812 005f97d4 01020201  ....s.X.._......
 0020 05050204 01030101 09050209 000e0202  ................
 0030 09000e07 02010002                    ........
Contents of section .lcovfun$M:
 0000 f00d7291 b9c4e0d1 09000000 337c1adc  ..r.........3|..
 0010 65394459 73a65812 005f97d4 01010001  e9DYs.X.._......
 0020 01030102 02                          .....
Contents of section .lcovfun$M:
 0000 0143742f cf00f7d8 09000000 e131818a  .Ct/.........1..
 0010 9717aaba 73a65812 005f97d4 01010001  ....s.X.._......
 0020 01060102 02                          .....

Each of the sections has a symbol with a unique name (__covrec_B4D55A6972D0865Fu = linkonce_odr hidden constant, ...), but the linker still keeps only a single symbol in the generated binary:

&$gnu_tools\llvm-objdump.exe -s -j '.lcovfun' main_gnu_ldc.exe
main_gnu_ldc.exe:       file format coff-x86-64
Contents of section .lcovfun:
 140103000 f00d7291 b9c4e0d1 09000000 337c1adc  ..r.........3|..
 140103010 65394459 7a2f4ca3 0f7973eb 01010001  e9DYz/L..ys.....
 140103020 01030102 02
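
The number of function records in a linked .lcovfun section can be counted with a small splitter. This sketch assumes each record has a 28-byte fixed prefix (64-bit name hash, 32-bit mapping size, 64-bit function hash, 64-bit filenames reference) followed by the mapping bytes, with records padded to 8-byte boundaries, as the dumps above suggest. The type and function names are hypothetical.

```rust
/// Fixed prefix of one function record (layout assumed from the dumps).
#[derive(Debug, PartialEq)]
pub struct CovFunRecord {
    pub name_hash: u64,
    pub data_size: u32,
    pub func_hash: u64,
    pub filenames_ref: u64,
}

const FIXED_LEN: usize = 8 + 4 + 8 + 8;

fn le_u64(b: &[u8]) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&b[..8]);
    u64::from_le_bytes(a)
}

fn le_u32(b: &[u8]) -> u32 {
    let mut a = [0u8; 4];
    a.copy_from_slice(&b[..4]);
    u32::from_le_bytes(a)
}

/// Split a linked .lcovfun section into records, stopping at the first
/// record that does not fit into the remaining bytes.
pub fn split_covfun(section: &[u8]) -> Vec<CovFunRecord> {
    let mut records = Vec::new();
    let mut pos = 0usize;
    while pos + FIXED_LEN <= section.len() {
        let data_size = le_u32(&section[pos + 8..]);
        let end = pos + FIXED_LEN + data_size as usize;
        if end > section.len() {
            break; // truncated record
        }
        records.push(CovFunRecord {
            name_hash: le_u64(&section[pos..]),
            data_size,
            func_hash: le_u64(&section[pos + 12..]),
            filenames_ref: le_u64(&section[pos + 20..]),
        });
        pos = (end + 7) / 8 * 8; // next record starts 8-byte aligned
    }
    records
}
```

Applied to the 37 bytes of the GNU executable's section shown above, this yields exactly one record (name hash 0xD1E0C4B991720DF0, mapping size 9).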

I did not find a solution for the linker dropping the other covrecs.

Meta

rustc +nightly-x86_64-pc-windows-msvc --version --verbose:

rustc 1.71.0-nightly (b628260df 2023-04-22)
binary: rustc
commit-hash: b628260df0587ae559253d8640ecb8738d3de613
commit-date: 2023-04-22
host: x86_64-pc-windows-msvc
release: 1.71.0-nightly
LLVM version: 16.0.2

rustc +nightly-x86_64-pc-windows-gnu --version --verbose:

rustc 1.71.0-nightly (b628260df 2023-04-22)
binary: rustc
commit-hash: b628260df0587ae559253d8640ecb8738d3de613
commit-date: 2023-04-22
host: x86_64-pc-windows-gnu
release: 1.71.0-nightly
LLVM version: 16.0.2

demo.zip

Labels

A-code-coverage: Area: Source-based code coverage (-Cinstrument-coverage)
C-bug: Category: This is a bug.
O-windows-gnu: Toolchain: GNU, Operating system: Windows