Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reland: [llvm-cov][WebAssembly] Read __llvm_prf_names from data segments #112569

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kateinoigakukun
Copy link
Member

On WebAssembly, most coverage metadata contents read by llvm-cov (like __llvm_covmap and __llvm_covfun) are stored in custom sections because they are not referenced at runtime. However, __llvm_prf_names is referenced at runtime by the profile runtime library and is read by llvm-cov post-processing tools, so it needs to be stored in a data segment, which is allocatable at runtime and accessible by tools as long as "name" section is present in the binary.

This patch changes the way llvm-cov reads __llvm_prf_names on WebAssembly. Instead of looking for a section, it looks for a data segment with the same name.

This reverts commit 157f10d and fixes PE/COFF .lprfn$A section handling.

…ments

On WebAssembly, most coverage metadata contents read by llvm-cov (like
`__llvm_covmap` and `__llvm_covfun`) are stored in custom sections
because they are not referenced at runtime. However, `__llvm_prf_names`
is referenced at runtime by the profile runtime library and is read
by llvm-cov post-processing tools, so it needs to be stored in a data
segment, which is allocatable at runtime and accessible by tools as long
as "name" section is present in the binary.

This patch changes the way llvm-cov reads `__llvm_prf_names` on
WebAssembly. Instead of looking for a section, it looks for a data
segment with the same name.

This reverts commit 157f10d and fixes
PE/COFF `.lprfn$A` section handling.
@kateinoigakukun kateinoigakukun marked this pull request as ready for review October 16, 2024 18:56
@llvmbot llvmbot added the PGO Profile Guided Optimizations label Oct 16, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 16, 2024

@llvm/pr-subscribers-pgo

Author: Yuta Saito (kateinoigakukun)

Changes

On WebAssembly, most coverage metadata contents read by llvm-cov (like __llvm_covmap and __llvm_covfun) are stored in custom sections because they are not referenced at runtime. However, __llvm_prf_names is referenced at runtime by the profile runtime library and is read by llvm-cov post-processing tools, so it needs to be stored in a data segment, which is allocatable at runtime and accessible by tools as long as "name" section is present in the binary.

This patch changes the way llvm-cov reads __llvm_prf_names on WebAssembly. Instead of looking for a section, it looks for a data segment with the same name.

This reverts commit 157f10d and fixes PE/COFF .lprfn$A section handling.


Full diff: https://github.com/llvm/llvm-project/pull/112569.diff

4 Files Affected:

  • (modified) llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp (+74-18)
  • (added) llvm/test/tools/llvm-cov/Inputs/binary-formats.v6.wasm32 ()
  • (added) llvm/test/tools/llvm-cov/Inputs/binary-formats.wasm.proftext (+4)
  • (modified) llvm/test/tools/llvm-cov/binary-formats.c (+7)
diff --git a/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp b/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
index 8881bffe41c57c..e507b241d98284 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
@@ -18,12 +18,14 @@
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/ADT/StringRef.h"
+#include "llvm/BinaryFormat/Wasm.h"
 #include "llvm/Object/Archive.h"
 #include "llvm/Object/Binary.h"
 #include "llvm/Object/COFF.h"
 #include "llvm/Object/Error.h"
 #include "llvm/Object/MachOUniversal.h"
 #include "llvm/Object/ObjectFile.h"
+#include "llvm/Object/Wasm.h"
 #include "llvm/ProfileData/InstrProf.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Compression.h"
@@ -492,22 +494,29 @@ Expected<bool> RawCoverageMappingDummyChecker::isDummy() {
   return Tag == Counter::Zero;
 }
 
-Error InstrProfSymtab::create(SectionRef &Section) {
-  Expected<StringRef> DataOrErr = Section.getContents();
-  if (!DataOrErr)
-    return DataOrErr.takeError();
-  Data = *DataOrErr;
-  Address = Section.getAddress();
-
+/// Determine if we should skip the first byte of the section content
+static bool shouldSkipSectionFirstByte(SectionRef &Section) {
+  const ObjectFile *Obj = Section.getObject();
   // If this is a linked PE/COFF file, then we have to skip over the null byte
   // that is allocated in the .lprfn$A section in the LLVM profiling runtime.
   // If the name section is .lprfcovnames, it doesn't have the null byte at the
   // beginning.
-  const ObjectFile *Obj = Section.getObject();
   if (isa<COFFObjectFile>(Obj) && !Obj->isRelocatableObject())
     if (Expected<StringRef> NameOrErr = Section.getName())
       if (*NameOrErr != getInstrProfSectionName(IPSK_covname, Triple::COFF))
-        Data = Data.drop_front(1);
+        return true;
+  return false;
+}
+
+Error InstrProfSymtab::create(SectionRef &Section) {
+  Expected<StringRef> DataOrErr = Section.getContents();
+  if (!DataOrErr)
+    return DataOrErr.takeError();
+  Data = *DataOrErr;
+  Address = Section.getAddress();
+
+  if (shouldSkipSectionFirstByte(Section))
+    Data = Data.substr(1);
 
   return Error::success();
 }
@@ -1077,6 +1086,56 @@ lookupSections(ObjectFile &OF, InstrProfSectKind IPSK) {
   return Sections;
 }
 
+/// Find a section that matches \p Name and is allocatable at runtime.
+///
+/// Returns the contents of the section and its start offset in the object file.
+static Expected<std::pair<StringRef, uint64_t>>
+lookupAllocatableSection(ObjectFile &OF, InstrProfSectKind IPSK) {
+  // On Wasm, allocatable sections can live only in data segments.
+  if (auto *WOF = dyn_cast<WasmObjectFile>(&OF)) {
+    std::vector<const WasmSegment *> Segments;
+    auto ObjFormat = OF.getTripleObjectFormat();
+    auto Name =
+        getInstrProfSectionName(IPSK, ObjFormat, /*AddSegmentInfo=*/false);
+    for (const auto &DebugName : WOF->debugNames()) {
+      if (DebugName.Type != wasm::NameType::DATA_SEGMENT ||
+          DebugName.Name != Name)
+        continue;
+      if (DebugName.Index >= WOF->dataSegments().size())
+        return make_error<CoverageMapError>(coveragemap_error::malformed);
+      auto &Segment = WOF->dataSegments()[DebugName.Index];
+      Segments.push_back(&Segment);
+    }
+    if (Segments.empty())
+      return make_error<CoverageMapError>(coveragemap_error::no_data_found);
+    if (Segments.size() != 1)
+      return make_error<CoverageMapError>(coveragemap_error::malformed);
+
+    const auto &Segment = *Segments.front();
+    auto &Data = Segment.Data;
+    StringRef Content(reinterpret_cast<const char *>(Data.Content.data()),
+                      Data.Content.size());
+    return std::make_pair(Content, Segment.SectionOffset);
+  }
+
+  // On other object file types, delegate to lookupSections to find the section.
+  auto Sections = lookupSections(OF, IPSK);
+  if (!Sections)
+    return Sections.takeError();
+  if (Sections->size() != 1)
+    return make_error<CoverageMapError>(
+        coveragemap_error::malformed,
+        "the size of coverage mapping section is not one");
+  auto &Section = Sections->front();
+  auto ContentsOrErr = Section.getContents();
+  if (!ContentsOrErr)
+    return ContentsOrErr.takeError();
+  auto Content = *ContentsOrErr;
+  if (shouldSkipSectionFirstByte(Section))
+    Content = Content.drop_front(1);
+  return std::make_pair(Content, Section.getAddress());
+}
+
 static Expected<std::unique_ptr<BinaryCoverageReader>>
 loadBinaryFormat(std::unique_ptr<Binary> Bin, StringRef Arch,
                  StringRef CompilationDir = "",
@@ -1107,23 +1166,20 @@ loadBinaryFormat(std::unique_ptr<Binary> Bin, StringRef Arch,
 
   // Look for the sections that we are interested in.
   auto ProfileNames = std::make_unique<InstrProfSymtab>();
-  std::vector<SectionRef> NamesSectionRefs;
   // If IPSK_name is not found, fallback to search for IPK_covname, which is
   // used when binary correlation is enabled.
-  auto NamesSection = lookupSections(*OF, IPSK_name);
+  auto NamesSection = lookupAllocatableSection(*OF, IPSK_name);
   if (auto E = NamesSection.takeError()) {
     consumeError(std::move(E));
-    NamesSection = lookupSections(*OF, IPSK_covname);
+    NamesSection = lookupAllocatableSection(*OF, IPSK_covname);
     if (auto E = NamesSection.takeError())
       return std::move(E);
   }
-  NamesSectionRefs = *NamesSection;
 
-  if (NamesSectionRefs.size() != 1)
-    return make_error<CoverageMapError>(
-        coveragemap_error::malformed,
-        "the size of coverage mapping section is not one");
-  if (Error E = ProfileNames->create(NamesSectionRefs.back()))
+  uint64_t NamesAddress;
+  StringRef NamesContent;
+  std::tie(NamesContent, NamesAddress) = *NamesSection;
+  if (Error E = ProfileNames->create(NamesContent, NamesAddress))
     return std::move(E);
 
   auto CoverageSection = lookupSections(*OF, IPSK_covmap);
diff --git a/llvm/test/tools/llvm-cov/Inputs/binary-formats.v6.wasm32 b/llvm/test/tools/llvm-cov/Inputs/binary-formats.v6.wasm32
new file mode 100755
index 00000000000000..5a606d5a2f69fd
Binary files /dev/null and b/llvm/test/tools/llvm-cov/Inputs/binary-formats.v6.wasm32 differ
diff --git a/llvm/test/tools/llvm-cov/Inputs/binary-formats.wasm.proftext b/llvm/test/tools/llvm-cov/Inputs/binary-formats.wasm.proftext
new file mode 100644
index 00000000000000..20fc3816c2255a
--- /dev/null
+++ b/llvm/test/tools/llvm-cov/Inputs/binary-formats.wasm.proftext
@@ -0,0 +1,4 @@
+__main_argc_argv
+0x0
+1
+100
diff --git a/llvm/test/tools/llvm-cov/binary-formats.c b/llvm/test/tools/llvm-cov/binary-formats.c
index a5bfc012860ec3..bb61b288cfc62e 100644
--- a/llvm/test/tools/llvm-cov/binary-formats.c
+++ b/llvm/test/tools/llvm-cov/binary-formats.c
@@ -10,4 +10,11 @@ int main(int argc, const char *argv[]) {}
 // RUN: llvm-cov show %S/Inputs/binary-formats.v3.macho64l -instr-profile %t.profdata -path-equivalence=/tmp,%S %s | FileCheck %s
 // RUN: llvm-cov show %S/Inputs/binary-formats.v6.linux64l -instr-profile %t.profdata -path-equivalence=/tmp,%S %s | FileCheck %s
 
+// RUN: llvm-profdata merge %S/Inputs/binary-formats.wasm.proftext -o %t.wasm.profdata
+// NOTE: The wasm binary is built with the following command:
+//   clang -target wasm32-unknown-wasi %s -o %S/Inputs/binary-formats.v6.wasm32 \
+//     -mllvm -enable-name-compression=false \
+//     -fprofile-instr-generate -fcoverage-mapping -lwasi-emulated-getpid -lwasi-emulated-mman
+// RUN: llvm-cov show %S/Inputs/binary-formats.v6.wasm32 -instr-profile %t.wasm.profdata -path-equivalence=/tmp,%S %s | FileCheck %s
+
 // RUN: llvm-cov export %S/Inputs/binary-formats.macho64l -instr-profile %t.profdata | FileCheck %S/Inputs/binary-formats.canonical.json

@kateinoigakukun
Copy link
Member Author

@glandium Would you mind testing this patch with your PGO setup? 🙏

@glandium
Copy link
Contributor

Unfortunately, it still fails with the same error. I also noticed other error messages I had missed in the past (but after checking, they were there on the first landing):

LLVM Profile Error: Invalid profile data to merge
LLVM Profile Error: Profile Merging of file D:\task_172914326170851\fetches\llvm-project/build/profiles/10870457539148555413_0.profraw failed: File exists
LLVM Profile Error: Failed to write file "D:\task_172914326170851\fetches\llvm-project/build/profiles/10870457539148555413_0.profraw": File exists

@kateinoigakukun
Copy link
Member Author

@glandium Thank you. That's really helpful information. I think #112695 might fix the failure.

@mstorsjo
Copy link
Member

Unfortunately, it still fails with the same error. I also noticed other error messages I had missed in the past (but after checking, they were there on the first landing):

LLVM Profile Error: Invalid profile data to merge
LLVM Profile Error: Profile Merging of file D:\task_172914326170851\fetches\llvm-project/build/profiles/10870457539148555413_0.profraw failed: File exists
LLVM Profile Error: Failed to write file "D:\task_172914326170851\fetches\llvm-project/build/profiles/10870457539148555413_0.profraw": File exists

@glandium Is your PGO setup still broken, after the fixup/partial revert that was pushed in 157f10d? I.e. does this reapply attempt still break things for you, or are things still broken on main?

@mstorsjo
Copy link
Member

FWIW, this reland of the patch does indeed seem like it still works for llvm-cov for reading the profile data from Windows (on the sample files from yesterday). I haven't tested building a full toolchain with this in place and running the compiler-rt tests again. I don't know about the rest of the big picture here, or @glandium's PGO setup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants