[Coverage] Use the proper endianness for reading profile data fields #136427
When a program compiled with the `--profile-correlate` option for a big-endian target is run under qemu, it generates a profraw file in big-endian format. Processing that profraw file with llvm-profdata on a little-endian host then fails with "malformed instrumentation profile data".

The cause is as follows. When llvm-profdata creates a profdata file, it pushes one element per function into `Data`, a vector of per-function profile data structures. Each element of `Data` holds the profile metadata for its function. In correlation mode, the tool fills in these elements from debug info or a binary file and writes the metadata into each structure in host endianness. llvm-profdata then reads the structure's fields while compensating for the endianness difference between the host and the profraw file, as if the structure had been mapped into memory from the profraw file. As a result, it byte-swaps fields that are already in host order and reads them in the wrong endianness.

To fix this, read the `Data` element fields without swapping bytes in correlation mode.
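The mismatch described above can be sketched in isolation. This is a minimal illustration, not the LLVM implementation: `byteSwap64`, `FieldReader`, `ShouldSwapBytes`, and `HasCorrelator` are hypothetical names standing in for the reader's `swap` helper and its `Correlator` check, and the sketch assumes the correlator has already stored the field in host byte order, as the description states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the reader's swap helper:
// reverses the byte order of a 64-bit value.
constexpr uint64_t byteSwap64(uint64_t V) {
  uint64_t R = 0;
  for (int I = 0; I < 8; ++I) {
    R = (R << 8) | (V & 0xFF);
    V >>= 8;
  }
  return R;
}

// Hypothetical, simplified model of reading one field of a Data record.
struct FieldReader {
  bool ShouldSwapBytes; // profraw endianness differs from the host
  bool HasCorrelator;   // Data records were built in host memory

  // Before the fix: always compensate for the file's endianness,
  // even when the record never came from the file.
  uint64_t readBeforeFix(uint64_t Field) const {
    return ShouldSwapBytes ? byteSwap64(Field) : Field;
  }

  // After the fix: a record built by the correlator is assumed to be
  // in host order already, so it is returned untouched.
  uint64_t readAfterFix(uint64_t Field) const {
    if (HasCorrelator)
      return Field;
    return ShouldSwapBytes ? byteSwap64(Field) : Field;
  }
};

// Walks through the big-endian-profraw-on-little-endian-host case.
inline void demonstrateMismatch() {
  const uint64_t HostOrderHash = 0x1122334455667788ULL;
  FieldReader Reader{/*ShouldSwapBytes=*/true, /*HasCorrelator=*/true};
  // The unconditional swap corrupts a value that was already correct.
  assert(Reader.readBeforeFix(HostOrderHash) != HostOrderHash);
  // Skipping the swap in correlation mode preserves it.
  assert(Reader.readAfterFix(HostOrderHash) == HostOrderHash);
}
```

When no correlator is present, both readers behave identically, which is why the change only affects correlation mode.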
@llvm/pr-subscribers-pgo
Author: Roman Beliaev (belyaevrd)
Full diff: https://github.com/llvm/llvm-project/pull/136427.diff
3 Files Affected:
diff --git a/llvm/include/llvm/ProfileData/InstrProfReader.h b/llvm/include/llvm/ProfileData/InstrProfReader.h
index f1010b312ee56..2ad27b9925038 100644
--- a/llvm/include/llvm/ProfileData/InstrProfReader.h
+++ b/llvm/include/llvm/ProfileData/InstrProfReader.h
@@ -490,7 +490,7 @@ class RawInstrProfReader : public InstrProfReader {
}
StringRef getName(uint64_t NameRef) const {
- return Symtab->getFuncOrVarName(swap(NameRef));
+ return Symtab->getFuncOrVarName(Correlator ? NameRef : swap(NameRef));
}
int getCounterTypeSize() const {
diff --git a/llvm/lib/ProfileData/InstrProfCorrelator.cpp b/llvm/lib/ProfileData/InstrProfCorrelator.cpp
index d92107f93dc56..8034cab5bd3f5 100644
--- a/llvm/lib/ProfileData/InstrProfCorrelator.cpp
+++ b/llvm/lib/ProfileData/InstrProfCorrelator.cpp
@@ -291,7 +291,7 @@ void InstrProfCorrelatorImpl<IntPtrT>::addDataProbe(uint64_t NameRef,
maybeSwap<uint64_t>(CFGHash),
// In this mode, CounterPtr actually stores the section relative address
// of the counter.
- maybeSwap<IntPtrT>(CounterOffset),
+ CounterOffset,
// TODO: MC/DC is not yet supported.
/*BitmapOffset=*/maybeSwap<IntPtrT>(0),
maybeSwap<IntPtrT>(FunctionPtr),
diff --git a/llvm/lib/ProfileData/InstrProfReader.cpp b/llvm/lib/ProfileData/InstrProfReader.cpp
index 4075b513c218d..b2cf592ee69d2 100644
--- a/llvm/lib/ProfileData/InstrProfReader.cpp
+++ b/llvm/lib/ProfileData/InstrProfReader.cpp
@@ -553,10 +553,11 @@ Error RawInstrProfReader<IntPtrT>::createSymtab(InstrProfSymtab &Symtab) {
StringRef(VNamesStart, VNamesEnd - VNamesStart)))
return error(std::move(E));
for (const RawInstrProf::ProfileData<IntPtrT> *I = Data; I != DataEnd; ++I) {
- const IntPtrT FPtr = swap(I->FunctionPointer);
+ const IntPtrT FPtr =
+ Correlator ? I->FunctionPointer : swap(I->FunctionPointer);
if (!FPtr)
continue;
- Symtab.mapAddress(FPtr, swap(I->NameRef));
+ Symtab.mapAddress(FPtr, Correlator ? I->NameRef : swap(I->NameRef));
}
if (VTableBegin != nullptr && VTableEnd != nullptr) {
@@ -711,18 +712,20 @@ Error RawInstrProfReader<IntPtrT>::readName(NamedInstrProfRecord &Record) {
template <class IntPtrT>
Error RawInstrProfReader<IntPtrT>::readFuncHash(NamedInstrProfRecord &Record) {
- Record.Hash = swap(Data->FuncHash);
+ Record.Hash = Correlator ? Data->FuncHash : swap(Data->FuncHash);
return success();
}
template <class IntPtrT>
Error RawInstrProfReader<IntPtrT>::readRawCounts(
InstrProfRecord &Record) {
- uint32_t NumCounters = swap(Data->NumCounters);
+ uint32_t NumCounters =
+ Correlator ? Data->NumCounters : swap(Data->NumCounters);
if (NumCounters == 0)
return error(instrprof_error::malformed, "number of counters is zero");
- ptrdiff_t CounterBaseOffset = swap(Data->CounterPtr) - CountersDelta;
+ ptrdiff_t CounterBaseOffset =
+ Correlator ? Data->CounterPtr : swap(Data->CounterPtr) - CountersDelta;
if (CounterBaseOffset < 0)
return error(
instrprof_error::malformed,
@@ -754,8 +757,8 @@ Error RawInstrProfReader<IntPtrT>::readRawCounts(
uint64_t TimestampValue = swap(*reinterpret_cast<const uint64_t *>(Ptr));
if (TimestampValue != 0 &&
TimestampValue != std::numeric_limits<uint64_t>::max()) {
- TemporalProfTimestamps.emplace_back(TimestampValue,
- swap(Data->NameRef));
+ TemporalProfTimestamps.emplace_back(
+ TimestampValue, Correlator ? Data->NameRef : swap(Data->NameRef));
TemporalProfTraceStreamSize = 1;
}
if (hasSingleByteCoverage()) {
@@ -785,7 +788,8 @@ Error RawInstrProfReader<IntPtrT>::readRawCounts(
template <class IntPtrT>
Error RawInstrProfReader<IntPtrT>::readRawBitmapBytes(InstrProfRecord &Record) {
- uint32_t NumBitmapBytes = swap(Data->NumBitmapBytes);
+ uint32_t NumBitmapBytes =
+ Correlator ? Data->NumBitmapBytes : swap(Data->NumBitmapBytes);
Record.BitmapBytes.clear();
Record.BitmapBytes.reserve(NumBitmapBytes);
@@ -796,7 +800,8 @@ Error RawInstrProfReader<IntPtrT>::readRawBitmapBytes(InstrProfRecord &Record) {
return success();
// BitmapDelta decreases as we advance to the next data record.
- ptrdiff_t BitmapOffset = swap(Data->BitmapPtr) - BitmapDelta;
+ ptrdiff_t BitmapOffset =
+ Correlator ? Data->BitmapPtr : swap(Data->BitmapPtr) - BitmapDelta;
if (BitmapOffset < 0)
return error(
instrprof_error::malformed,
Can you include a test?