[LLDB] Fix Memory64 BaseRVA, move all non-stack memory to Mem64. #146777
Conversation
@llvm/pr-subscribers-lldb Author: Jacob Lalonde (Jlalond)

Changes

Context

Over a year ago, I landed support for 64b memory ranges in Minidump (#95312). In that patch we added the Memory64 list stream, which is effectively a linked list on disk. The layout is a sixteen-byte header followed by however many memory descriptors.

The Bug

This is a classic off-by-one error, where I added 8 bytes instead of 16 for the header. This caused the first region to start 8 bytes before the correct RVA, thus shifting all actual memory by 8 bytes. However, because this artificially introduced 8 bytes to the first region, all subsequent regions remained aligned and thus uncorrupted.

Why wasn't this caught?

One problem we've had is forcing Minidump to actually use the 64b mode; it would be a massive waste of resources to have a test that actually wrote >4.2gb of IO to validate the 64b regions, and so almost all validation has been manual. As a weakness of manual testing, this issue is pseudo non-deterministic, because which region ends up as the first 64b region is up to chance. We try to fit all regions into 32b, only greedily moving regions into 64b if adding them to the 32b bucket would cause an overflow. Additionally, the order matters, and we effectively implicitly get the order from /proc/pid/maps.

Why is this showing up now?

In #138206, as a part of my effort on SBSaveCore, I fixed some bugs where super regions and subregions were not being merged correctly. My hypothesis is that because we're now merging adjacent regions with the same permissions, the chance of a valid-data-containing region ending up as the first region increased. Additionally, as I worked on this, more users reported these errors that just appeared 'random' prior.

How do we prevent future regressions?

To prevent regressions, and honestly to save my sanity in figuring out where 8 bytes magically came from, I've added two new APIs to SBSaveCoreOptions.
`SBSaveCoreOptions::GetMemoryRegionsToSave()` and `SBSaveCoreOptions::AddFlag()`

Leveraging these two new APIs, I added a new test class that creates a minidump and compares all the regions byte for byte to ensure we don't regress on the 64b region again.

Snippet produced by minidump.py, showing we are forcing 64b on the test Minidump.

Can we fix existing Minidumps?

Yes. None of the data itself is corrupted; we're simply reading the first element 8 bytes too early. I don't think we can automate this, but any Minidump produced with the LLDB stream on a date before this patch lands can have its RVA updated by 8 and will work fine. I want to repeat: we were not corrupting the data, the read pointer is simply 8 bytes short of where it should begin.

Misc

As a part of this fix I had to look at LLDB logs a lot; you'll notice I added 0x to many of the PRIx64 formats in LLDB_LOGF.

CC: @DavidSpickett, @da-viper @labath because we've been working together on save-core plugins; review is optional and I didn't tag you, but figured you'd want to know.

Patch is 20.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/146777.diff 10 Files Affected:
diff --git a/lldb/bindings/interface/SBSaveCoreOptionsDocstrings.i b/lldb/bindings/interface/SBSaveCoreOptionsDocstrings.i
index 6907164a1b95c..a44c0569f40e6 100644
--- a/lldb/bindings/interface/SBSaveCoreOptionsDocstrings.i
+++ b/lldb/bindings/interface/SBSaveCoreOptionsDocstrings.i
@@ -63,6 +63,18 @@ Note that currently ELF Core files are not supported."
Get an SBThreadCollection of all threads marked to be saved. This collection is not sorted according to insertion order."
) lldb::SBSaveCoreOptions::GetThreadsToSave;
+
+%feature("docstring", "
+ Get an SBMemoryRegionInfoList of all the regions that LLDB will attempt to write into the core. Note, reading from these
+ regions can fail, and it's not guaranteed every region will be present. If called without a valid process or style set, an empty
+ collection will be returned."
+) lldb::SBSaveCoreOptions::GetMemoryRegionsToSave;
+
+%feature("docstring", "
+ Add a plugin specific flag to the objects option."
+) lldb::SBSaveCoreOptions::AddFlag;
+
+
%feature("docstring", "
Get the current total number of bytes the core is expected to have, excluding the overhead of the core file format.
Requires both a Process and a Style to be specified. An error will be returned if the provided options would result in no data being saved."
diff --git a/lldb/include/lldb/API/SBMemoryRegionInfoList.h b/lldb/include/lldb/API/SBMemoryRegionInfoList.h
index 1d939dff55faa..8ac9c1aceb6f6 100644
--- a/lldb/include/lldb/API/SBMemoryRegionInfoList.h
+++ b/lldb/include/lldb/API/SBMemoryRegionInfoList.h
@@ -45,6 +45,7 @@ class LLDB_API SBMemoryRegionInfoList {
private:
friend class SBProcess;
+ friend class SBSaveCoreOptions;
lldb_private::MemoryRegionInfos &ref();
diff --git a/lldb/include/lldb/API/SBSaveCoreOptions.h b/lldb/include/lldb/API/SBSaveCoreOptions.h
index 37552c13d0f36..e72dd53780ab9 100644
--- a/lldb/include/lldb/API/SBSaveCoreOptions.h
+++ b/lldb/include/lldb/API/SBSaveCoreOptions.h
@@ -12,6 +12,7 @@
#include "lldb/API/SBDefines.h"
#include "lldb/API/SBError.h"
#include "lldb/API/SBFileSpec.h"
+#include "lldb/API/SBMemoryRegionInfoList.h"
#include "lldb/API/SBProcess.h"
#include "lldb/API/SBThread.h"
#include "lldb/API/SBThreadCollection.h"
@@ -119,6 +120,13 @@ class LLDB_API SBSaveCoreOptions {
/// an empty collection will be returned.
SBThreadCollection GetThreadsToSave() const;
+ /// Get an unsorted copy of all memory regions to save
+ ///
+ /// \returns
+ /// An unsorted copy of all memory regions to save. If no process or style
+ /// is specified an empty collection will be returned.
+ SBMemoryRegionInfoList GetMemoryRegionsToSave();
+
/// Get the current total number of bytes the core is expected to have
/// excluding the overhead of the core file format. Requires a Process and
/// Style to be specified.
@@ -132,6 +140,14 @@ class LLDB_API SBSaveCoreOptions {
/// The expected size of the data contained in the core in bytes.
uint64_t GetCurrentSizeInBytes(SBError &error);
+ /// Add a flag to be consumed by the specified plugin; null or empty flags
+ /// will be ignored.
+ ///
+ /// \note
+ /// This API is currently only used for testing; forcing Minidumps to use
+ /// the 64b memory list is the reason this API was added.
+ void AddFlag(const char *flag);
+
/// Reset all options.
void Clear();
diff --git a/lldb/include/lldb/Symbol/SaveCoreOptions.h b/lldb/include/lldb/Symbol/SaveCoreOptions.h
index da66b184745db..92e474c1131a1 100644
--- a/lldb/include/lldb/Symbol/SaveCoreOptions.h
+++ b/lldb/include/lldb/Symbol/SaveCoreOptions.h
@@ -9,6 +9,7 @@
#ifndef LLDB_SOURCE_PLUGINS_OBJECTFILE_SaveCoreOPTIONS_H
#define LLDB_SOURCE_PLUGINS_OBJECTFILE_SaveCoreOPTIONS_H
+#include "lldb/Target/CoreFileMemoryRanges.h"
#include "lldb/Target/ThreadCollection.h"
#include "lldb/Utility/FileSpec.h"
#include "lldb/Utility/RangeMap.h"
@@ -23,7 +24,7 @@ namespace lldb_private {
class SaveCoreOptions {
public:
- SaveCoreOptions(){};
+ SaveCoreOptions() = default;
~SaveCoreOptions() = default;
lldb_private::Status SetPluginName(const char *name);
@@ -47,10 +48,15 @@ class SaveCoreOptions {
void AddMemoryRegionToSave(const lldb_private::MemoryRegionInfo ®ion);
+ llvm::Expected<lldb_private::CoreFileMemoryRanges> GetMemoryRegionsToSave();
lldb_private::ThreadCollection::collection GetThreadsToSave() const;
llvm::Expected<uint64_t> GetCurrentSizeInBytes();
+ void AddFlag(const char *flag);
+
+ bool ContainsFlag(const char *flag) const;
+
void Clear();
private:
@@ -59,6 +65,7 @@ class SaveCoreOptions {
std::optional<std::string> m_plugin_name;
std::optional<lldb_private::FileSpec> m_file;
std::optional<lldb::SaveCoreStyle> m_style;
+ std::optional<std::unordered_set<std::string>> m_flags;
lldb::ProcessSP m_process_sp;
std::unordered_set<lldb::tid_t> m_threads_to_save;
MemoryRanges m_regions_to_save;
diff --git a/lldb/source/API/SBSaveCoreOptions.cpp b/lldb/source/API/SBSaveCoreOptions.cpp
index 15584abaac013..0d618bc1916b9 100644
--- a/lldb/source/API/SBSaveCoreOptions.cpp
+++ b/lldb/source/API/SBSaveCoreOptions.cpp
@@ -128,6 +128,32 @@ uint64_t SBSaveCoreOptions::GetCurrentSizeInBytes(SBError &error) {
return *expected_bytes;
}
+lldb::SBMemoryRegionInfoList SBSaveCoreOptions::GetMemoryRegionsToSave() {
+ LLDB_INSTRUMENT_VA(this);
+ llvm::Expected<lldb_private::CoreFileMemoryRanges> memory_ranges =
+ m_opaque_up->GetMemoryRegionsToSave();
+ if (!memory_ranges) {
+ llvm::consumeError(memory_ranges.takeError());
+ return SBMemoryRegionInfoList();
+ }
+
+ SBMemoryRegionInfoList m_memory_region_infos;
+ for (const auto &range : *memory_ranges) {
+ SBMemoryRegionInfo region_info(
+ nullptr, range.GetRangeBase(), range.GetRangeEnd(),
+ range.data.lldb_permissions, /*mapped=*/true);
+ m_memory_region_infos.Append(region_info);
+ }
+
+ return m_memory_region_infos;
+}
+
+void SBSaveCoreOptions::AddFlag(const char *flag) {
+ LLDB_INSTRUMENT_VA(this, flag);
+
+ m_opaque_up->AddFlag(flag);
+}
+
lldb_private::SaveCoreOptions &SBSaveCoreOptions::ref() const {
return *m_opaque_up;
}
diff --git a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
index 806f256d9da48..8d8d04d1b1aba 100644
--- a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
+++ b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
@@ -831,6 +831,13 @@ Status MinidumpFileBuilder::AddLinuxFileStreams() {
Status MinidumpFileBuilder::AddMemoryList() {
Status error;
+ // Note this is here for testing. In the past there have been many occasions
+ // where the 64b code has regressed. Because it's wasteful and expensive to
+ // write 4.2gb+ on every CI run, to get around this and to exercise this
+ // codepath we define a flag in the options object.
+ bool force_64b_for_non_threads =
+ m_save_core_options.ContainsFlag(FORCE_64B_FLAG);
+
// We first save the thread stacks to ensure they fit in the first UINT32_MAX
// bytes of the core file. Thread structures in minidump files can only use
// 32 bit memory descriptors, so we emit them first to ensure the memory is
@@ -890,7 +897,7 @@ Status MinidumpFileBuilder::AddMemoryList() {
const addr_t range_size = core_range.range.size();
// We don't need to check for stacks here because we already removed them
// from all_core_memory_ranges.
- if (total_size + range_size < UINT32_MAX) {
+ if (!force_64b_for_non_threads && total_size + range_size < UINT32_MAX) {
ranges_32.push_back(core_range);
total_size += range_size;
} else {
@@ -977,6 +984,7 @@ Status MinidumpFileBuilder::ReadWriteMemoryInChunks(
const lldb::addr_t addr = range.range.start();
const lldb::addr_t size = range.range.size();
Log *log = GetLog(LLDBLog::Object);
+ uint64_t total_bytes_read = 0;
Status addDataError;
Process::ReadMemoryChunkCallback callback =
[&](Status &error, lldb::addr_t current_addr, const void *buf,
@@ -984,7 +992,7 @@ Status MinidumpFileBuilder::ReadWriteMemoryInChunks(
if (error.Fail() || bytes_read == 0) {
LLDB_LOGF(log,
"Failed to read memory region at: 0x%" PRIx64
- ". Bytes read: %" PRIx64 ", error: %s",
+ ". Bytes read: 0x%" PRIx64 ", error: %s",
current_addr, bytes_read, error.AsCString());
// If we failed in a memory read, we would normally want to skip
@@ -997,6 +1005,13 @@ Status MinidumpFileBuilder::ReadWriteMemoryInChunks(
return lldb_private::IterationAction::Stop;
}
+ if (current_addr != addr + total_bytes_read) {
+ LLDB_LOGF(log,
+                  "Current addr is not at the expected address, 0x%" PRIx64
+ ", expected at 0x%" PRIx64,
+ current_addr, addr + total_bytes_read);
+ }
+
// Write to the minidump file with the chunk potentially flushing to
// disk.
// This error will be captured by the outer scope and is considered fatal.
@@ -1006,13 +1021,13 @@ Status MinidumpFileBuilder::ReadWriteMemoryInChunks(
if (addDataError.Fail())
return lldb_private::IterationAction::Stop;
+ total_bytes_read += bytes_read;
// If we have a partial read, report it, but only if the partial read
// didn't finish reading the entire region.
- if (bytes_read != data_buffer.GetByteSize() &&
- current_addr + bytes_read != size) {
+ if (bytes_read != data_buffer.GetByteSize() && total_bytes_read != size) {
LLDB_LOGF(log,
- "Memory region at: %" PRIx64 " partiall read 0x%" PRIx64
- " bytes out of %" PRIx64 " bytes.",
+ "Memory region at: 0x%" PRIx64 " partial read 0x%" PRIx64
+ " bytes out of 0x%" PRIx64 " bytes.",
current_addr, bytes_read,
data_buffer.GetByteSize() - bytes_read);
@@ -1059,7 +1074,7 @@ MinidumpFileBuilder::AddMemoryList_32(std::vector<CoreFileMemoryRange> &ranges,
LLDB_LOGF(log,
"AddMemoryList %zu/%zu reading memory for region "
- "(%" PRIx64 " bytes) [%" PRIx64 ", %" PRIx64 ")",
+ "(0x%" PRIx64 " bytes) [0x%" PRIx64 ", 0x%" PRIx64 ")",
region_index, ranges.size(), size, addr, addr + size);
++region_index;
@@ -1130,9 +1145,9 @@ MinidumpFileBuilder::AddMemoryList_64(std::vector<CoreFileMemoryRange> &ranges,
// Capture the starting offset for all the descriptors so we can clean them up
// if needed.
offset_t starting_offset =
- GetCurrentDataEndOffset() + sizeof(llvm::support::ulittle64_t);
+ GetCurrentDataEndOffset() + sizeof(llvm::minidump::Memory64ListHeader);
// The base_rva needs to start after the directories, which is right after
- // this 8 byte variable.
+ // the descriptors + the size of the header.
offset_t base_rva =
starting_offset +
(ranges.size() * sizeof(llvm::minidump::MemoryDescriptor_64));
diff --git a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h
index 46b20f90138fe..cc2520d7f0b16 100644
--- a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h
+++ b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h
@@ -176,6 +176,8 @@ class MinidumpFileBuilder {
static constexpr size_t HEADER_SIZE = sizeof(llvm::minidump::Header);
static constexpr size_t DIRECTORY_SIZE = sizeof(llvm::minidump::Directory);
+ static constexpr const char FORCE_64B_FLAG[] = "force_64b";
+
// More that one place can mention the register thread context locations,
// so when we emit the thread contents, remember where it is so we don't have
// to duplicate it in the exception data.
diff --git a/lldb/source/Symbol/SaveCoreOptions.cpp b/lldb/source/Symbol/SaveCoreOptions.cpp
index f93b58f59cf96..6b6387f821590 100644
--- a/lldb/source/Symbol/SaveCoreOptions.cpp
+++ b/lldb/source/Symbol/SaveCoreOptions.cpp
@@ -155,6 +155,24 @@ SaveCoreOptions::GetThreadsToSave() const {
return thread_collection;
}
+llvm::Expected<lldb_private::CoreFileMemoryRanges>
+SaveCoreOptions::GetMemoryRegionsToSave() {
+ Status error;
+ if (!m_process_sp)
+ return Status::FromErrorString("Requires a process to be set.").takeError();
+
+ error = EnsureValidConfiguration(m_process_sp);
+ if (error.Fail())
+ return error.takeError();
+
+ CoreFileMemoryRanges ranges;
+ error = m_process_sp->CalculateCoreFileSaveRanges(*this, ranges);
+ if (error.Fail())
+ return error.takeError();
+
+ return ranges;
+}
+
llvm::Expected<uint64_t> SaveCoreOptions::GetCurrentSizeInBytes() {
Status error;
if (!m_process_sp)
@@ -169,8 +187,14 @@ llvm::Expected<uint64_t> SaveCoreOptions::GetCurrentSizeInBytes() {
if (error.Fail())
return error.takeError();
+ llvm::Expected<lldb_private::CoreFileMemoryRanges> core_file_ranges_maybe =
+ GetMemoryRegionsToSave();
+ if (!core_file_ranges_maybe)
+ return core_file_ranges_maybe.takeError();
+ const lldb_private::CoreFileMemoryRanges &core_file_ranges =
+ *core_file_ranges_maybe;
uint64_t total_in_bytes = 0;
- for (auto &core_range : ranges)
+ for (auto &core_range : core_file_ranges)
total_in_bytes += core_range.data.range.size();
return total_in_bytes;
@@ -190,3 +214,17 @@ void SaveCoreOptions::Clear() {
m_process_sp.reset();
m_regions_to_save.Clear();
}
+
+void SaveCoreOptions::AddFlag(const char *flag) {
+ if (!flag || !flag[0])
+ return;
+
+ if (!m_flags)
+ m_flags = std::unordered_set<std::string>();
+
+ m_flags->emplace(std::string(flag));
+}
+
+bool SaveCoreOptions::ContainsFlag(const char *flag) const {
+ return m_flags && m_flags->find(flag) != m_flags->end();
+}
diff --git a/lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump64b.py b/lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump64b.py
new file mode 100644
index 0000000000000..b1ce13a047439
--- /dev/null
+++ b/lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump64b.py
@@ -0,0 +1,100 @@
+"""
+Test saving a minidumps with the force 64b flag, and evaluate that every
+saved memory region is byte-wise 1:1 with the live process.
+"""
+
+import os
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+# Constant from MinidumpFileBuilder.h, this forces 64b for non threads
+FORCE_64B = "force_64b"
+
+
+class ProcessSaveCoreMinidump64bTestCase(TestBase):
+
+ def verify_minidump(
+ self,
+ core_proc,
+ live_proc,
+ options,
+ ):
+ """Verify that the minidump is the same byte for byte as the live process."""
+ # Get the memory regions we saved off in this core, we can't compare to the core
+ # because we pull from /proc/pid/maps, so even ranges that don't get mapped in will show up
+ # as ranges in the minidump.
+ #
+ # Instead, we have an API that returns to us the number of regions we planned to save from the live process
+ # and we compare those
+ memory_regions_to_compare = options.GetMemoryRegionsToSave()
+
+ for region in memory_regions_to_compare:
+ start_addr = region.GetRegionBase()
+ end_addr = region.GetRegionEnd()
+ actual_process_read_error = lldb.SBError()
+ actual = live_proc.ReadMemory(
+ start_addr, end_addr - start_addr, actual_process_read_error
+ )
+ expected_process_read_error = lldb.SBError()
+ expected = core_proc.ReadMemory(
+ start_addr, end_addr - start_addr, expected_process_read_error
+ )
+
+ # Both processes could fail to read a given memory region, so if they both pass
+ # compare, then we'll fail them if the core differs from the live process.
+ if (
+ actual_process_read_error.Success()
+ and expected_process_read_error.Success()
+ ):
+ self.assertEqual(
+ actual, expected, "Bytes differ between live process and core"
+ )
+
+ # Now we check if the error is the same, error isn't abnormal but they should fail for the same reason
+ self.assertTrue(
+ (
+ actual_process_read_error.Success()
+ and expected_process_read_error.Success()
+ )
+ or (
+ actual_process_read_error.Fail()
+ and expected_process_read_error.Fail()
+ )
+ )
+
+ @skipUnlessArch("x86_64")
+ @skipUnlessPlatform(["linux"])
+ def test_minidump_save_style_full(self):
+ """Test that a full minidump is the same byte for byte."""
+
+ self.build()
+ exe = self.getBuildArtifact("a.out")
+ minidump_path = self.getBuildArtifact("minidump_full_force64b.dmp")
+
+ try:
+ target = self.dbg.CreateTarget(exe)
+ live_process = target.LaunchSimple(
+ None, None, self.get_process_working_directory()
+ )
+ self.assertState(live_process.GetState(), lldb.eStateStopped)
+ options = lldb.SBSaveCoreOptions()
+
+ options.SetOutputFile(lldb.SBFileSpec(minidump_path))
+ options.SetStyle(lldb.eSaveCoreFull)
+ options.SetPluginName("minidump")
+ options.SetProcess(live_process)
+ options.AddFlag(FORCE_64B)
+
+ error = live_process.SaveCore(options)
+ self.assertTrue(error.Success(), error.GetCString())
+
+ target = self.dbg.CreateTarget(None)
+ core_proc = target.LoadCore(minidump_path)
+
+ self.verify_minidump(core_proc, live_process, options)
+ finally:
+ self.assertTrue(self.dbg.DeleteTarget(target))
+ if os.path.isfile(minidump_path):
+ os.unlink(minidump_path)
diff --git a/lldb/test/API/python_api/sbsavecoreoptions/TestSBSaveCoreOptions.py b/lldb/test/API/python_api/sbsavecoreoptions/TestSBSaveCoreOptions.py
index 31e35e0285f17..92ca44ecbbffc 100644
--- a/lldb/test/API/python_api/sbsavecoreoptions/TestSBSaveCoreOptions.py
+++ b/lldb/test/API/python_api/sbsavecoreoptions/TestSBSaveCoreOptions.py
@@ -164,3 +164,46 @@ def test_get_total_in_bytes_missing_requirements(self):
options.SetStyle(lldb.eSaveCoreCustomOnly)
total = options.GetCurrentSizeInBytes(error)
self.assertTrue(error.Fail(), error.GetCString())
+
+ def test_get_memory_regions_to_save(self):
+ """
+ Tests the matrix of responses for GetMemoryRegionsToSave
+ """
+
+ options = lldb.SBSaveCoreOptions()
+
+ # Not specifying plugin or process should return an empty list.
+ memory_list = options.GetMemoryRegionsToSave()
+ self.assertEqual(0, memory_list.GetSize())
+
+ # No style returns an empty list
+ process = self.get_basic_process()
+ options.SetProcess(process)
+ memory_list = options.GetMemoryRegionsToSave()
+ self.assertEqual(0, memory_list.GetSize())
+ options.Clear()
+
+ # No Process returns an empty list
+ options.SetStyle(lldb.eSaveCoreCustomOnly)
+ memory_list = options.GetMemoryRegionsToSave()
+ self.assertEqual(0, memory_list.GetSize())
+ options.Clear()
+
+ # Validate we get back the single region we populate
+ options.SetStyle(lldb.eSaveCoreCustomOnly)
+ process = self.get_basic_process()
+ options.SetProcess(process)
+ memory_range = lldb.SBMemoryRegionInfo()
+
+ # Add the memory range of 0x1000-0x1100
+ process.GetMemoryRegionInfo(0x1000, memory_range)
+ options.AddMemoryRegionToSave(memory_range)
+ memory_list = options.GetMemoryRegionsToSave()
+ self.assertEqual(1, memory_list.GetSize())
+ read_region = lldb.SBMemoryRegionInfo()
+ memory_list.GetMemoryRegionAtIndex(0, read_region)
+
+ # Permissions from Process getLLDBRegion aren't matching up with
+ # the live process permissions, so we're just checking the range for now.
+ self.assertEqual(mem...
[truncated]
✅ With the latest revision this PR passed the Python code formatter.
Can you please split the PR into two? One for improving testing/diagnosability, the other for the real fix?
@jeffreytan81 Sure, but I'll have to revisit after I return from vacation.
Alternative proposal, and one @clayborg and I talked about for a while: why don't we just put all the non-stacks into Memory64List? Originally I didn't want to change this out of not being familiar with how everything works, but now that we have a 64b list I think it makes sense to fill mem32 with everything that can only support 32b offsets, then fill the 64b list. This removes the need to pass flags (which is not a design decision I like, but I didn't see many alternatives). @labath any opposition to putting everything in mem64?
I don't have an opinion on that. I'm not currently involved in minidumps. What I am not excited about is the idea of creating a stringly-typed, forever-stable API for the purpose of testing a fix like this. As for alternatives, an "obvious" one is a unit test. If you manage (I think it should be possible) to instantiate the minidump writer with a mock process, then you can call any API you want, such as one that writes a 64-bit region (but still pass it a small memory block).
Remove flags from all APIs. Enable 64 bit for any non stack memory regions to allow us to test the 64 bit memory regions all of the time.
Make the process subclasses that save core files use the `llvm::Expected<lldb_private::CoreFileMemoryRanges> GetMemoryRegionsToSave()` from `SaveCoreOptions`. This will ensure everyone is using the same list.
I made MinidumpFileBuilder use this API. I will make a follow up patch for Mach-O, because this patch is already quite large and I would prefer landing Mac changes separately.
nit: Your link to minidump.py in the comment does not link to a python file, but just the scripts repo and I don't see that script in the repo.
@@ -45,6 +45,10 @@ Note that currently ELF Core files are not supported."
 Resetting will result in the reset of all process specific options, such as Threads to save."
 ) lldb::SBSaveCoreOptions::SetProcess;

+%feature("docstring", "
+ Get the process to save. If undefined, an invalid SBProcess will be returned."
What would make it be undefined? If it's obvious for those in the know then it's fine, but might be useful to expand the doc a little.
edit: Or maybe just copy the doc comment below which reads a bit better to me.
@@ -1130,9 +1122,9 @@ MinidumpFileBuilder::AddMemoryList_64(std::vector<CoreFileMemoryRange> &ranges,
 // Capture the starting offset for all the descriptors so we can clean them up
 // if needed.
 offset_t starting_offset =
-    GetCurrentDataEndOffset() + sizeof(llvm::support::ulittle64_t);
+    GetCurrentDataEndOffset() + sizeof(llvm::minidump::Memory64ListHeader);
So this is the actual bug fix right? I wonder if it would be better to put this as a separate commit. It maybe is hard to test without the other changes, but it would at least clearly separate the bug fix from the change in behavior.
Makes it safer in case we need to revert the change to 64b by default that we would not also revert the bug fix.
I'm torn on this, as yes it can be checked in independently but I think without the testing changes it continues the trend of 'fixes with no tests'.
)

# Now we check if the error is the same, error isn't abnormal but they should fail for the same reason
self.assertTrue(
Can we simplify this to
self.assertEqual(
    actual_process_read_error.Success(),
    expected_process_read_error.Success(),
    f"Address range {hex(start_addr)} - {hex(end_addr)} mismatch in read results success between live process and core. Live: {actual_process_read_error.Success()}, Core: {expected_process_read_error.Success()}",
)
I don't think so, because we can fail to read a region in both the live process and the expected, and we shouldn't fail the test in that case.
we can fail to read a region in both the live process and the expected, and we shouldn't fail the test in that case

But doesn't the check of actual_process_read_error.Success() == expected_process_read_error.Success() still work in that case, since they will both be False?
@skipUnlessArch("x86_64")
@skipUnlessPlatform(["linux"])
def test_minidump_save_style_full(self):
Are these tests exactly the same except for the minidump options? Might be worth factoring out the common checks and passing in the options to use from each test.
Done!
read_region = lldb.SBMemoryRegionInfo()
memory_list.GetMemoryRegionAtIndex(0, read_region)

# Permissions from Process getLLDBRegion aren't matching up with
Is this a bug that should be fixed?
I think so, I'm not sure if this is failing because of my own mental misunderstanding or a bug. I need to investigate further.
LGTM
lldb/source/Core/PluginManager.cpp (Outdated)

if (!options.GetProcess())
  options.SetProcess(process_sp);
We should be doing this before calling SaveCore, as what if this process doesn't match the process that is on the options? So have the code that calls this fill in the process ahead of time in the `lldb_private::SaveCoreOptions` instance, always extract the process from the `lldb_private::SaveCoreOptions` instance, and don't pass a process into this function. The code that calls this will need to ensure the process instances match.
lldb/source/Core/PluginManager.cpp (Outdated)

// Make sure the process sp is the same as the one we are using.
if (options.GetProcess() != process_sp) {
  error = Status::FromErrorString(
      "Save Core Options configured for a different process.");
  return error;
}
move this to the code that calls it.
✅ With the latest revision this PR passed the C/C++ code formatter.
…t true, and fix where we write the base RVA 8 bytes earlier than it starts
…the process in PluginManager
Context
Over a year ago, I landed support for 64b Memory ranges in Minidump (#95312). In this patch we added the Memory64 list stream, which is effectively a Linked List on disk. The layout is a sixteen byte header and then however many Memory descriptors.
The Bug
This is a classic off-by-one error, where I added 8 bytes instead of 16 for the header. This caused the first region to start 8 bytes before the correct RVA, thus shifting all memory reads by 8 bytes. We were writing all the regions to disk correctly, with no physical corruption, but the RVA was defined wrong, meaning we were incorrectly reading memory.
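To make the arithmetic concrete, here is a small standalone sketch (not LLDB code) of how the Memory64 list's BaseRva is derived. In the minidump format, the Memory64 list header holds two uint64 fields (NumberOfMemoryRanges and BaseRva, 16 bytes total), followed by 16-byte MemoryDescriptor_64 entries; the region bytes begin right after the descriptors. Sizing the header as 8 bytes instead of 16 lands BaseRva 8 bytes early:

```python
# Layout sizes from the minidump format.
HEADER_SIZE = 16      # sizeof(Memory64ListHeader): NumberOfMemoryRanges + BaseRva
DESCRIPTOR_SIZE = 16  # sizeof(MemoryDescriptor_64): StartOfMemoryRange + DataSize

def base_rva(stream_start: int, num_ranges: int, header_size: int) -> int:
    """RVA where the first region's bytes begin: after the header and
    all of the descriptors."""
    return stream_start + header_size + num_ranges * DESCRIPTOR_SIZE

stream_start, num_ranges = 0x1000, 3
fixed = base_rva(stream_start, num_ranges, HEADER_SIZE)
buggy = base_rva(stream_start, num_ranges, 8)  # the bug: 8-byte header
print(hex(fixed))  # 0x1040
print(hex(buggy))  # 0x1038 -> all reads start 8 bytes early
```

Because only the starting RVA is wrong, every region's bytes are intact on disk; the read pointer is simply offset by 8.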
Why wasn't this caught?
One problem we've had is forcing Minidump to actually use the 64b mode; it would be a massive waste of resources to have a test that actually wrote >4.2gb of IO to validate the 64b regions, and so almost all validation has been manual. As a weakness of manual testing, this issue is pseudo non-deterministic, as which regions end up in 64b or 32b is handled greedily and iterated in the order laid out in /proc/pid/maps. We often validated 64b was written correctly by hexdumping the Minidump itself, which was not corrupted (other than the BaseRVA).
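The greedy 32b/64b split described above can be sketched as follows (illustrative only; the names are hypothetical, not the builder's actual code): regions are visited in map order and stay in the 32b list while the running total would remain under UINT32_MAX, and overflow into the 64b list otherwise.

```python
UINT32_MAX = 0xFFFFFFFF

def split_regions(region_sizes):
    """Greedy split: a region joins the 32b list only if the running
    total stays under UINT32_MAX; otherwise it goes to the 64b list."""
    ranges_32, ranges_64, total = [], [], 0
    for size in region_sizes:
        if total + size < UINT32_MAX:
            ranges_32.append(size)
            total += size
        else:
            ranges_64.append(size)
    return ranges_32, ranges_64

# Three 2 GiB regions: the first fits in 32b, the rest overflow into 64b.
r32, r64 = split_regions([2 << 30, 2 << 30, 2 << 30])
print(len(r32), len(r64))  # 1 2
```

This is why the bug was hard to reproduce: which region becomes the *first* 64b entry depends entirely on the sizes and map order of the live process.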
Why is this showing up now?
During internal usage, we had a bug report that the Minidump wasn't displaying values. I was unable to repro the issue, but during my investigation I saw the variables were in the 64b regions which resulted in me identifying the bug.
How do we prevent future regressions?
To prevent regressions, and honestly to save my sanity in figuring out where 8 bytes magically came from, I've added a new API to SBSaveCoreOptions.
SBSaveCoreOptions::GetMemoryRegionsToSave()
The ability to get the memory regions that we intend to include in the Coredump. I added this so we can compare what we intended to include versus what was actually included. Traditionally we've always had issues comparing regions because Minidump includes /proc/pid/maps, and it can be difficult to know which memory region read failure was a genuine error or just a page that wasn't meant to be included. We are also leveraging this API to choose the memory regions to be generated, as well as for testing which regions should be bytewise 1:1.
After much debate with @clayborg, I've moved all non-stack memory to the Memory64 list. This list doesn't incur any meaningful overhead, and Greg originally suggested doing this in the original 64b PR. This also means we're exercising the 64b path every single time we save a Minidump, preventing regressions on this feature from slipping through testing in the future.
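The resulting partition is simpler than the greedy one: stacks stay in the 32b list (thread structures in minidumps can only hold 32-bit memory descriptors), and everything else goes to Memory64. A sketch of the policy (illustrative; the names are hypothetical, not the builder's actual code):

```python
def partition_ranges(ranges):
    """Thread stacks must stay in the 32b MemoryList so 32-bit thread
    descriptors can reference them; every other region now always goes
    to the Memory64 list, exercising the 64b path on every save."""
    ranges_32 = [r for r in ranges if r["is_stack"]]
    ranges_64 = [r for r in ranges if not r["is_stack"]]
    return ranges_32, ranges_64

ranges = [
    {"name": "stack", "is_stack": True},
    {"name": "heap", "is_stack": False},
    {"name": "code", "is_stack": False},
]
r32, r64 = partition_ranges(ranges)
print([r["name"] for r in r32], [r["name"] for r in r64])  # ['stack'] ['heap', 'code']
```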
Snippet produced by minidump.py
Misc
As a part of this fix I had to look at LLDB logs a lot; you'll notice I added 0x to many of the PRIx64 formats in LLDB_LOGF. This is so the user (or I) can directly copy-paste the address from the logs instead of adding the hex prefix themselves. Added some SBSaveCore tests for the new GetMemoryRegionsToSave API, and docstrings.
CC: @DavidSpickett, @da-viper, @labath because we've been working together on save-core plugins; review is optional and I didn't tag you, but figured you'd want to know.