[LLDB][Minidump]Update MinidumpFileBuilder to read and write in chunks #129307

Merged

Jlalond merged 11 commits into llvm:main on Apr 8, 2025

Conversation

@Jlalond Jlalond commented Feb 28, 2025

I recently received an internal error report that LLDB was OOM'ing when creating a Minidump. In my 64b refactor we decided to acquire buffers the size of the largest memory region so we could read all of its contents in one call. This made error handling very simple (and the coding simpler for me!), but it had the trade-off of large allocations when huge pages were enabled.

This patch is one I've had on the back burner for a while: we can read and write the Minidump memory sections in discrete chunks, which we already do when flushing to disk.

I had to refactor the error handling a bit, but the behavior remains the same: we make a best-effort attempt to read as much of the memory region as possible, but fail immediately if we receive an error writing to disk. I did not add new tests for this because our existing test suite is quite good, but I did manually verify that a few Minidumps couldn't read beyond the red_zone.

(lldb) reg read $sp
     rsp = 0x00007fffffffc3b0
(lldb) p/x 0x00007fffffffc3b0 - 128
(long) 0x00007fffffffc330
(lldb) memory read 0x00007fffffffc330
0x7fffffffc330: 60 c3 ff ff ff 7f 00 00 60 cd ff ff ff 7f 00 00  `.......`.......
0x7fffffffc340: 60 c3 ff ff ff 7f 00 00 65 e6 26 00 00 00 00 00  `.......e.&.....
(lldb) memory read 0x00007fffffffc329
error: could not parse memory info (Success!)

I'm not sure how to quantify the memory improvement other than to note that we previously allocated a buffer the size of the largest region regardless of how much we actually read. So a 2 GB unreadable region would cause a 2 GB allocation even if we were only reading 4096 KB. Now we allocate the smaller of the range size and the 128 MB max chunk size.
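
As a rough, illustrative sketch of the sizing change (the 128 MB value matches the MAX_WRITE_CHUNK_SIZE constant in the patch; the helper names here are hypothetical, not code from the PR):

#include <algorithm>
#include <cstdint>

constexpr uint64_t kMaxWriteChunkSize = 128ull * 1024 * 1024; // 128 MB

// Before: one buffer sized to the largest region in the core file.
uint64_t OldAllocationSize(uint64_t largest_region_size) {
  return largest_region_size; // e.g. 2 GB for a 2 GB region
}

// After: at most one chunk's worth per allocation.
uint64_t NewAllocationSize(uint64_t region_size) {
  return std::min(region_size, kMaxWriteChunkSize); // e.g. min(2 GB, 128 MB)
}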

llvmbot commented Feb 28, 2025

@llvm/pr-subscribers-lldb

Author: Jacob Lalonde (Jlalond)

Full diff: https://github.com/llvm/llvm-project/pull/129307.diff

2 Files Affected:

  • (modified) lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp (+90-40)
  • (modified) lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h (+7)
diff --git a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
index c5013ea5e3be4..e88b606f279cd 100644
--- a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
+++ b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.cpp
@@ -969,12 +969,82 @@ Status MinidumpFileBuilder::DumpDirectories() const {
   return error;
 }
 
-static uint64_t
-GetLargestRangeSize(const std::vector<CoreFileMemoryRange> &ranges) {
-  uint64_t max_size = 0;
-  for (const auto &core_range : ranges)
-    max_size = std::max(max_size, core_range.range.size());
-  return max_size;
+Status MinidumpFileBuilder::ReadWriteMemoryInChunks(
+    const lldb_private::CoreFileMemoryRange &range, uint64_t *bytes_read) {
+  Log *log = GetLog(LLDBLog::Object);
+  lldb::addr_t addr = range.range.start();
+  lldb::addr_t size = range.range.size();
+  // First we set the byte tally to 0, so if we do exit gracefully
+  // the caller doesn't think the random garbage on the stack is a
+  // success.
+  if (bytes_read)
+    *bytes_read = 0;
+
+  uint64_t bytes_remaining = size;
+  uint64_t total_bytes_read = 0;
+  auto data_up = std::make_unique<DataBufferHeap>(
+      std::min(bytes_remaining, MAX_WRITE_CHUNK_SIZE), 0);
+  Status error;
+  while (bytes_remaining > 0) {
+    // Get the next read chunk size as the minimum of the remaining bytes and
+    // the write chunk max size.
+    const size_t bytes_to_read =
+        std::min(bytes_remaining, MAX_WRITE_CHUNK_SIZE);
+    const size_t bytes_read_for_chunk =
+        m_process_sp->ReadMemory(range.range.start() + total_bytes_read,
+                                 data_up->GetBytes(), bytes_to_read, error);
+    if (error.Fail() || bytes_read_for_chunk == 0) {
+      LLDB_LOGF(log,
+                "Failed to read memory region at: %" PRIx64
+                ". Bytes read: %zu, error: %s",
+                addr, bytes_read_for_chunk, error.AsCString());
+      // If we haven't read any bytes yet, we can just give up and return
+      if (total_bytes_read == 0)
+        return Status();
+
+      // If we've read some bytes, we stop trying to read more and return
+      // this best effort attempt
+      bytes_remaining = 0;
+    } else if (bytes_read_for_chunk != bytes_to_read) {
+      LLDB_LOGF(
+          log, "Memory region at: %" PRIx64 " failed to read %" PRIx64 " bytes",
+          addr, size);
+
+      // If we've read some bytes, we stop trying to read more and return
+      // this best effort attempt
+      bytes_remaining = 0;
+    }
+
+    // Write the chunk to the minidump file, potentially flushing to disk.
+    // This is the only place we want to return a true error, so that we can
+    // fail. If we get an error writing to disk we can't easily guarantee
+    // that we won't corrupt the minidump.
+    error = AddData(data_up->GetBytes(), bytes_read_for_chunk);
+    if (error.Fail())
+      return error;
+
+    // This check is so we don't underflow when the error path above sets
+    // bytes_remaining to 0 (the graceful exit condition).
+    if (bytes_remaining > 0)
+      bytes_remaining -= bytes_read_for_chunk;
+
+    total_bytes_read += bytes_read_for_chunk;
+    // If the caller wants a tally back of the bytes_read, update it as we
+    // write. We do this in the loop so if we encounter an error we can
+    // report the accurate total.
+    if (bytes_read)
+      *bytes_read += bytes_read_for_chunk;
+
+    // We clear the buffer on each iteration, without checking whether we
+    // read the expected number of bytes; this is so we never allocate
+    // more than MAX_WRITE_CHUNK_SIZE. But we do check whether clearing
+    // is even worthwhile before we return and destruct the buffer.
+    if (bytes_remaining > 0)
+      data_up->Clear();
+  }
+
+  return error;
 }
 
 Status
@@ -987,8 +1057,6 @@ MinidumpFileBuilder::AddMemoryList_32(std::vector<CoreFileMemoryRange> &ranges,
 
   Log *log = GetLog(LLDBLog::Object);
   size_t region_index = 0;
-  auto data_up =
-      std::make_unique<DataBufferHeap>(GetLargestRangeSize(ranges), 0);
   for (const auto &core_range : ranges) {
     // Take the offset before we write.
     const offset_t offset_for_data = GetCurrentDataEndOffset();
@@ -1003,18 +1071,15 @@ MinidumpFileBuilder::AddMemoryList_32(std::vector<CoreFileMemoryRange> &ranges,
     ++region_index;
 
     progress.Increment(1, "Adding Memory Range " + core_range.Dump());
-    const size_t bytes_read =
-        m_process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
-    if (error.Fail() || bytes_read == 0) {
-      LLDB_LOGF(log, "Failed to read memory region. Bytes read: %zu, error: %s",
-                bytes_read, error.AsCString());
-      // Just skip sections with errors or zero bytes in 32b mode
+    uint64_t bytes_read;
+    error = ReadWriteMemoryInChunks(core_range, &bytes_read);
+    if (error.Fail())
+      return error;
+
+    // If we completely failed to read this range,
+    // we can just omit all of the bookkeeping.
+    if (bytes_read == 0)
       continue;
-    } else if (bytes_read != size) {
-      LLDB_LOGF(
-          log, "Memory region at: %" PRIx64 " failed to read %" PRIx64 " bytes",
-          addr, size);
-    }
 
     MemoryDescriptor descriptor;
     descriptor.StartOfMemoryRange =
@@ -1026,11 +1091,6 @@ MinidumpFileBuilder::AddMemoryList_32(std::vector<CoreFileMemoryRange> &ranges,
     descriptors.push_back(descriptor);
     if (m_thread_by_range_end.count(end) > 0)
       m_thread_by_range_end[end].Stack = descriptor;
-
-    // Add the data to the buffer, flush as needed.
-    error = AddData(data_up->GetBytes(), bytes_read);
-    if (error.Fail())
-      return error;
   }
 
   // Add a directory that references this list
@@ -1106,8 +1166,6 @@ MinidumpFileBuilder::AddMemoryList_64(std::vector<CoreFileMemoryRange> &ranges,
 
   Log *log = GetLog(LLDBLog::Object);
   size_t region_index = 0;
-  auto data_up =
-      std::make_unique<DataBufferHeap>(GetLargestRangeSize(ranges), 0);
   for (const auto &core_range : ranges) {
     const addr_t addr = core_range.range.start();
     const addr_t size = core_range.range.size();
@@ -1120,27 +1178,19 @@ MinidumpFileBuilder::AddMemoryList_64(std::vector<CoreFileMemoryRange> &ranges,
     ++region_index;
 
     progress.Increment(1, "Adding Memory Range " + core_range.Dump());
-    const size_t bytes_read =
-        m_process_sp->ReadMemory(addr, data_up->GetBytes(), size, error);
-    if (error.Fail()) {
-      LLDB_LOGF(log, "Failed to read memory region. Bytes read: %zu, error: %s",
-                bytes_read, error.AsCString());
-      error.Clear();
+    uint64_t bytes_read;
+    error = ReadWriteMemoryInChunks(core_range, &bytes_read);
+    if (error.Fail())
+      return error;
+
+    if (bytes_read == 0) {
       cleanup_required = true;
       descriptors[region_index].DataSize = 0;
     }
     if (bytes_read != size) {
-      LLDB_LOGF(
-          log, "Memory region at: %" PRIx64 " failed to read %" PRIx64 " bytes",
-          addr, size);
       cleanup_required = true;
       descriptors[region_index].DataSize = bytes_read;
     }
-
-    // Add the data to the buffer, flush as needed.
-    error = AddData(data_up->GetBytes(), bytes_read);
-    if (error.Fail())
-      return error;
   }
 
   // Early return if there is no cleanup needed.
diff --git a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h
index 48293ee1bf5e5..b9ebb0d9a28b7 100644
--- a/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h
+++ b/lldb/source/Plugins/ObjectFile/Minidump/MinidumpFileBuilder.h
@@ -142,6 +142,13 @@ class MinidumpFileBuilder {
   lldb_private::Status AddDirectory(llvm::minidump::StreamType type,
                                     uint64_t stream_size);
   lldb::offset_t GetCurrentDataEndOffset() const;
+
+  // Read a memory region from the process and write it to the file
+  // in fixed size chunks.
+  lldb_private::Status
+  ReadWriteMemoryInChunks(const lldb_private::CoreFileMemoryRange &range,
+                          uint64_t *bytes_read);
+
   // Stores directories to fill in later
   std::vector<llvm::minidump::Directory> m_directories;
   // When we write off the threads for the first time, we need to clean them up

@Jlalond Jlalond changed the title Update MinidumpFileBuilder to read and write in chunks [LLDB][Minidump]Update MinidumpFileBuilder to read and write in chunks Feb 28, 2025
@Jlalond Jlalond requested review from labath, clayborg, mbucko and jeffreytan81 and removed request for JDevlieghere February 28, 2025 21:34
Jlalond commented Feb 28, 2025

@labath / @clayborg: to preempt some expected feedback, I intend to expose this chunk size as a setting in a future patch

@Jlalond Jlalond force-pushed the minidump-file-builder-read-in-chunks branch from 1b6444a to 4d55b06 Compare March 3, 2025 19:58
Comment on lines 992 to 993
const size_t bytes_to_read =
std::min(bytes_remaining, MAX_WRITE_CHUNK_SIZE);
Contributor:

I am not a big fan of this -- you are trying to ensure the reading won't exceed the data_up buffer size, so you should use data_up->GetSize() here instead of MAX_WRITE_CHUNK_SIZE. What if data_up is smaller than MAX_WRITE_CHUNK_SIZE? Then you are relying on the fact that the size from the range bounds the data_up size. You can see there is too much coupled logic between caller and callee here.

I would change it to std::min(bytes_remaining, data_up->GetSize()). That's the purpose of passing data_up into this function anyway; otherwise, we should just pass data_up->GetBytes() into this function.

Contributor Author:

Correct, this was a missed change on my part: I moved the buffer declaration back up into the AddMemory_ methods, and we should've used the upper bound of the buffer here, not MAX_WRITE_CHUNK_SIZE. I have fixed it accordingly.


// If we've read some bytes, we stop trying to read more and return
// this best effort attempt
bytes_remaining = 0;
Contributor (jeffreytan81):

Why not simply return from here? This logic relies on the indirect condition that the loop terminates with "bytes_remaining = 0", which can easily break during future refactoring.

Contributor (jeffreytan81):

Also, the later code still executes AddData(data_up->GetBytes(), bytes_read_for_chunk) after error.Fail(), and then overwrites the error object. Another reason we should return early here.

Contributor Author:

Hey @jeffreytan81, my thought process here is that if we managed to read any bytes, we should append those to the buffer and return the bytes read up to that point.

So in my hypothetical we try to read 1000 bytes from an address and only read 500; I still want to save off those 500. Are you against this proposal? It's why, in error conditions other than AddData(), I just set the loop to not continue.

Contributor Author:

I'm okay with another exit statement; the bytes_remaining == 0 condition does feel quite brittle. But my primary concern here is whether we include or exclude partial reads. In the existing system we would just skip them, so there is precedent for excluding them, but I think that by including the extra bytes we can improve the user experience.

I will update the check so that if bytes_read_for_chunk == 0 we also just return early.

Contributor Author:

Updated the error case to be an early but non-fatal exit, and added some comments on why it's non-fatal while an AddData() failure is. I added an explicit case for partial reads, where we set a flag that will terminate the loop: we still add the data and update the bytes_read count, but discontinue any further reading.

I also changed the overflow check to an error, which I'm much happier with.

Comment on lines 1013 to 1014
log, "Memory region at: %" PRIx64 " failed to read %" PRIx64 " bytes",
addr, size);
Contributor:

Log the fact that this was a partial read, and the bytes_read_for_chunk variable, instead of simply saying "failed to read".

Contributor Author:

Updated

for (const auto &core_range : ranges)
max_size = std::max(max_size, core_range.range.size());
return max_size;
Status MinidumpFileBuilder::ReadWriteMemoryInChunks(
Collaborator:

Can we build this into PostMortemProcess? It would be nice to not have to replicate this functionality in every core file implementation.

Contributor Author:

Not opposed, but because this patch has been open for quite some time, can I make that a follow-up patch?

max_size = std::max(max_size, core_range.range.size());
return max_size;
Status MinidumpFileBuilder::ReadWriteMemoryInChunks(
const lldb_private::CoreFileMemoryRange &range, uint64_t *bytes_read) {
Collaborator:

change uint64_t *bytes_read to uint64_t &bytes_read

@@ -969,6 +969,83 @@ Status MinidumpFileBuilder::DumpDirectories() const {
return error;
}

Status MinidumpFileBuilder::ReadWriteMemoryInChunks(
const std::unique_ptr<lldb_private::DataBufferHeap> &data_up,
const lldb_private::CoreFileMemoryRange &range, uint64_t *bytes_read) {
Collaborator:

Change this to a uint64_t &bytes_read

@@ -969,6 +969,83 @@ Status MinidumpFileBuilder::DumpDirectories() const {
return error;
}

Status MinidumpFileBuilder::ReadWriteMemoryInChunks(
const std::unique_ptr<lldb_private::DataBufferHeap> &data_up,
Collaborator:

Change this to a lldb_private::DataBufferHeap &data. Unless we are doing something with the unique_ptr here that makes sense, there is no reason to use a unique pointer in the args.

Comment on lines 1067 to 1050
auto data_up = std::make_unique<DataBufferHeap>(
std::min(GetLargestRangeSize(ranges), MAX_WRITE_CHUNK_SIZE), 0);
for (const auto &core_range : ranges) {
Collaborator:

No need to create a unique pointer here. Just create a local object:

DataBufferHeap data(std::min(GetLargestRangeSize(ranges), MAX_WRITE_CHUNK_SIZE), 0);

LLDB_LOGF(log, "Failed to read memory region. Bytes read: %zu, error: %s",
bytes_read, error.AsCString());
// Just skip sections with errors or zero bytes in 32b mode
uint64_t bytes_read;
Collaborator:

Init with zero:

uint64_t bytes_read = 0;

Comment on lines 1160 to 1142
auto data_up = std::make_unique<DataBufferHeap>(
std::min(GetLargestRangeSize(ranges), MAX_WRITE_CHUNK_SIZE), 0);
Collaborator:

Don't create a unique pointer, just create an instance like above

Comment on lines 145 to 152

// Read a memory region from the process and write it to the file
// in fixed size chunks.
lldb_private::Status ReadWriteMemoryInChunks(
const std::unique_ptr<lldb_private::DataBufferHeap> &data_up,
const lldb_private::CoreFileMemoryRange &range, uint64_t *bytes_read);

Collaborator (clayborg):

Move this into the PostMortemProcess.h/.cpp and have it take a lambda callback:

lldb_private::Status ReadMemoryInChunks(
    lldb_private::CoreFileMemoryRange &range, 
    uint64_t chunk_size, 
    lldb_private::DataBufferHeap &data, // Re-use between calls if needed, might expand to chunk_size
    const std::function<Status(const uint8_t *bytes, uint64_t length)> &chunk_callback,
    uint64_t &bytes_read); 

Then chunk_callback gets called for each chunk that is read and your ProcessMinidump can pass in a lambda that just calls AddData(...)

There is no need to create a const std::unique_ptr<lldb_private::DataBufferHeap> &data_up, just create a local lldb_private::DataBufferHeap heap(...). The unique pointer does nothing useful here

Contributor Author:

Talked with Greg offline. I want to add this to Process, but the effort to unit test it would delay this patch more than I would like, so I'll follow this patch up with a refactor of this behavior into Process.

@Jlalond Jlalond force-pushed the minidump-file-builder-read-in-chunks branch from a3ef677 to 63ddb80 Compare March 14, 2025 21:40
@Jlalond Jlalond force-pushed the minidump-file-builder-read-in-chunks branch from 63ddb80 to 42e84ed Compare April 1, 2025 16:43
Jlalond commented Apr 1, 2025

Update

After discussing with Greg, I've centralized the read memory loop logic in Process, and defined a callback that has all the relevant information from the last memory read. I've refactored the Minidump method to capture the fatal error in the outer scope and let the underlying process implementation handle the looping and reading.

I also found we're using ReadMemory (not just in Minidump) when we should be using ReadMemoryFromInferior, so I'll have several follow-up patches to fix this.
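
A rough sketch of the resulting call site, pieced together from the fragments reviewed below (names and parameter order are assumptions and may differ from the final commit):

// Sketch only: the Minidump builder captures the fatal write error in the
// outer scope, while the process implementation drives the read loop.
Status addDataError;
Process::ReadMemoryChunkCallback callback =
    [&](Status &error, lldb::addr_t current_addr, const void *buf,
        uint64_t bytes_read) -> lldb_private::IterationAction {
  // Read failures are non-fatal; keep whatever was read so far.
  if (error.Fail() || bytes_read == 0)
    return lldb_private::IterationAction::Stop;
  // A disk-write failure is fatal, so surface it through addDataError.
  addDataError = AddData(buf, bytes_read);
  if (addDataError.Fail())
    return lldb_private::IterationAction::Stop;
  return lldb_private::IterationAction::Continue;
};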

@@ -2233,6 +2233,40 @@ size_t Process::ReadMemoryFromInferior(addr_t addr, void *buf, size_t size,
return bytes_read;
}

size_t Process::ReadMemoryInChunks(lldb::addr_t vm_addr, DataBufferHeap &data,
Collaborator:

There is no need to pass a DataBufferHeap into this API. The API should just take the chunk size as a parameter:

size_t Process::ReadMemoryInChunks(lldb::addr_t vm_addr, lldb::addr_t vm_size, lldb::addr_t chunk_size, ReadMemoryChunkCallback callback);

Then you can create a local DataBufferHeap object in this function outside of the loop that reads the chunks, so the same buffer gets re-used.
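
A minimal sketch of that suggestion, assuming the callback shape settled on later in this review (a Status, the chunk's address, the bytes, and the byte count, returning an IterationAction); this is not the committed implementation:

size_t Process::ReadMemoryInChunks(lldb::addr_t vm_addr, lldb::addr_t vm_size,
                                   lldb::addr_t chunk_size,
                                   ReadMemoryChunkCallback callback) {
  // One buffer, allocated once outside the loop and re-used for every chunk.
  DataBufferHeap data(std::min(vm_size, chunk_size), 0);
  size_t bytes_read = 0;
  while (bytes_read < vm_size) {
    const lldb::addr_t bytes_to_read =
        std::min<lldb::addr_t>(vm_size - bytes_read, chunk_size);
    Status error;
    const size_t bytes_read_for_chunk = ReadMemoryFromInferior(
        vm_addr + bytes_read, data.GetBytes(), bytes_to_read, error);
    // The caller decides whether an error or a short read is fatal.
    if (callback(error, vm_addr + bytes_read, data.GetBytes(),
                 bytes_read_for_chunk) == IterationAction::Stop)
      break;
    if (bytes_read_for_chunk == 0)
      break; // Avoid spinning forever on an unreadable region.
    bytes_read += bytes_read_for_chunk;
  }
  return bytes_read;
}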

Comment on lines 1601 to 1602
typedef std::function<IterationAction(lldb_private::Status &,
lldb_private::DataBufferHeap &,
lldb::addr_t, uint64_t, uint64_t)>
Collaborator:

We should just pass the bytes without requiring someone to use a DataBufferHeap. I suggest a change below that passes the chunk size into the Process::ReadMemoryInChunks(...) call, so we don't need the bytes_to_read argument. So this callback can be:

typedef std::function<IterationAction(
  Status &error, /// The error for the current memory chunk that was read
  lldb::addr_t bytes_addr, /// The address corresponding to the bytes supplied
  const uint8_t *bytes, /// The bytes for the current chunk
  uint64_t bytes_size /// The number of bytes in the bytes argument
)>

size_t ReadMemoryInChunks(lldb::addr_t vm_addr, DataBufferHeap &data,
size_t size, ReadMemoryChunkCallback callback);
Collaborator:

No need to have people give us a DataBufferHeap; we just need the chunk size:

size_t ReadMemoryInChunks(lldb::addr_t vm_addr, lldb::offset_t vm_size, lldb::offset_t chunk_size, ReadMemoryChunkCallback callback);

The ReadMemoryInChunks function can create a local DataBufferHeap outside of the loop.

Comment on lines 979 to 981
[&](Status &error, DataBufferHeap &data, lldb::addr_t current_addr,
uint64_t bytes_to_read,
uint64_t bytes_read_for_chunk) -> lldb_private::IterationAction {
Collaborator:

Update to suggested callback prototype above.

if (addDataError.Fail())
return lldb_private::IterationAction::Stop;

if (bytes_read_for_chunk != bytes_to_read) {
Collaborator:

You will pass in a chunk_size, which can be captured in the lambda, and you can compare bytes_size to your chunk_size.

Contributor Author:

We're already passing bytes_read_for_chunk and bytes_to_read, which is the expected size, so I didn't need to change the lambda here.

Collaborator:

We don't want bytes_to_read in the callback; it isn't needed. Use data_buffer->GetByteSize() instead; you are already capturing anything needed by reference in your lambda.

@Jlalond Jlalond force-pushed the minidump-file-builder-read-in-chunks branch from 5af0047 to b35e42f Compare April 7, 2025 22:35
@Jlalond Jlalond force-pushed the minidump-file-builder-read-in-chunks branch from b35e42f to f01e8c1 Compare April 7, 2025 22:36
@Jlalond Jlalond force-pushed the minidump-file-builder-read-in-chunks branch from f01e8c1 to 13581a6 Compare April 7, 2025 22:45

Status addDataError;
Process::ReadMemoryChunkCallback callback =
[&](Status &error, lldb::addr_t current_addr, const void *buf,
uint64_t bytes_read_for_chunk) -> lldb_private::IterationAction {
Collaborator:

We can now rename bytes_read_for_chunk to bytes_read

if (bytes_read_for_chunk != data_buffer.GetByteSize() &&
current_addr + bytes_read_for_chunk != size) {
LLDB_LOGF(log,
"Memory region at: %" PRIx64 " partiall read %" PRIx64
Collaborator:

s/partiall/partial/

You also want all addresses to be prefixed with 0x here, so the format string should be:

"Memory region at: 0x%" PRIx64 " partiall read 0x%" PRIx64

Same for format string below.

uint64_t bytes_read_for_chunk) -> lldb_private::IterationAction {
if (error.Fail() || bytes_read_for_chunk == 0) {
LLDB_LOGF(log,
"Failed to read memory region at: %" PRIx64
Collaborator:

You want addresses to have 0x prefix, so:

"Failed to read memory region at: 0x%" PRIx64

/// vm_addr, \a buf, and \a size updated appropriately. Zero is
/// returned in the case of an error.
size_t ReadMemoryInChunks(lldb::addr_t vm_addr, void *buf,
lldb::addr_t chunk_size, size_t total_size,
Collaborator:

Don't use size_t; it is 32 bits on 32-bit systems. Use lldb::offset_t or lldb::addr_t, which are 64 bits all the time.
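
To illustrate the concern (hypothetical checks, not from the patch): lldb::offset_t and lldb::addr_t are typedefs for uint64_t on every platform, while size_t shrinks to 32 bits on 32-bit hosts:

#include "lldb/lldb-types.h"

static_assert(sizeof(lldb::offset_t) == 8, "offset_t is 64-bit everywhere");
static_assert(sizeof(lldb::addr_t) == 8, "addr_t is 64-bit everywhere");
// sizeof(size_t) == 4 on a 32-bit host, so a >4 GB size would truncate there.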

Comment on lines 986 to 987
". Bytes read: %zu, error: %s",
current_addr, bytes_read_for_chunk, error.AsCString());
Collaborator:

Don't use size_t, use lldb::offset_t:

". Bytes read: %" PRIu64 ", error: %s",
                current_addr, bytes_read_for_chunk, error.AsCString());

/// size, then this function will get called again with \a
/// vm_addr, \a buf, and \a size updated appropriately. Zero is
/// returned in the case of an error.
size_t ReadMemoryInChunks(lldb::addr_t vm_addr, void *buf,
Collaborator:

Don't use size_t; it is 32 bits on 32-bit architectures. Use lldb::offset_t, which is always 64 bits.

// the write chunk max size.
const size_t bytes_to_read = std::min(bytes_remaining, chunk_size);
const lldb::addr_t current_addr = vm_addr + bytes_read;
const size_t bytes_read_for_chunk =
Collaborator:

Don't use size_t

@Jlalond Jlalond force-pushed the minidump-file-builder-read-in-chunks branch 2 times, most recently from fea4d05 to cab1efa Compare April 8, 2025 00:13

github-actions bot commented Apr 8, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Jlalond Jlalond force-pushed the minidump-file-builder-read-in-chunks branch from cab1efa to 4aed212 Compare April 8, 2025 00:24
Comment on lines 1594 to 1597
// Status, the status returned from ReadMemoryFromInferior
// void*, pointer to the bytes read
// addr_t, the bytes_addr, start + bytes read so far.
// bytes_size, the count of bytes read for this chunk
Collaborator:

Re-order comments here to match new args ordering

…e read chunk size. Also don't log if a truncated read is the final read
@Jlalond Jlalond force-pushed the minidump-file-builder-read-in-chunks branch from 4aed212 to a85c33e Compare April 8, 2025 00:29
Jlalond commented Apr 8, 2025

Did some additional validation offline, where I saved a jumbo Minidump (50 GB). We have improved disk IO and a flat memory profile.

@Jlalond Jlalond merged commit f869d6e into llvm:main Apr 8, 2025
10 checks passed
@Jlalond Jlalond deleted the minidump-file-builder-read-in-chunks branch April 8, 2025 18:17
Jlalond added a commit that referenced this pull request May 29, 2025
In #129307, we introduced reading and writing in chunks, and during the
final revision of the PR I changed the behavior for 64b memory regions
and did not test an actual 64b memory range.

This caused LLDB to crash whenever we generated a 64b memory region.

64b regions have been a problem in testing for some time, as it's a
waste of test resources to generate a 5 GB+ Minidump. I will work with
@clayborg and @labath to come up with a way to specify creating a 64b
list instead of a 32b list (likely via the yamilizer).
svkeerthy pushed a commit that referenced this pull request May 29, 2025
In #129307, we introduced reading and writing in chunks, and during the
final revision of the PR I changed the behavior for 64b memory regions
and did not test an actual 64b memory range.

This caused LLDB to crash whenever we generated a 64b memory region.

64b regions have been a problem in testing for some time, as it's a
waste of test resources to generate a 5 GB+ Minidump. I will work with
@clayborg and @labath to come up with a way to specify creating a 64b
list instead of a 32b list (likely via the yamilizer).
google-yfyang pushed a commit to google-yfyang/llvm-project that referenced this pull request May 29, 2025

In llvm#129307, we introduced reading and writing in chunks, and during
the final revision of the PR I changed the behavior for 64b memory
regions and did not test an actual 64b memory range.

This caused LLDB to crash whenever we generated a 64b memory region.

64b regions have been a problem in testing for some time, as it's a
waste of test resources to generate a 5 GB+ Minidump. I will work with
@clayborg and @labath to come up with a way to specify creating a 64b
list instead of a 32b list (likely via the yamilizer).
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025

In llvm#129307, we introduced reading and writing in chunks, and during
the final revision of the PR I changed the behavior for 64b memory
regions and did not test an actual 64b memory range.

This caused LLDB to crash whenever we generated a 64b memory region.

64b regions have been a problem in testing for some time, as it's a
waste of test resources to generate a 5 GB+ Minidump. I will work with
@clayborg and @labath to come up with a way to specify creating a 64b
list instead of a 32b list (likely via the yamilizer).