Skip to content

Reland "[lldb] Implement basic support for reverse-continue" (#123906)" #123945

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 30, 2025

Conversation

DavidSpickett
Copy link
Collaborator

@DavidSpickett DavidSpickett commented Jan 22, 2025

This reverts commit 22561cf and fixes b7b9ccf (#112079).

The problem is that x86_64 and Arm 32-bit have memory regions above the stack that are readable but not writeable. First Arm:

(lldb) memory region --all
<...>
[0x00000000fffcf000-0x00000000ffff0000) rw- [stack]
[0x00000000ffff0000-0x00000000ffff1000) r-x [vectors]
[0x00000000ffff1000-0xffffffffffffffff) ---

Then x86_64:

$ cat /proc/self/maps
<...>
7ffdcd148000-7ffdcd16a000 rw-p 00000000 00:00 0                          [stack]
7ffdcd193000-7ffdcd196000 r--p 00000000 00:00 0                          [vvar]
7ffdcd196000-7ffdcd197000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

Compare this to AArch64 where the test did pass:

$ cat /proc/self/maps
<...>
ffffb87dc000-ffffb87dd000 r--p 00000000 00:00 0                          [vvar]
ffffb87dd000-ffffb87de000 r-xp 00000000 00:00 0                          [vdso]
ffffb87de000-ffffb87e0000 r--p 0002a000 00:3c 76927217                   /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffffb87e0000-ffffb87e2000 rw-p 0002c000 00:3c 76927217                   /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
fffff4216000-fffff4237000 rw-p 00000000 00:00 0                          [stack]

To solve this, look up the memory region of the stack pointer (using https://lldb.llvm.org/resources/lldbgdbremote.html#qmemoryregioninfo-addr) and constrain the read to within that region. Since we know the stack is all readable and writeable.

I have also added skipIfRemote to the tests, since getting them working in that context is too complex to be worth it.

Memory write failures now display the range they tried to write, and register write errors will show the name of the register where possible.

The patch also includes a workaround for a an issue where the test code could mistake an x response that happens to begin with an O for an output packet (stdout). This workaround will not be necessary one we start using the new implementation of the x packet.

…3906)"

This reverts commit 22561cf and fixes
b7b9ccf (llvm#112079).

The problem is that x86_64 and Arm 32-bit have memory regions above
the stack that are readable but not writeable. First Arm:
```
(lldb) memory region --all
<...>
[0x00000000fffcf000-0x00000000ffff0000) rw- [stack]
[0x00000000ffff0000-0x00000000ffff1000) r-x [vectors]
[0x00000000ffff1000-0xffffffffffffffff) ---
```
Then x86_64:
```
$ cat /proc/self/maps
<...>
7ffdcd148000-7ffdcd16a000 rw-p 00000000 00:00 0                          [stack]
7ffdcd193000-7ffdcd196000 r--p 00000000 00:00 0                          [vvar]
7ffdcd196000-7ffdcd197000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]
```
Compare this to AArch64 where the test did pass:
```
$ cat /proc/self/maps
<...>
ffffb87dc000-ffffb87dd000 r--p 00000000 00:00 0                          [vvar]
ffffb87dd000-ffffb87de000 r-xp 00000000 00:00 0                          [vdso]
ffffb87de000-ffffb87e0000 r--p 0002a000 00:3c 76927217                   /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffffb87e0000-ffffb87e2000 rw-p 0002c000 00:3c 76927217                   /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
fffff4216000-fffff4237000 rw-p 00000000 00:00 0                          [stack]
```
To solve this, look up the memory region of the stack pointer and constrain
the read to within that region. Since we know the stack is all readable
and writeable.
@llvmbot
Copy link
Member

llvmbot commented Jan 22, 2025

@llvm/pr-subscribers-lldb

Author: David Spickett (DavidSpickett)

Changes

This reverts commit 22561cf and fixes b7b9ccf (#112079).

The problem is that x86_64 and Arm 32-bit have memory regions above the stack that are readable but not writeable. First Arm:

(lldb) memory region --all
&lt;...&gt;
[0x00000000fffcf000-0x00000000ffff0000) rw- [stack]
[0x00000000ffff0000-0x00000000ffff1000) r-x [vectors]
[0x00000000ffff1000-0xffffffffffffffff) ---

Then x86_64:

$ cat /proc/self/maps
&lt;...&gt;
7ffdcd148000-7ffdcd16a000 rw-p 00000000 00:00 0                          [stack]
7ffdcd193000-7ffdcd196000 r--p 00000000 00:00 0                          [vvar]
7ffdcd196000-7ffdcd197000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

Compare this to AArch64 where the test did pass:

$ cat /proc/self/maps
&lt;...&gt;
ffffb87dc000-ffffb87dd000 r--p 00000000 00:00 0                          [vvar]
ffffb87dd000-ffffb87de000 r-xp 00000000 00:00 0                          [vdso]
ffffb87de000-ffffb87e0000 r--p 0002a000 00:3c 76927217                   /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffffb87e0000-ffffb87e2000 rw-p 0002c000 00:3c 76927217                   /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
fffff4216000-fffff4237000 rw-p 00000000 00:00 0                          [stack]

To solve this, look up the memory region of the stack pointer (using https://lldb.llvm.org/resources/lldbgdbremote.html#qmemoryregioninfo-addr) and constrain the read to within that region. Since we know the stack is all readable and writeable.


Patch is 85.26 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/123945.diff

40 Files Affected:

  • (modified) lldb/include/lldb/API/SBProcess.h (+1)
  • (modified) lldb/include/lldb/Target/Process.h (+26-2)
  • (modified) lldb/include/lldb/Target/StopInfo.h (+7)
  • (modified) lldb/include/lldb/Target/Thread.h (+4-5)
  • (modified) lldb/include/lldb/Target/ThreadList.h (+5-1)
  • (modified) lldb/include/lldb/Target/ThreadPlan.h (+13)
  • (modified) lldb/include/lldb/Target/ThreadPlanBase.h (+2)
  • (modified) lldb/include/lldb/lldb-enumerations.h (+6)
  • (modified) lldb/packages/Python/lldbsuite/test/gdbclientutils.py (+3-2)
  • (added) lldb/packages/Python/lldbsuite/test/lldbgdbproxy.py (+175)
  • (added) lldb/packages/Python/lldbsuite/test/lldbreverse.py (+528)
  • (modified) lldb/packages/Python/lldbsuite/test/lldbtest.py (+2)
  • (modified) lldb/packages/Python/lldbsuite/test/tools/lldb-server/lldbgdbserverutils.py (+9-5)
  • (modified) lldb/source/API/SBProcess.cpp (+12)
  • (modified) lldb/source/API/SBThread.cpp (+2)
  • (modified) lldb/source/Interpreter/CommandInterpreter.cpp (+2-1)
  • (modified) lldb/source/Plugins/Process/Linux/NativeThreadLinux.cpp (+3)
  • (modified) lldb/source/Plugins/Process/MacOSX-Kernel/ProcessKDP.cpp (+7-1)
  • (modified) lldb/source/Plugins/Process/MacOSX-Kernel/ProcessKDP.h (+1-1)
  • (modified) lldb/source/Plugins/Process/Windows/Common/ProcessWindows.cpp (+8-1)
  • (modified) lldb/source/Plugins/Process/Windows/Common/ProcessWindows.h (+1-1)
  • (modified) lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp (+20)
  • (modified) lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.h (+6)
  • (modified) lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp (+1)
  • (modified) lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp (+85-13)
  • (modified) lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h (+3-1)
  • (modified) lldb/source/Plugins/Process/scripted/ScriptedProcess.cpp (+7-2)
  • (modified) lldb/source/Plugins/Process/scripted/ScriptedProcess.h (+1-1)
  • (modified) lldb/source/Target/Process.cpp (+20-4)
  • (modified) lldb/source/Target/StopInfo.cpp (+28)
  • (modified) lldb/source/Target/Thread.cpp (+6-3)
  • (modified) lldb/source/Target/ThreadList.cpp (+29-3)
  • (modified) lldb/source/Target/ThreadPlanBase.cpp (+4)
  • (added) lldb/test/API/functionalities/reverse-execution/Makefile (+3)
  • (added) lldb/test/API/functionalities/reverse-execution/TestReverseContinueBreakpoints.py (+149)
  • (added) lldb/test/API/functionalities/reverse-execution/TestReverseContinueNotSupported.py (+31)
  • (added) lldb/test/API/functionalities/reverse-execution/TestReverseContinueWatchpoints.py (+130)
  • (added) lldb/test/API/functionalities/reverse-execution/main.c (+25)
  • (modified) lldb/tools/lldb-dap/JSONUtils.cpp (+3)
  • (modified) lldb/tools/lldb-dap/LLDBUtils.cpp (+1)
diff --git a/lldb/include/lldb/API/SBProcess.h b/lldb/include/lldb/API/SBProcess.h
index 1624e02070b1b2..882b8bd837131d 100644
--- a/lldb/include/lldb/API/SBProcess.h
+++ b/lldb/include/lldb/API/SBProcess.h
@@ -159,6 +159,7 @@ class LLDB_API SBProcess {
   lldb::SBError Destroy();
 
   lldb::SBError Continue();
+  lldb::SBError ContinueInDirection(lldb::RunDirection direction);
 
   lldb::SBError Stop();
 
diff --git a/lldb/include/lldb/Target/Process.h b/lldb/include/lldb/Target/Process.h
index a184e6dd891aff..b14eb3fbd91d00 100644
--- a/lldb/include/lldb/Target/Process.h
+++ b/lldb/include/lldb/Target/Process.h
@@ -1089,6 +1089,13 @@ class Process : public std::enable_shared_from_this<Process>,
   ///     Returns an error object.
   virtual Status WillResume() { return Status(); }
 
+  /// Reports whether this process supports reverse execution.
+  ///
+  /// \return
+  ///     Returns true if the process supports reverse execution (at least
+  /// under some circumstances).
+  virtual bool SupportsReverseDirection() { return false; }
+
   /// Resumes all of a process's threads as configured using the Thread run
   /// control functions.
   ///
@@ -1104,9 +1111,13 @@ class Process : public std::enable_shared_from_this<Process>,
   /// \see Thread:Resume()
   /// \see Thread:Step()
   /// \see Thread:Suspend()
-  virtual Status DoResume() {
+  virtual Status DoResume(lldb::RunDirection direction) {
+    if (direction == lldb::RunDirection::eRunForward)
+      return Status::FromErrorStringWithFormatv(
+          "error: {0} does not support resuming processes", GetPluginName());
     return Status::FromErrorStringWithFormatv(
-        "error: {0} does not support resuming processes", GetPluginName());
+        "error: {0} does not support reverse execution of processes",
+        GetPluginName());
   }
 
   /// Called after resuming a process.
@@ -2676,6 +2687,18 @@ void PruneThreadPlans();
                             const AddressRange &range, size_t alignment,
                             Status &error);
 
+  /// Get the base run direction for the process.
+  /// The base direction is the direction the process will execute in
+  /// (forward or backward) if no thread plan overrides the direction.
+  lldb::RunDirection GetBaseDirection() const { return m_base_direction; }
+  /// Set the base run direction for the process.
+  /// As a side-effect, if this changes the base direction, then we
+  /// discard all non-base thread plans to ensure that when execution resumes
+  /// we definitely execute in the requested direction.
+  /// FIXME: this is overkill. In some situations ensuring the latter
+  /// would not require discarding all non-base thread plans.
+  void SetBaseDirection(lldb::RunDirection direction);
+
 protected:
   friend class Trace;
 
@@ -3075,6 +3098,7 @@ void PruneThreadPlans();
   ThreadList
       m_extended_thread_list; ///< Constituent for extended threads that may be
                               /// generated, cleared on natural stops
+  lldb::RunDirection m_base_direction; ///< ThreadPlanBase run direction
   uint32_t m_extended_thread_stop_id; ///< The natural stop id when
                                       ///extended_thread_list was last updated
   QueueList
diff --git a/lldb/include/lldb/Target/StopInfo.h b/lldb/include/lldb/Target/StopInfo.h
index 45beac129e86f7..9a13371708be52 100644
--- a/lldb/include/lldb/Target/StopInfo.h
+++ b/lldb/include/lldb/Target/StopInfo.h
@@ -20,6 +20,7 @@ namespace lldb_private {
 class StopInfo : public std::enable_shared_from_this<StopInfo> {
   friend class Process::ProcessEventData;
   friend class ThreadPlanBase;
+  friend class ThreadPlanReverseContinue;
 
 public:
   // Constructors and Destructors
@@ -154,6 +155,12 @@ class StopInfo : public std::enable_shared_from_this<StopInfo> {
   static lldb::StopInfoSP
   CreateStopReasonProcessorTrace(Thread &thread, const char *description);
 
+  // This creates a StopInfo indicating that execution stopped because
+  // it was replaying some recorded execution history, and execution reached
+  // the end of that recorded history.
+  static lldb::StopInfoSP
+  CreateStopReasonHistoryBoundary(Thread &thread, const char *description);
+
   static lldb::StopInfoSP CreateStopReasonFork(Thread &thread,
                                                lldb::pid_t child_pid,
                                                lldb::tid_t child_tid);
diff --git a/lldb/include/lldb/Target/Thread.h b/lldb/include/lldb/Target/Thread.h
index ef66fa11574db9..cd82ee7d756030 100644
--- a/lldb/include/lldb/Target/Thread.h
+++ b/lldb/include/lldb/Target/Thread.h
@@ -200,14 +200,13 @@ class Thread : public std::enable_shared_from_this<Thread>,
   ///    The User resume state for this thread.
   lldb::StateType GetResumeState() const { return m_resume_state; }
 
-  /// This function is called on all the threads before "ShouldResume" and
-  /// "WillResume" in case a thread needs to change its state before the
-  /// ThreadList polls all the threads to figure out which ones actually will
-  /// get to run and how.
+  // This function is called to determine whether the thread needs to
+  // step over a breakpoint and if so, push a step-over-breakpoint thread
+  // plan.
   ///
   /// \return
   ///    True if we pushed a ThreadPlanStepOverBreakpoint
-  bool SetupForResume();
+  bool SetupToStepOverBreakpointIfNeeded(lldb::RunDirection direction);
 
   // Do not override this function, it is for thread plan logic only
   bool ShouldResume(lldb::StateType resume_state);
diff --git a/lldb/include/lldb/Target/ThreadList.h b/lldb/include/lldb/Target/ThreadList.h
index f931bb83a8ceaf..c796975de60153 100644
--- a/lldb/include/lldb/Target/ThreadList.h
+++ b/lldb/include/lldb/Target/ThreadList.h
@@ -115,6 +115,10 @@ class ThreadList : public ThreadCollection {
   /// If a thread can "resume" without having to resume the target, it
   /// will return false for WillResume, and then the process will not be
   /// restarted.
+  /// Sets *direction to the run direction of the thread(s) that will
+  /// be resumed. If threads that we want to run disagree about the
+  /// direction, we execute forwards and pop any of the thread plans
+  /// that requested reverse execution.
   ///
   /// \return
   ///    \b true instructs the process to resume normally,
@@ -122,7 +126,7 @@ class ThreadList : public ThreadCollection {
   ///    the process will not actually run.  The thread must then return
   ///    the correct StopInfo when asked.
   ///
-  bool WillResume();
+  bool WillResume(lldb::RunDirection &direction);
 
   void DidResume();
 
diff --git a/lldb/include/lldb/Target/ThreadPlan.h b/lldb/include/lldb/Target/ThreadPlan.h
index d6da484f4fc137..a7bac8cc5ecf6c 100644
--- a/lldb/include/lldb/Target/ThreadPlan.h
+++ b/lldb/include/lldb/Target/ThreadPlan.h
@@ -283,6 +283,15 @@ namespace lldb_private {
 //  report_run_vote argument to the constructor works like report_stop_vote, and
 //  is a way for a plan to instruct a sub-plan on how to respond to
 //  ShouldReportStop.
+//
+//  Reverse execution:
+//
+//  Every thread plan has an associated RunDirection (forward or backward).
+//  For ThreadPlanBase, this direction is the Process's base direction.
+//  Whenever we resume the target, we need to ensure that the topmost thread
+//  plans for each runnable thread all agree on their direction. This is
+//  ensured in ThreadList::WillResume(), which chooses a direction and then
+//  discards thread plans incompatible with that direction.
 
 class ThreadPlan : public std::enable_shared_from_this<ThreadPlan>,
                    public UserID {
@@ -497,6 +506,10 @@ class ThreadPlan : public std::enable_shared_from_this<ThreadPlan>,
 
   virtual lldb::StateType GetPlanRunState() = 0;
 
+  virtual lldb::RunDirection GetDirection() const {
+    return lldb::RunDirection::eRunForward;
+  }
+
 protected:
   // Constructors and Destructors
   ThreadPlan(ThreadPlanKind kind, const char *name, Thread &thread,
diff --git a/lldb/include/lldb/Target/ThreadPlanBase.h b/lldb/include/lldb/Target/ThreadPlanBase.h
index 5c44b9fb17b271..f4418d779a4dab 100644
--- a/lldb/include/lldb/Target/ThreadPlanBase.h
+++ b/lldb/include/lldb/Target/ThreadPlanBase.h
@@ -38,6 +38,8 @@ class ThreadPlanBase : public ThreadPlan {
 
   bool IsBasePlan() override { return true; }
 
+  lldb::RunDirection GetDirection() const override;
+
 protected:
   bool DoWillResume(lldb::StateType resume_state, bool current_plan) override;
   bool DoPlanExplainsStop(Event *event_ptr) override;
diff --git a/lldb/include/lldb/lldb-enumerations.h b/lldb/include/lldb/lldb-enumerations.h
index 50d2233509de6f..5f12e648684d7f 100644
--- a/lldb/include/lldb/lldb-enumerations.h
+++ b/lldb/include/lldb/lldb-enumerations.h
@@ -135,6 +135,9 @@ FLAGS_ENUM(LaunchFlags){
 /// Thread Run Modes.
 enum RunMode { eOnlyThisThread, eAllThreads, eOnlyDuringStepping };
 
+/// Execution directions
+enum RunDirection { eRunForward, eRunReverse };
+
 /// Byte ordering definitions.
 enum ByteOrder {
   eByteOrderInvalid = 0,
@@ -254,6 +257,9 @@ enum StopReason {
   eStopReasonVFork,
   eStopReasonVForkDone,
   eStopReasonInterrupt, ///< Thread requested interrupt
+  // Indicates that execution stopped because the debugger backend relies
+  // on recorded data and we reached the end of that data.
+  eStopReasonHistoryBoundary,
 };
 
 /// Command Return Status Types.
diff --git a/lldb/packages/Python/lldbsuite/test/gdbclientutils.py b/lldb/packages/Python/lldbsuite/test/gdbclientutils.py
index 1784487323ad6b..732d6171320680 100644
--- a/lldb/packages/Python/lldbsuite/test/gdbclientutils.py
+++ b/lldb/packages/Python/lldbsuite/test/gdbclientutils.py
@@ -510,8 +510,9 @@ def start(self):
         self._thread.start()
 
     def stop(self):
-        self._thread.join()
-        self._thread = None
+        if self._thread is not None:
+            self._thread.join()
+            self._thread = None
 
     def get_connect_address(self):
         return self._socket.get_connect_address()
diff --git a/lldb/packages/Python/lldbsuite/test/lldbgdbproxy.py b/lldb/packages/Python/lldbsuite/test/lldbgdbproxy.py
new file mode 100644
index 00000000000000..a84c80f155a0a4
--- /dev/null
+++ b/lldb/packages/Python/lldbsuite/test/lldbgdbproxy.py
@@ -0,0 +1,175 @@
+import logging
+import os
+import os.path
+import random
+
+import lldb
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test.gdbclientutils import *
+import lldbgdbserverutils
+from lldbsuite.support import seven
+
+
+class GDBProxyTestBase(TestBase):
+    """
+    Base class for gdbserver proxy tests.
+
+    This class will setup and start a mock GDB server for the test to use.
+    It pases through requests to a regular lldb-server/debugserver and
+    forwards replies back to the LLDB under test.
+    """
+
+    """The gdbserver that we implement."""
+    server = None
+    """The inner lldb-server/debugserver process that we proxy requests into."""
+    monitor_server = None
+    monitor_sock = None
+
+    server_socket_class = TCPServerSocket
+
+    DEFAULT_TIMEOUT = 20 * (10 if ("ASAN_OPTIONS" in os.environ) else 1)
+
+    _verbose_log_handler = None
+    _log_formatter = logging.Formatter(fmt="%(asctime)-15s %(levelname)-8s %(message)s")
+
+    def setUpBaseLogging(self):
+        self.logger = logging.getLogger(__name__)
+
+        self.logger.propagate = False
+        self.logger.setLevel(logging.DEBUG)
+
+        # log all warnings to stderr
+        handler = logging.StreamHandler()
+        handler.setLevel(logging.WARNING)
+        handler.setFormatter(self._log_formatter)
+        self.logger.addHandler(handler)
+
+    def setUp(self):
+        TestBase.setUp(self)
+
+        self.setUpBaseLogging()
+
+        if self.isVerboseLoggingRequested():
+            # If requested, full logs go to a log file
+            log_file_name = self.getLogBasenameForCurrentTest() + "-proxy.log"
+            self._verbose_log_handler = logging.FileHandler(log_file_name)
+            self._verbose_log_handler.setFormatter(self._log_formatter)
+            self._verbose_log_handler.setLevel(logging.DEBUG)
+            self.logger.addHandler(self._verbose_log_handler)
+
+        if lldbplatformutil.getPlatform() == "macosx":
+            self.debug_monitor_exe = lldbgdbserverutils.get_debugserver_exe()
+            self.debug_monitor_extra_args = []
+        else:
+            self.debug_monitor_exe = lldbgdbserverutils.get_lldb_server_exe()
+            self.debug_monitor_extra_args = ["gdbserver"]
+        self.assertIsNotNone(self.debug_monitor_exe)
+
+        self.server = MockGDBServer(self.server_socket_class())
+        self.server.responder = self
+
+    def tearDown(self):
+        # TestBase.tearDown will kill the process, but we need to kill it early
+        # so its client connection closes and we can stop the server before
+        # finally calling the base tearDown.
+        if self.process() is not None:
+            self.process().Kill()
+        self.server.stop()
+
+        self.logger.removeHandler(self._verbose_log_handler)
+        self._verbose_log_handler = None
+
+        TestBase.tearDown(self)
+
+    def isVerboseLoggingRequested(self):
+        # We will report our detailed logs if the user requested that the "gdb-remote" channel is
+        # logged.
+        return any(("gdb-remote" in channel) for channel in lldbtest_config.channels)
+
+    def connect(self, target):
+        """
+        Create a process by connecting to the mock GDB server.
+        """
+        self.prep_debug_monitor_and_inferior()
+        self.server.start()
+
+        listener = self.dbg.GetListener()
+        error = lldb.SBError()
+        process = target.ConnectRemote(
+            listener, self.server.get_connect_url(), "gdb-remote", error
+        )
+        self.assertTrue(error.Success(), error.description)
+        self.assertTrue(process, PROCESS_IS_VALID)
+        return process
+
+    def prep_debug_monitor_and_inferior(self):
+        inferior_exe_path = self.getBuildArtifact("a.out")
+        self.connect_to_debug_monitor([inferior_exe_path])
+        self.assertIsNotNone(self.monitor_server)
+        self.initial_handshake()
+
+    def initial_handshake(self):
+        self.monitor_server.send_packet(seven.bitcast_to_bytes("+"))
+        reply = seven.bitcast_to_string(self.monitor_server.get_normal_packet())
+        self.assertEqual(reply, "+")
+        self.monitor_server.send_packet(seven.bitcast_to_bytes("QStartNoAckMode"))
+        reply = seven.bitcast_to_string(self.monitor_server.get_normal_packet())
+        self.assertEqual(reply, "+")
+        reply = seven.bitcast_to_string(self.monitor_server.get_normal_packet())
+        self.assertEqual(reply, "OK")
+        self.monitor_server.set_validate_checksums(False)
+        self.monitor_server.send_packet(seven.bitcast_to_bytes("+"))
+        reply = seven.bitcast_to_string(self.monitor_server.get_normal_packet())
+        self.assertEqual(reply, "+")
+
+    def get_debug_monitor_command_line_args(self, connect_address, launch_args):
+        return (
+            self.debug_monitor_extra_args
+            + ["--reverse-connect", connect_address]
+            + launch_args
+        )
+
+    def launch_debug_monitor(self, launch_args):
+        family, type, proto, _, addr = socket.getaddrinfo(
+            "localhost", 0, proto=socket.IPPROTO_TCP
+        )[0]
+        sock = socket.socket(family, type, proto)
+        sock.settimeout(self.DEFAULT_TIMEOUT)
+        sock.bind(addr)
+        sock.listen(1)
+        addr = sock.getsockname()
+        connect_address = "[{}]:{}".format(*addr)
+
+        commandline_args = self.get_debug_monitor_command_line_args(
+            connect_address, launch_args
+        )
+
+        # Start the server.
+        self.logger.info(f"Spawning monitor {commandline_args}")
+        monitor_process = self.spawnSubprocess(
+            self.debug_monitor_exe, commandline_args, install_remote=False
+        )
+        self.assertIsNotNone(monitor_process)
+
+        self.monitor_sock = sock.accept()[0]
+        self.monitor_sock.settimeout(self.DEFAULT_TIMEOUT)
+        return monitor_process
+
+    def connect_to_debug_monitor(self, launch_args):
+        monitor_process = self.launch_debug_monitor(launch_args)
+        # Turn off checksum validation because debugserver does not produce
+        # correct checksums.
+        self.monitor_server = lldbgdbserverutils.Server(
+            self.monitor_sock, monitor_process
+        )
+
+    def respond(self, packet):
+        """Subclasses can override this to change how packets are handled."""
+        return self.pass_through(packet)
+
+    def pass_through(self, packet):
+        self.logger.info(f"Sending packet {packet}")
+        self.monitor_server.send_packet(seven.bitcast_to_bytes(packet))
+        reply = seven.bitcast_to_string(self.monitor_server.get_normal_packet())
+        self.logger.info(f"Received reply {reply}")
+        return reply
diff --git a/lldb/packages/Python/lldbsuite/test/lldbreverse.py b/lldb/packages/Python/lldbsuite/test/lldbreverse.py
new file mode 100644
index 00000000000000..18a8ee062015cc
--- /dev/null
+++ b/lldb/packages/Python/lldbsuite/test/lldbreverse.py
@@ -0,0 +1,528 @@
+import os
+import os.path
+import lldb
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test.gdbclientutils import *
+from lldbsuite.test.lldbgdbproxy import *
+import lldbgdbserverutils
+import re
+
+
+class ThreadSnapshot:
+    def __init__(self, thread_id, registers):
+        self.thread_id = thread_id
+        self.registers = registers
+
+
+class MemoryBlockSnapshot:
+    def __init__(self, address, data):
+        self.address = address
+        self.data = data
+
+
+class StateSnapshot:
+    def __init__(self, thread_snapshots, memory):
+        self.thread_snapshots = thread_snapshots
+        self.memory = memory
+        self.thread_id = None
+
+
+class RegisterInfo:
+    def __init__(self, lldb_index, bitsize, little_endian):
+        self.lldb_index = lldb_index
+        self.bitsize = bitsize
+        self.little_endian = little_endian
+
+
+BELOW_STACK_POINTER = 16384
+ABOVE_STACK_POINTER = 4096
+
+BLOCK_SIZE = 1024
+
+SOFTWARE_BREAKPOINTS = 0
+HARDWARE_BREAKPOINTS = 1
+WRITE_WATCHPOINTS = 2
+
+
+class ReverseTestBase(GDBProxyTestBase):
+    """
+    Base class for tests that need reverse execution.
+
+    This class uses a gdbserver proxy to add very limited reverse-
+    execution capability to lldb-server/debugserver for testing
+    purposes only.
+
+    To use this class, run the inferior forward until some stopping point.
+    Then call `start_recording()` and execute forward again until reaching
+    a software breakpoint; this class records the state before each execution executes.
+    At that point, the server will accept "bc" and "bs" packets to step
+    backwards through the state.
+    When executing during recording, we only allow single-step and continue without
+    delivering a signal, and only software breakpoint stops are allowed.
+
+    We assume that while recording is enabled, the only effects of instructions
+    are on general-purpose registers (read/written by the 'g' and 'G' packets)
+    and on memory bytes between [SP - BELOW_STACK_POINTER, SP + ABOVE_STACK_POINTER).
+    """
+
+    NO_DEBUG_INFO_TESTCASE = True
+
+    """
+    A list of StateSnapshots in time order.
+
+    There is one snapshot per single-stepped instruction,
+    representing the state before that instruction was
+    executed. The last snapshot in the list is the
+    snapshot before the last instruction was executed.
+    This is an undo log; we snapshot a superset of the state that may have
+    been changed by the instruction's execution.
+    """
+    snapshots = None
+    recording_enabled = False
+
+    breakpoints = None
+
+    pc_register_info = None
+    sp_register_info = None
+    general_purpose_register_info = None
+
+    def __init__(self, *args, **kwargs):
+        GDBProxyTestBase.__init__(self, *args, **kwargs)
+        self.breakpoints = [set(), set(), set(), set(), set()]
+
+    def respond(self, packet):
+        if not packet:
+            raise ValueError("Invalid empty packet")
+        if packet == self.server.PACKET_INTERRUPT:
+            # Don't send a response. We'll just run to completion.
+            return []
+        if self.is_command(pac...
[truncated]

@DavidSpickett
Copy link
Collaborator Author

I also noticed that if you don't save memory at all, the breakpoint tests pass. Which is surprising, but ok given that the test doesn't check any memory values, and watchpoint tests do fail if you do not save memory.

self.sp_register_info, thread_snapshot.registers
)

# The memory above or below the stack pointer may be mapped, but not
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I was thinking straight, I'd have made the changes their own commit, but too late for that now.

This is part of them.

result = (result << 8) + byte
return result

def get_memory_region_info(self, addr):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the second part of the fix.

@labath
Copy link
Collaborator

labath commented Jan 22, 2025

@rocallahan

The fix makes sense, but I think we still need to figure out what's the deal with green dragon. I can check out the arm64 watchpoint failure (eventually), but we might need help with the x86_64 failures. Judging by the error message, I'd say that the x86 debugserver reports some extra registers, which then turn out to be unreadable (@jasonmolenda does that sound plausible?). Maybe we should skip unreadable registers?

@rocallahan
Copy link
Contributor

I also noticed that if you don't save memory at all, the breakpoint tests pass. Which is surprising, but ok given that the test doesn't check any memory values, and watchpoint tests do fail if you do not save memory.

Yeah, this is why I think it was a good idea to add a test that actually needs the memory save/restore. Even though it caused the PR to bounce.

@rocallahan
Copy link
Contributor

@DavidSpickett this is your branch now ... can you add @skipIfRemote to the tests?

@rocallahan
Copy link
Contributor

Probably should change raise ValueError("Can't read register") to also log what register we couldn't read, and then rerun the tests on Mac x86-64.

@rocallahan
Copy link
Contributor

Another, simpler way to fix the memory read problem might be to just tolerate failed memory writes --- log and continue.

And probably we should do this for register writes, too.

The theory being that if the debugger can't write to the memory or register, then hopefully either that data can't change at all, or if it can, tests won't depend on it.

@DavidSpickett
Copy link
Collaborator Author

can you add @skipIfRemote to the tests?

Done

Another, simpler way to fix the memory read problem might be to just tolerate failed memory writes --- log and continue.

Log and continue is liable to be "triager forgets to look at logs, wastes days, disables the test and forgets". I think we're pretty safe to improve the error messages and raise them as soon as possible here.

If we get a pile of reports about spurious failures then sure, let's log it, but stick to erroring as long as we can.

I have improved the error messages for both memory and registers.

@DavidSpickett
Copy link
Collaborator Author

So this is ready for anyone who is able to test on Mac x86_64.

@labath
Copy link
Collaborator

labath commented Jan 24, 2025

I ran this on an arm mac, and also on x86-via-rosetta mac (which isn't quite the same as native, but it's the best I have). I did find one problem, which is that the packet forwarder misinterprets a response to the x packet which begins with an O as an "output" packet. I guess it needs to be taught that these can only come while the process is running. I'm still thinking about the best way to achieve that.

This change is definitely stress-testing the python packet parsing code as well. Before this, it was just handling a very limited about of traffic (just the packets that are explicitly sent by the lldb-server tests), but now it basically needs to forward entire lldb traffic.

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After much deliberation, I came to the conclusion that it's best to just work around this problem, given that we're going to be changing how the x packet works (and removing the ambiguity) anyway.

This usage technically was not ambiguous because the output packet can only be sent when the target is running (so it can't be confused with an x response), but the test code was not tracking the run state of the server. I tried teaching it to do that, but wasn't particularly happy with the result, and given the x change, the added complication would not be necessary anyway. For a time I also wanted to wait until the servers start sending the new x response, but now it looks like that may still take quite some time.

TL;DR: LGTM with the inline suggestion applied

@rocallahan
Copy link
Contributor

@DavidSpickett can you apply that change to your branch since this is your PR now?

@labath does this pass on x86-via-Rosetta Mac? Should we retry merging with this change?

Thanks!

Co-authored-by: Pavel Labath <pavel@labath.sk>
@DavidSpickett
Copy link
Collaborator Author

@labath you can append to the PR description if you want to explain the bit I committed from you about the x packet.

@labath
Copy link
Collaborator

labath commented Jan 30, 2025

Done.

@labath
Copy link
Collaborator

labath commented Jan 30, 2025

@labath does this pass on x86-via-Rosetta Mac? Should we retry merging with this change?

Yes, it passes on Rosetta. I think we're ready for another try :)

@DavidSpickett
Copy link
Collaborator Author

My inbox is ready :)

@DavidSpickett DavidSpickett merged commit 0caba6c into llvm:main Jan 30, 2025
7 checks passed
@DavidSpickett DavidSpickett deleted the lldb-reverse branch January 30, 2025 14:03
@DavidSpickett
Copy link
Collaborator Author

DavidSpickett commented Jan 30, 2025

Arm and AArch64 Linux were fine (aside from a flaky test tools/lldb-dap/attach/TestDAP_attachByPortNum.py on the latter).

Windows on Arm is complaining, I'm investigating.

@DavidSpickett
Copy link
Collaborator Author

The easy one to fix is:

Traceback (most recent call last):

  File "C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\functionalities\reverse-execution\TestReverseContinueNotSupported.py", line 25, in test_reverse_continue_not_supported

    self.assertFailure(

  File "C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\lldbtest.py", line 2589, in assertFailure

    self.assertEqual(error, error_str, msg)

AssertionError: 'error: windows does not support reverse execution of processes' != 'error: gdb-remote does not support reverse execution of processes'

- error: windows does not support reverse execution of processes

?        ^^^  ^^

+ error: gdb-remote does not support reverse execution of processes

?        ^ +++++ ^^

The odd one is:

ERROR: test_continue_preserves_direction (TestReverseContinueBreakpoints.TestReverseContinueBreakpoints.test_continue_preserves_direction)

----------------------------------------------------------------------

Traceback (most recent call last):

  File "C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\functionalities\reverse-execution\TestReverseContinueBreakpoints.py", line 95, in test_continue_preserves_direction

    self.continue_preserves_direction_internal(async_mode=False)

  File "C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\functionalities\reverse-execution\TestReverseContinueBreakpoints.py", line 102, in continue_preserves_direction_internal

    target, process, initial_threads = self.setup_recording(async_mode)

                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\functionalities\reverse-execution\TestReverseContinueBreakpoints.py", line 140, in setup_recording

    self.assertEqual(len(initial_threads), 1)

                     ^^^^^^^^^^^^^^^^^^^^

TypeError: object of type 'NoneType' has no len()

I will revert this and reproduce it tomorrow.

DavidSpickett added a commit that referenced this pull request Jan 30, 2025
…123906)"" (#125091)

Reverts #123945

Has failed on the Windows on Arm buildbot:
https://lab.llvm.org/buildbot/#/builders/141/builds/5865
```
********************
Unresolved Tests (2):
  lldb-api :: functionalities/reverse-execution/TestReverseContinueBreakpoints.py
  lldb-api :: functionalities/reverse-execution/TestReverseContinueWatchpoints.py
********************
Failed Tests (1):
  lldb-api :: functionalities/reverse-execution/TestReverseContinueNotSupported.py
```
Reverting while I reproduce locally.
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 30, 2025
…-continue" (#123906)"" (#125091)

Reverts llvm/llvm-project#123945

Has failed on the Windows on Arm buildbot:
https://lab.llvm.org/buildbot/#/builders/141/builds/5865
```
********************
Unresolved Tests (2):
  lldb-api :: functionalities/reverse-execution/TestReverseContinueBreakpoints.py
  lldb-api :: functionalities/reverse-execution/TestReverseContinueWatchpoints.py
********************
Failed Tests (1):
  lldb-api :: functionalities/reverse-execution/TestReverseContinueNotSupported.py
```
Reverting while I reproduce locally.
@DavidSpickett
Copy link
Collaborator Author

Back in again after fixing Windows testing - #125242.

@DavidSpickett
Copy link
Collaborator Author

Windows on Arm passed, Arm and AArch64 Linux passed. No complaints so far.

@adrian-prantl
Copy link
Collaborator

@adrian-prantl
Copy link
Collaborator

I'm going to revert this for now, let me know if you need help understanding the failure cause!

@labath
Copy link
Collaborator

labath commented Feb 3, 2025

I think we do, because I've tested this on my (arm) macbook, and it worked fine there.

@rocallahan
Copy link
Contributor

@adrian-prantl can we have that help please?
Also, is that link supposed to be accessible to the public? Because it's 404 for me.

@labath
Copy link
Collaborator

labath commented Feb 20, 2025

@adrian-prantl, any chance you or someone could take a look at this? The problem really appears to be specific to that buildbot (my guess: something about the debugserver version on that bot), as I've run the tests locally both with arm64 and x86_64-via-rosetta, both with the system and locally build debug server, and they all work fine.

The only other thing we can do is disable those tests on macs, but I'd really like to avoid that since that means the reverse debugging code path will get zero coverage on a mac.

@adrian-prantl
Copy link
Collaborator

adrian-prantl commented Feb 20, 2025 via email

@jasonmolenda
Copy link
Collaborator

From a quick check of the logs, green dragon seems to be set up to build debugserver, too.

greendragon probably does build debugserver, but tests use Xcode's debugserver for proper codesigning, not the built debugserver.

@adrian-prantl
Copy link
Collaborator

greendragon probably does build debugserver, but tests use Xcode's debugserver for proper codesigning, not the built debugserver.

Are you sure? My recollection is that we went out of our way to make code signing work on those bots.

@adrian-prantl
Copy link
Collaborator

adrian-prantl commented Feb 20, 2025

In any case, I think it might be useful to reland the patch once, so we can inspect the failure log on the bot, since the bot logs have been recycled since the last time this landed.

@labath
Copy link
Collaborator

labath commented Feb 21, 2025

Okay, I've created a new version of the patch here. Would you like to merge it at your convenience? There was a small merge conflict with Jason's breakpoint detection patch, but I think I've was able to resolve that correctly. I've added some debugging output to print the packet log in the "TraceOn" mode. If your bot doesn't run in this mode, you may want to change that to dumping the logs unconditionally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants