[NFC] Separate UnwindTable from DebugFrame into a different type #142521

Open: wants to merge 2 commits into base: main

amsen20

@amsen20 amsen20 commented Jun 3, 2025

By separating the Unwind table into a different file, this functionality can be a part of the DWARF library with no dependency on MC, which makes it usable in the MC layer.

This is a continuation of PR#14520.

Amirhossein Pashaeehir added 2 commits June 3, 2025 00:06
…bject)

To create a new UnwindTable, two static methods were implemented inside the class, each building an instance from a CIE or an FDE.
These static methods are moved out of the class and become free library functions.
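
The shape of that move can be sketched in isolation. The block below is a minimal illustration, not the LLVM code: `CIE`, `FDE`, and `UnwindTable` are hypothetical stand-ins, `std::optional` stands in for `llvm::Expected`, and the null check is an assumption of the sketch. Only the name `createUnwindTable` and its two overloads come from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for llvm::dwarf::CIE and FDE; only the row
// addresses an FDE would produce are modelled here.
struct CIE {};
struct FDE {
  std::vector<uint64_t> RowAddresses;
};

// Stand-in for the table type. After the split it has no knowledge of
// CIE or FDE, so its header needs neither of them.
class UnwindTable {
public:
  UnwindTable() = default;
  explicit UnwindTable(std::vector<uint64_t> Rows) : Rows(std::move(Rows)) {}
  std::size_t size() const { return Rows.size(); }

private:
  std::vector<uint64_t> Rows;
};

// Previously: static UnwindTable::create(const CIE *).
// std::optional stands in for llvm::Expected in this sketch.
std::optional<UnwindTable> createUnwindTable(const CIE *Cie) {
  if (!Cie)
    return std::nullopt;
  // A CIE yields a single row; it is modelled as one placeholder entry.
  return UnwindTable(std::vector<uint64_t>{0});
}

// Previously: static UnwindTable::create(const FDE *).
std::optional<UnwindTable> createUnwindTable(const FDE *Fde) {
  if (!Fde)
    return std::nullopt;
  return UnwindTable(Fde->RowAddresses);
}
```

A call site changes from `UnwindTable::create(Fde)` to `createUnwindTable(Fde)`; overload resolution on the pointer type picks the CIE or FDE variant, as it did for the static methods.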

github-actions bot commented Jun 3, 2025

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Member

llvmbot commented Jun 3, 2025

@llvm/pr-subscribers-debuginfo

Author: AmirHossein PashaeeHir (amsen20)

Changes

By separating the Unwind table into a different file, this functionality can be a part of the DWARF library with no dependency on MC, which makes it usable in the MC layer.

Its use case is described in this RFC (TBA).

This is a continuation of PR#14520.


Patch is 83.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142521.diff

6 Files Affected:

  • (modified) llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h (+17-363)
  • (added) llvm/include/llvm/DebugInfo/DWARF/DWARFUnwindTable.h (+377)
  • (modified) llvm/lib/DebugInfo/DWARF/CMakeLists.txt (+1)
  • (modified) llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp (+7-486)
  • (added) llvm/lib/DebugInfo/DWARF/DWARFUnwindTable.cpp (+500)
  • (modified) llvm/unittests/DebugInfo/DWARF/DWARFDebugFrameTest.cpp (+28-28)
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h
index b4b1e49e68a84..a3b94c2d438c7 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h
@@ -9,14 +9,13 @@
 #ifndef LLVM_DEBUGINFO_DWARF_DWARFDEBUGFRAME_H
 #define LLVM_DEBUGINFO_DWARF_DWARFDEBUGFRAME_H
 
-#include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/iterator.h"
 #include "llvm/DebugInfo/DWARF/DWARFCFIProgram.h"
 #include "llvm/DebugInfo/DWARF/DWARFExpression.h"
+#include "llvm/DebugInfo/DWARF/DWARFUnwindTable.h"
 #include "llvm/Support/Error.h"
 #include "llvm/TargetParser/Triple.h"
-#include <map>
 #include <memory>
 #include <vector>
 
@@ -29,374 +28,29 @@ struct DIDumpOptions;
 
 namespace dwarf {
 
-constexpr uint32_t InvalidRegisterNumber = UINT32_MAX;
-
-/// A class that represents a location for the Call Frame Address (CFA) or a
-/// register. This is decoded from the DWARF Call Frame Information
-/// instructions and put into an UnwindRow.
-class UnwindLocation {
-public:
-  enum Location {
-    /// Not specified.
-    Unspecified,
-    /// Register is not available and can't be recovered.
-    Undefined,
-    /// Register value is in the register, nothing needs to be done to unwind
-    /// it:
-    ///   reg = reg
-    Same,
-    /// Register is in or at the CFA plus an offset:
-    ///   reg = CFA + offset
-    ///   reg = defef(CFA + offset)
-    CFAPlusOffset,
-    /// Register or CFA is in or at a register plus offset, optionally in
-    /// an address space:
-    ///   reg = reg + offset [in addrspace]
-    ///   reg = deref(reg + offset [in addrspace])
-    RegPlusOffset,
-    /// Register or CFA value is in or at a value found by evaluating a DWARF
-    /// expression:
-    ///   reg = eval(dwarf_expr)
-    ///   reg = deref(eval(dwarf_expr))
-    DWARFExpr,
-    /// Value is a constant value contained in "Offset":
-    ///   reg = Offset
-    Constant,
-  };
-
-private:
-  Location Kind;   /// The type of the location that describes how to unwind it.
-  uint32_t RegNum; /// The register number for Kind == RegPlusOffset.
-  int32_t Offset;  /// The offset for Kind == CFAPlusOffset or RegPlusOffset.
-  std::optional<uint32_t> AddrSpace;   /// The address space for Kind ==
-                                       /// RegPlusOffset for CFA.
-  std::optional<DWARFExpression> Expr; /// The DWARF expression for Kind ==
-                                       /// DWARFExpression.
-  bool Dereference; /// If true, the resulting location must be dereferenced
-                    /// after the location value is computed.
-
-  // Constructors are private to force people to use the create static
-  // functions.
-  UnwindLocation(Location K)
-      : Kind(K), RegNum(InvalidRegisterNumber), Offset(0),
-        AddrSpace(std::nullopt), Dereference(false) {}
-
-  UnwindLocation(Location K, uint32_t Reg, int32_t Off,
-                 std::optional<uint32_t> AS, bool Deref)
-      : Kind(K), RegNum(Reg), Offset(Off), AddrSpace(AS), Dereference(Deref) {}
-
-  UnwindLocation(DWARFExpression E, bool Deref)
-      : Kind(DWARFExpr), RegNum(InvalidRegisterNumber), Offset(0), Expr(E),
-        Dereference(Deref) {}
-
-public:
-  /// Create a location whose rule is set to Unspecified. This means the
-  /// register value might be in the same register but it wasn't specified in
-  /// the unwind opcodes.
-  static UnwindLocation createUnspecified();
-  /// Create a location where the value is undefined and not available. This can
-  /// happen when a register is volatile and can't be recovered.
-  static UnwindLocation createUndefined();
-  /// Create a location where the value is known to be in the register itself.
-  static UnwindLocation createSame();
-  /// Create a location that is in (Deref == false) or at (Deref == true) the
-  /// CFA plus an offset. Most registers that are spilled onto the stack use
-  /// this rule. The rule for the register will use this rule and specify a
-  /// unique offset from the CFA with \a Deref set to true. This value will be
-  /// relative to a CFA value which is typically defined using the register
-  /// plus offset location. \see createRegisterPlusOffset(...) for more
-  /// information.
-  static UnwindLocation createIsCFAPlusOffset(int32_t Off);
-  static UnwindLocation createAtCFAPlusOffset(int32_t Off);
-  /// Create a location where the saved value is in (Deref == false) or at
-  /// (Deref == true) a regiser plus an offset and, optionally, in the specified
-  /// address space (used mostly for the CFA).
-  ///
-  /// The CFA is usually defined using this rule by using the stack pointer or
-  /// frame pointer as the register, with an offset that accounts for all
-  /// spilled registers and all local variables in a function, and Deref ==
-  /// false.
-  static UnwindLocation
-  createIsRegisterPlusOffset(uint32_t Reg, int32_t Off,
-                             std::optional<uint32_t> AddrSpace = std::nullopt);
-  static UnwindLocation
-  createAtRegisterPlusOffset(uint32_t Reg, int32_t Off,
-                             std::optional<uint32_t> AddrSpace = std::nullopt);
-  /// Create a location whose value is the result of evaluating a DWARF
-  /// expression. This allows complex expressions to be evaluated in order to
-  /// unwind a register or CFA value.
-  static UnwindLocation createIsDWARFExpression(DWARFExpression Expr);
-  static UnwindLocation createAtDWARFExpression(DWARFExpression Expr);
-  static UnwindLocation createIsConstant(int32_t Value);
-
-  Location getLocation() const { return Kind; }
-  uint32_t getRegister() const { return RegNum; }
-  int32_t getOffset() const { return Offset; }
-  uint32_t getAddressSpace() const {
-    assert(Kind == RegPlusOffset && AddrSpace);
-    return *AddrSpace;
-  }
-  int32_t getConstant() const { return Offset; }
-  /// Some opcodes will modify the CFA location's register only, so we need
-  /// to be able to modify the CFA register when evaluating DWARF Call Frame
-  /// Information opcodes.
-  void setRegister(uint32_t NewRegNum) { RegNum = NewRegNum; }
-  /// Some opcodes will modify the CFA location's offset only, so we need
-  /// to be able to modify the CFA offset when evaluating DWARF Call Frame
-  /// Information opcodes.
-  void setOffset(int32_t NewOffset) { Offset = NewOffset; }
-  /// Some opcodes modify a constant value and we need to be able to update
-  /// the constant value (DW_CFA_GNU_window_save which is also known as
-  // DW_CFA_AARCH64_negate_ra_state).
-  void setConstant(int32_t Value) { Offset = Value; }
-
-  std::optional<DWARFExpression> getDWARFExpressionBytes() const {
-    return Expr;
-  }
-  /// Dump a location expression as text and use the register information if
-  /// some is provided.
-  ///
-  /// \param OS the stream to use for output.
-  ///
-  /// \param MRI register information that helps emit register names insteead
-  /// of raw register numbers.
-  ///
-  /// \param IsEH true if the DWARF Call Frame Information is from .eh_frame
-  /// instead of from .debug_frame. This is needed for register number
-  /// conversion because some register numbers differ between the two sections
-  /// for certain architectures like x86.
-  void dump(raw_ostream &OS, DIDumpOptions DumpOpts) const;
-
-  bool operator==(const UnwindLocation &RHS) const;
-};
-
-raw_ostream &operator<<(raw_ostream &OS, const UnwindLocation &R);
+class CIE;
 
-/// A class that can track all registers with locations in a UnwindRow object.
+/// Create an UnwindTable from a Common Information Entry (CIE).
 ///
-/// Register locations use a map where the key is the register number and the
-/// the value is a UnwindLocation.
+/// \param Cie The Common Information Entry to extract the table from. The
+/// CFIProgram is retrieved from the \a Cie object and used to create the
+/// UnwindTable.
 ///
-/// The register maps are put into a class so that all register locations can
-/// be copied when parsing the unwind opcodes DW_CFA_remember_state and
-/// DW_CFA_restore_state.
-class RegisterLocations {
-  std::map<uint32_t, UnwindLocation> Locations;
-
-public:
-  /// Return the location for the register in \a RegNum if there is a location.
-  ///
-  /// \param RegNum the register number to find a location for.
-  ///
-  /// \returns A location if one is available for \a RegNum, or std::nullopt
-  /// otherwise.
-  std::optional<UnwindLocation> getRegisterLocation(uint32_t RegNum) const {
-    auto Pos = Locations.find(RegNum);
-    if (Pos == Locations.end())
-      return std::nullopt;
-    return Pos->second;
-  }
-
-  /// Set the location for the register in \a RegNum to \a Location.
-  ///
-  /// \param RegNum the register number to set the location for.
-  ///
-  /// \param Location the UnwindLocation that describes how to unwind the value.
-  void setRegisterLocation(uint32_t RegNum, const UnwindLocation &Location) {
-    Locations.erase(RegNum);
-    Locations.insert(std::make_pair(RegNum, Location));
-  }
-
-  /// Removes any rule for the register in \a RegNum.
-  ///
-  /// \param RegNum the register number to remove the location for.
-  void removeRegisterLocation(uint32_t RegNum) { Locations.erase(RegNum); }
-
-  /// Dump all registers + locations that are currently defined in this object.
-  ///
-  /// \param OS the stream to use for output.
-  ///
-  /// \param MRI register information that helps emit register names insteead
-  /// of raw register numbers.
-  ///
-  /// \param IsEH true if the DWARF Call Frame Information is from .eh_frame
-  /// instead of from .debug_frame. This is needed for register number
-  /// conversion because some register numbers differ between the two sections
-  /// for certain architectures like x86.
-  void dump(raw_ostream &OS, DIDumpOptions DumpOpts) const;
-
-  /// Returns true if we have any register locations in this object.
-  bool hasLocations() const { return !Locations.empty(); }
-
-  size_t size() const { return Locations.size(); }
-
-  bool operator==(const RegisterLocations &RHS) const {
-    return Locations == RHS.Locations;
-  }
-};
+/// \returns An error if the DWARF Call Frame Information opcodes have state
+/// machine errors, or a valid UnwindTable otherwise.
+Expected<UnwindTable> createUnwindTable(const CIE *Cie);
 
-raw_ostream &operator<<(raw_ostream &OS, const RegisterLocations &RL);
+class FDE;
 
-/// A class that represents a single row in the unwind table that is decoded by
-/// parsing the DWARF Call Frame Information opcodes.
+/// Create an UnwindTable from a Frame Descriptor Entry (FDE).
 ///
-/// The row consists of an optional address, the rule to unwind the CFA and all
-/// rules to unwind any registers. If the address doesn't have a value, this
-/// row represents the initial instructions for a CIE. If the address has a
-/// value the UnwindRow represents a row in the UnwindTable for a FDE. The
-/// address is the first address for which the CFA location and register rules
-/// are valid within a function.
+/// \param Fde The Frame Descriptor Entry to extract the table from. The
+/// CFIProgram is retrieved from the \a Fde object and used to create the
+/// UnwindTable.
 ///
-/// UnwindRow objects are created by parsing opcodes in the DWARF Call Frame
-/// Information and UnwindRow objects are lazily populated and pushed onto a
-/// stack in the UnwindTable when evaluating this state machine. Accessors are
-/// needed for the address, CFA value, and register locations as the opcodes
-/// encode a state machine that produces a sorted array of UnwindRow objects
-/// \see UnwindTable.
-class UnwindRow {
-  /// The address will be valid when parsing the instructions in a FDE. If
-  /// invalid, this object represents the initial instructions of a CIE.
-  std::optional<uint64_t> Address; ///< Address for row in FDE, invalid for CIE.
-  UnwindLocation CFAValue;    ///< How to unwind the Call Frame Address (CFA).
-  RegisterLocations RegLocs;  ///< How to unwind all registers in this list.
-
-public:
-  UnwindRow() : CFAValue(UnwindLocation::createUnspecified()) {}
-
-  /// Returns true if the address is valid in this object.
-  bool hasAddress() const { return Address.has_value(); }
-
-  /// Get the address for this row.
-  ///
-  /// Clients should only call this function after verifying it has a valid
-  /// address with a call to \see hasAddress().
-  uint64_t getAddress() const { return *Address; }
-
-  /// Set the address for this UnwindRow.
-  ///
-  /// The address represents the first address for which the CFAValue and
-  /// RegLocs are valid within a function.
-  void setAddress(uint64_t Addr) { Address = Addr; }
-
-  /// Offset the address for this UnwindRow.
-  ///
-  /// The address represents the first address for which the CFAValue and
-  /// RegLocs are valid within a function. Clients must ensure that this object
-  /// already has an address (\see hasAddress()) prior to calling this
-  /// function.
-  void slideAddress(uint64_t Offset) { *Address += Offset; }
-  UnwindLocation &getCFAValue() { return CFAValue; }
-  const UnwindLocation &getCFAValue() const { return CFAValue; }
-  RegisterLocations &getRegisterLocations() { return RegLocs; }
-  const RegisterLocations &getRegisterLocations() const { return RegLocs; }
-
-  /// Dump the UnwindRow to the stream.
-  ///
-  /// \param OS the stream to use for output.
-  ///
-  /// \param MRI register information that helps emit register names insteead
-  /// of raw register numbers.
-  ///
-  /// \param IsEH true if the DWARF Call Frame Information is from .eh_frame
-  /// instead of from .debug_frame. This is needed for register number
-  /// conversion because some register numbers differ between the two sections
-  /// for certain architectures like x86.
-  ///
-  /// \param IndentLevel specify the indent level as an integer. The UnwindRow
-  /// will be output to the stream preceded by 2 * IndentLevel number of spaces.
-  void dump(raw_ostream &OS, DIDumpOptions DumpOpts,
-            unsigned IndentLevel = 0) const;
-};
-
-raw_ostream &operator<<(raw_ostream &OS, const UnwindRow &Row);
-
-class CIE;
-class FDE;
-
-/// A class that contains all UnwindRow objects for an FDE or a single unwind
-/// row for a CIE. To unwind an address the rows, which are sorted by start
-/// address, can be searched to find the UnwindRow with the lowest starting
-/// address that is greater than or equal to the address that is being looked
-/// up.
-class UnwindTable {
-public:
-  using RowContainer = std::vector<UnwindRow>;
-  using iterator = RowContainer::iterator;
-  using const_iterator = RowContainer::const_iterator;
-
-  size_t size() const { return Rows.size(); }
-  iterator begin() { return Rows.begin(); }
-  const_iterator begin() const { return Rows.begin(); }
-  iterator end() { return Rows.end(); }
-  const_iterator end() const { return Rows.end(); }
-  const UnwindRow &operator[](size_t Index) const {
-    assert(Index < size());
-    return Rows[Index];
-  }
-
-  /// Dump the UnwindTable to the stream.
-  ///
-  /// \param OS the stream to use for output.
-  ///
-  /// \param MRI register information that helps emit register names insteead
-  /// of raw register numbers.
-  ///
-  /// \param IsEH true if the DWARF Call Frame Information is from .eh_frame
-  /// instead of from .debug_frame. This is needed for register number
-  /// conversion because some register numbers differ between the two sections
-  /// for certain architectures like x86.
-  ///
-  /// \param IndentLevel specify the indent level as an integer. The UnwindRow
-  /// will be output to the stream preceded by 2 * IndentLevel number of spaces.
-  void dump(raw_ostream &OS, DIDumpOptions DumpOpts,
-            unsigned IndentLevel = 0) const;
-
-  /// Create an UnwindTable from a Common Information Entry (CIE).
-  ///
-  /// \param Cie The Common Information Entry to extract the table from. The
-  /// CFIProgram is retrieved from the \a Cie object and used to create the
-  /// UnwindTable.
-  ///
-  /// \returns An error if the DWARF Call Frame Information opcodes have state
-  /// machine errors, or a valid UnwindTable otherwise.
-  static Expected<UnwindTable> create(const CIE *Cie);
-
-  /// Create an UnwindTable from a Frame Descriptor Entry (FDE).
-  ///
-  /// \param Fde The Frame Descriptor Entry to extract the table from. The
-  /// CFIProgram is retrieved from the \a Fde object and used to create the
-  /// UnwindTable.
-  ///
-  /// \returns An error if the DWARF Call Frame Information opcodes have state
-  /// machine errors, or a valid UnwindTable otherwise.
-  static Expected<UnwindTable> create(const FDE *Fde);
-
-private:
-  RowContainer Rows;
-  /// The end address when data is extracted from a FDE. This value will be
-  /// invalid when a UnwindTable is extracted from a CIE.
-  std::optional<uint64_t> EndAddress;
-
-  /// Parse the information in the CFIProgram and update the CurrRow object
-  /// that the state machine describes.
-  ///
-  /// This is an internal implementation that emulates the state machine
-  /// described in the DWARF Call Frame Information opcodes and will push
-  /// CurrRow onto the Rows container when needed.
-  ///
-  /// \param CFIP the CFI program that contains the opcodes from a CIE or FDE.
-  ///
-  /// \param CurrRow the current row to modify while parsing the state machine.
-  ///
-  /// \param InitialLocs If non-NULL, we are parsing a FDE and this contains
-  /// the initial register locations from the CIE. If NULL, then a CIE's
-  /// opcodes are being parsed and this is not needed. This is used for the
-  /// DW_CFA_restore and DW_CFA_restore_extended opcodes.
-  Error parseRows(const CFIProgram &CFIP, UnwindRow &CurrRow,
-                  const RegisterLocations *InitialLocs);
-};
-
-raw_ostream &operator<<(raw_ostream &OS, const UnwindTable &Rows);
+/// \returns An error if the DWARF Call Frame Information opcodes have state
+/// machine errors, or a valid UnwindTable otherwise.
+Expected<UnwindTable> createUnwindTable(const FDE *Fde);
 
 /// An entry in either debug_frame or eh_frame. This entry can be a CIE or an
 /// FDE.
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFUnwindTable.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFUnwindTable.h
new file mode 100644
index 0000000000000..7a36b1630e7e1
--- /dev/null
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFUnwindTable.h
@@ -0,0 +1,377 @@
+//===- DWARFUnwindTable.h ----------------------------------------*- C++-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_DEBUGINFO_DWARF_DWARFUNWINDTABLE_H
+#define LLVM_DEBUGINFO_DWARF_DWARFUNWINDTABLE_H
+
+#include "llvm/DebugInfo/DWARF/DWARFCFIProgram.h"
+#include "llvm/DebugInfo/DWARF/DWARFExpression.h"
+#include "llvm/Support/Error.h"
+#include <map>
+#include <vector>
+
+namespace llvm {
+
+namespace dwarf {
+constexpr uint32_t InvalidRegisterNumber = UINT32_MAX;
+
+/// A class that represents a location for the Call Frame Address (CFA) or a
+/// register. This is decoded from the DWARF Call Frame Information
+/// instructions and put into an UnwindRow.
+class UnwindLocation {
+public:
+  enum Location {
+    /// Not specified.
+    Unspecified,
+    /// Register is not available and can't be recovered.
+    Undefined,
+    /// Register value is in the register, nothing needs to be done to unwind
+    /// it:
+    ///   reg = reg
+    Same,
+    /// Register is in or at the CFA plus an offset:
+    ///   reg = CFA + offset
+    ///   reg = defef(CFA + offset)
+    CFAPlusOffset,
+    /// Register or CFA is in or at a register plus offset, optionally in
+    /// an address space:
+    ///   reg = reg + offset [in addrspace]
+    ///   reg = deref(reg + offset [in addrspace])
+    RegPlusOffset,
+    /// Register or CFA value is in or at a value found by evaluating a DWARF
+    /// expression:
+    ///   reg = eval(dwarf_expr)
+    ///   reg = deref(eval(dwarf_expr))
+    DWARFExpr,
+    /// Value is a constant value contained in "Offset":
+    ///   reg = Offset
+    Constant,
+  };
+
+private:
+  Location Kind;   /// The type of the location that describes ho...
[truncated]
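
The `RegisterLocations` comment in the diff says the register map is wrapped in a class so that all locations can be copied when handling `DW_CFA_remember_state` and `DW_CFA_restore_state`. A minimal sketch of that copy-and-restore idea follows. `RuleState` and its members are hypothetical names, a rule is reduced to a string instead of an `UnwindLocation`, and the CFA rule is left out of the saved state for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in: a rule is just a string here, where
// the class in the diff stores an UnwindLocation per register number.
using RegisterRules = std::map<uint32_t, std::string>;

class RuleState {
public:
  // Set or overwrite the rule for one register.
  void setRule(uint32_t Reg, std::string Rule) {
    Current[Reg] = std::move(Rule);
  }

  // Models DW_CFA_remember_state: push a copy of every current rule.
  void rememberState() { Saved.push_back(Current); }

  // Models DW_CFA_restore_state: replace the current rules with the most
  // recently saved copy. Returns false if nothing was saved.
  bool restoreState() {
    if (Saved.empty())
      return false;
    Current = std::move(Saved.back());
    Saved.pop_back();
    return true;
  }

  const RegisterRules &rules() const { return Current; }

private:
  RegisterRules Current;
  std::vector<RegisterRules> Saved;
};
```

Because the whole map is one value, saving and restoring is a plain copy, with no per-register bookkeeping.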

@amsen20 amsen20 changed the title from "Unwind table separation" to "[NFC] Separate UnwindTable from DebugFrame into a different type" Jun 3, 2025
@petrhosek petrhosek requested review from dwblaikie and ilovepi June 4, 2025 18:47
@dwblaikie
Collaborator

@Sterling-Augustine is this part of your DWARF expression parsing work? Or coincidentally overlapping with it?

@amsen20 where do you plan to put this code once it is refactored? Into MC itself, elsewhere, some new library?

Contributor

@ilovepi ilovepi left a comment

As in the other review, the C++ refactoring looks fine since it's mostly mechanical, but this change is a decision code owners need to weigh in on. A fuller description of the change, its rationale, and the intended future work would go a long way toward making the case for the refactor.

4 participants