Skip to content

Commit 15b1d7d

Browse files
[lldb][NFCI] Rewrite UnwindAssemblyInstEmulation in terms of a CFG visit (llvm#169630)
Currently, UnwindAssemblyInstEmulation visits instructions in the order in which they appear in a function. This commit makes an NFCI change to UnwindAssemblyInstEmulation so that it follows the function's CFG: 1. The first instruction is enqueued. 2. While the queue is not empty: 2.1 Visit the instruction in the *back* queue to compute the new unwind state. 2.2 Push(+) the next instruction to the *back* of the queue. 2.3 If the instruction is a forward branch with a known branch target, push(+) the destination instruction to the *front* of the queue. (+) Only push if this instruction hasn't been enqueued before. (+) When pushing an instruction, the current unwind state is attached to it. Note that: * the "next instruction" is pushed to the *back* of the queue, * a branch target is pushed to the *front* of the queue, and * we always dequeue from the *back* of the queue. This means that consecutive instructions are visited one after the other; this is important to support "conditional blocks" [1] of instructions (see the line with "if last_condition != new_condition"). This is arguably a very Thumb specific thing, so maybe it shouldn't be in the generic algorithm; that said, it is already in the code, so we have to support it. The main reason this patch is NFCI and not NFC is that, now, the destination of a forward branch is visited in a slightly different moment than before. This should not cause any changes in output, as if a branch destination is reachable through two different paths, any well behaved compiler will generate the same unwind state in both paths. The motivation for this patch is to change step 2.2 so that it _only_ pushes the next instruction if the current instruction is not an unconditional branch / return, and to change step 2.3 so that backwards branches are also allowed, fixing the bug described by [2]. [1]: https://developer.arm.com/documentation/dui0473/m/arm-and-thumb-instructions/it [2]: llvm#168398 Part of a sequence of PRs: [lldb][NFCI] Rewrite UnwindAssemblyInstEmulation in terms of a CFG visit llvm#169630 [lldb][NFC] Rename forward_branch_offset to branch_offset in UnwindAssemblyInstEmulation llvm#169631 [lldb] Add DisassemblerLLVMC::IsBarrier API llvm#169632 [lldb] Handle backwards branches in UnwindAssemblyInstEmulation llvm#169633 commit-id:dce6b515 (cherry picked from commit 5a32fd3)
1 parent cf476a9 commit 15b1d7d

File tree

1 file changed

+38
-14
lines changed

1 file changed

+38
-14
lines changed

lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp

Lines changed: 38 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,8 @@
2525
#include "lldb/Utility/Log.h"
2626
#include "lldb/Utility/Status.h"
2727
#include "lldb/Utility/StreamString.h"
28+
#include "llvm/ADT/SmallSet.h"
29+
#include <deque>
2830

2931
using namespace lldb;
3032
using namespace lldb_private;
@@ -150,29 +152,38 @@ bool UnwindAssemblyInstEmulation::GetNonCallSiteUnwindPlanFromAssembly(
150152
EmulateInstruction::InstructionCondition last_condition =
151153
EmulateInstruction::UnconditionalCondition;
152154

153-
for (const InstructionSP &inst : inst_list.Instructions()) {
154-
if (!inst)
155-
continue;
156-
DumpInstToLog(log, *inst, inst_list);
155+
std::deque<std::size_t> to_visit = {0};
156+
llvm::SmallSet<std::size_t, 0> enqueued = {0};
157+
158+
// Instructions reachable through jumps are inserted on the front.
159+
// The next instruction is inserted on the back.
160+
// Pop from the back to ensure non-branching instructions are visited
161+
// sequentially.
162+
while (!to_visit.empty()) {
163+
const std::size_t current_index = to_visit.back();
164+
Instruction &inst = *inst_list.GetInstructionAtIndex(current_index);
165+
to_visit.pop_back();
166+
DumpInstToLog(log, inst, inst_list);
157167

158168
m_curr_row_modified = false;
159169
m_forward_branch_offset = 0;
160170

161171
lldb::addr_t current_offset =
162-
inst->GetAddress().GetFileAddress() - base_addr;
172+
inst.GetAddress().GetFileAddress() - base_addr;
163173
auto it = saved_unwind_states.upper_bound(current_offset);
164174
assert(it != saved_unwind_states.begin() &&
165175
"Unwind row for the function entry missing");
166176
--it; // Move it to the row corresponding to the current offset
167177

168-
// If the offset of m_state.row doesn't match with the offset we see in
169-
// saved_unwind_states then we have to update current unwind state to
170-
// the saved values. It is happening after we processed an epilogue and a
171-
// return to caller instruction.
178+
// When state is forwarded through a branch, the offset of m_state.row is
179+
// different from the offset available in saved_unwind_states. Use the
180+
// forwarded state in this case, as the previous instruction may have been
181+
// an unconditional jump.
182+
// FIXME: this assignment can always be done unconditionally.
172183
if (it->second.row.GetOffset() != m_state.row.GetOffset())
173184
m_state = it->second;
174185

175-
m_inst_emulator_up->SetInstruction(inst->GetOpcode(), inst->GetAddress(),
186+
m_inst_emulator_up->SetInstruction(inst.GetOpcode(), inst.GetAddress(),
176187
nullptr);
177188
const EmulateInstruction::InstructionCondition new_condition =
178189
m_inst_emulator_up->GetInstructionCondition();
@@ -199,26 +210,39 @@ bool UnwindAssemblyInstEmulation::GetNonCallSiteUnwindPlanFromAssembly(
199210

200211
// If the current instruction is a branch forward then save the current
201212
// CFI information for the offset where we are branching.
213+
Address branch_address = inst.GetAddress();
214+
branch_address.Slide(m_forward_branch_offset);
202215
if (m_forward_branch_offset != 0 &&
203-
range.ContainsFileAddress(inst->GetAddress().GetFileAddress() +
204-
m_forward_branch_offset)) {
216+
range.ContainsFileAddress(branch_address.GetFileAddress())) {
205217
if (auto [it, inserted] = saved_unwind_states.emplace(
206218
current_offset + m_forward_branch_offset, m_state);
207-
inserted)
219+
inserted) {
208220
it->second.row.SetOffset(current_offset + m_forward_branch_offset);
221+
if (std::size_t dest_instr_index =
222+
inst_list.GetIndexOfInstructionAtAddress(branch_address);
223+
dest_instr_index < inst_list.GetSize()) {
224+
to_visit.push_front(dest_instr_index);
225+
enqueued.insert(dest_instr_index);
226+
}
227+
}
209228
}
210229

211230
// Were there any changes to the CFI while evaluating this instruction?
212231
if (m_curr_row_modified) {
213232
// Save the modified row if we don't already have a CFI row in the
214233
// current address
215234
const lldb::addr_t next_inst_offset =
216-
current_offset + inst->GetOpcode().GetByteSize();
235+
current_offset + inst.GetOpcode().GetByteSize();
217236
if (saved_unwind_states.count(next_inst_offset) == 0) {
218237
m_state.row.SetOffset(next_inst_offset);
219238
saved_unwind_states.emplace(next_inst_offset, m_state);
220239
}
221240
}
241+
242+
const size_t next_idx = current_index + 1;
243+
const bool never_enqueued = enqueued.insert(next_idx).second;
244+
if (never_enqueued && next_idx < inst_list.GetSize())
245+
to_visit.push_back(next_idx);
222246
}
223247

224248
for (auto &[_, state] : saved_unwind_states)

0 commit comments

Comments
 (0)