Description
While reviewing the trace functionality, I encountered a segmentation fault triggered by the following command:
./z3 -tr:reinit_clauses_bug examples/SMT-LIB2/bounded model checking/loop_unrolling.smt2
After minimizing the SMT-LIB input, I confirmed that the issue consistently occurs when a (push)
command is present:
(declare-const x Int)
(push)
(assert (= x 1))
(check-sat)
(pop)
(assert (= x 0))
(check-sat)
The segmentation fault occurs at the following code location:
Line 507 in 2714dc2
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x0000000100108fdc z3`ast::get_id(this=0x0000000000000000) const at ast.h:507:38
502 m_mark1_owner = 0;
503 m_mark2_owner = 0;
504 });
505 }
506 public:
-> 507 unsigned get_id() const { return m_id; }
508 unsigned get_ref_count() const { return m_ref_count; }
509 ast_kind get_kind() const { return static_cast<ast_kind>(m_kind); }
510 unsigned hash() const { return m_hash; }
511
512 #ifdef Z3DEBUG
Target 0: (z3) stopped.
The segmentation fault occurs because the code attempts to call get_id
on a nullptr
element from the bool_var2expr_map
.
This happens at the following line:
Lines 53 to 63 in 2714dc2
At this point, the index into bool_var2expr_map
is derived from the size of m_watches
.
Lines 139 to 156 in 2714dc2
Further investigation revealed that entries in bool_var2expr_map
are reset to nullptr when the solver backtracks after a (push)
, as part of the cleanup process for variables introduced after the (push)
.
This happens here:
z3/src/smt/smt_internalizer.cpp
Lines 961 to 977 in 2714dc2
However, while bool_var
entries — such as those in bool_var2expr_map
— are cleared appropriately, the associated watch list (m_watches
) is not updated accordingly.
As a result, the system later attempts to access bool_var2expr_map
using dangling indices derived from m_watches
, which may now point to nullptr
entries — ultimately causing a segmentation fault when get_id()
is called on a null pointer.
I believe simply adding a null check to suppress the crash during logging may not be the ideal solution.
Could you provide any guidance on how this issue might be addressed more appropriately?
I’m interested in fixing this issue. If there’s any context or related code I should look at first, I’d appreciate the guidance.