Description
When trying to debug some Python code that used z3 (for solving logic puzzles), I found that the set of clauses in problem_orig_python.txt has unusual behavior; when added to a solver which is then checked, it returns sat and gives a model, but the model fails to satisfy one of the clauses. In particular, there is a clause:
Or(And(0_11_N, 0_11_E, Not(0_11_S), Not(0_11_W)),
And(0_11_E, 0_11_S, Not(0_11_W), Not(0_11_N)),
And(0_11_S, 0_11_W, Not(0_11_N), Not(0_11_E)),
And(0_11_W, 0_11_N, Not(0_11_E), Not(0_11_S)),
And(Not(0_11_N), Not(0_11_E), Not(0_11_S), Not(0_11_W)))
And in the model, all four of 0_11_N, 0_11_E, 0_11_S, 0_11_W are true. (For what it's worth, there's a solution satisfying all the clauses in which all four of these are false. Adding an additional constraint setting any of them to false gives this solution for me.)
(In case it helps in debugging, all the variables in this example are booleans and (I think) none of the clauses are far more complicated than the violated clause.)
Since this example has almost five thousand clauses, I tried to reduce it, and was able to get an example with 814 clauses, in problem_reduced_python.txt. Here the details are different but the general issue is the same; there are two violated clauses:
[Or(And(8_0_N, 8_0_E, Not(8_0_S), Not(8_0_W)),
And(8_0_E, 8_0_S, Not(8_0_W), Not(8_0_N)),
And(8_0_S, 8_0_W, Not(8_0_N), Not(8_0_E)),
And(8_0_W, 8_0_N, Not(8_0_E), Not(8_0_S)),
And(Not(8_0_N), Not(8_0_E), Not(8_0_S), Not(8_0_W))), Or(Not(8_0_W), 8_0_N, Not(8_-1_S))]
8_0_W and 8_-1_S are true; 8_0_N, 8_0_E, and 8_0_S are false.
In case it helps, I was able to (via the code in the code block at the bottom; note that this code uses eval, so you probably don't want to run it yourself) transform these examples into SMT2 (attached as text files, because GitHub wouldn't let me attach them as smt2). The minus signs in some variable names caused errors, so I renamed the variables. The equivalent issues are, for the original version:
(assert
(or (and AEA AEB (not AED) (not AEC)) (and AEB AED (not AEC) (not AEA)) (and AED AEC (not AEA) (not AEB)) (and AEC AEA (not AEB) (not AED)) (and (not AEA) (not AEB) (not AED) (not AEC))))
All four of AEA, AEB, AEC, AED are true in the model, which violates this constraint.
And for the reduced version:
(assert
(or (and ATO ATP (not ATR) (not ATQ)) (and ATP ATR (not ATQ) (not ATO)) (and ATR ATQ (not ATO) (not ATP)) (and ATQ ATO (not ATP) (not ATR)) (and (not ATO) (not ATP) (not ATR) (not ATQ))))
(assert
(let (($x2292 (not ATN)))
(let (($x1418 (not ATQ)))
(or $x1418 ATO $x2292))))
ATN and ATQ are true in the model; ATO, ATP, and ATR are false in the model.
Python version: Python 3.13.3
Z3 version: Z3 version 4.15.0 - 64 bit
OS: MacOS, Sonoma 14.5
Let me know if there's any other information I can provide that would help in tracking this down (or if "z3 sometimes fails on inputs this large" is a known wontfix issue).
PS: SMT2 transformation code:
from z3 import Bool, And, Or, Not, Solver, is_true
import re
alpha = ''.join(chr(i) for i in range(65, 91))
d = dict(zip([f'\'{i}_{j}_{k}\'' for i in range(-1, 13) for j in range(-1, 13) for k in 'NEWS'], ['\'' + i + j + k + '\'' for i in alpha for j in alpha for k in alpha]))
def rep(x):
return re.sub('\'[-0-9]+_[-0-9]+_[NEWS]\'', lambda x: d[x.group(0)], x)
def write_z3(f1, f2):
with open(f1) as f:
clauses = eval(rep(f.read()))
print(len(clauses))
s = Solver()
s.add(clauses)
smt2_code = s.to_smt2()
with open(f2, 'w') as f:
f.write(smt2_code + '\n(get-model)')
assert str(s.check()) == 'sat'
print([i for i in clauses if not is_true(s.model().evaluate(i))])
# write_z3('problem_orig_python.txt', 'problem_orig_z3.smt2')
write_z3('problem_reduced_python.txt', 'problem_reduced_z3.smt2')