Skip to content

[RFC] CPROVER should use a reproducible resource limit rather than a time-out #4452

Open
@martin-cs

Description

@martin-cs

[ Prompted by https://github.com//pull/4449 ]

The SMT-LIB standard ( http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-18.pdf Page 57 ) describes a reproducible resource limit option for solvers :

reproducible-resource-limit
default: 0
support: optional
If the solver supports this option, setting it to 0 disables it. Setting it a non-zero numeral n
will cause each subsequent check command to terminate within a bounded amount
of  time  dependent  on n.   The  internal  implementation  of  this  option  and  its  relation
to run  time or other  concrete resources can  be solver-specific.  However,  it  is required
that the invocation of a check command return unknown whenever the solver is unable
to determine the satisfiability of the formulas in the current context within the current
resource  limit.   Setting  a  higher  value  of n should  allow  more  resources  to  be  used,
which may cause the command to return sat or unsat instead of unknown. Furthermore,
the returned result should depend deterministically on n;  specifically, it should be the
same every time the solver is run with the same sequence of previous commands on the
same machine (and with an arbitrarily long external time out).  If the solver makes use
of randomization, it may require the :random-seed option to be set to a value other than
0 before :reproducible-resource-limit can be set to a positive value.

Note  that  this  option  is  not  intended  to  be  used  for  comparisons  between  different  solvers  since  they  can implement  it  differently.   Its  purpose  is  simply  to  guarantee  the  reproducibility  of  an  individual  solver’s results under the same external conditions.

This can be implemented as a simple count on learnt clauses, clause database cleans, etc. The relationship between a number and an actual run time will depend on the input, solver, hardware, etc. but should always be the same (up to the limits of hardware determinism).

I think we should use this for external solvers and implement something similar for the internal one because:

  1. It is deterministic. For many applications having analysis tools return different results for the same problem is unacceptable. It means we can reliably test time-out behaviour as well.

  2. If implemented as a number of counters / counts at multiple stages during the process, we should get finer-grained performance information and be able to more reliably track changes in performance without the normal noise.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions