forked from llvm/circt
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[SMT] Add Dialect rationale and boilerplate
- Loading branch information
Showing
21 changed files
with
490 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,167 @@ | ||
# SMT Dialect | ||
|
||
This dialect provides types and operations modeling the SMT (Satisfiability | ||
Modulo Theories) operations and datatypes commonly found in SMT-LIB and SMT | ||
solvers. | ||
|
||
[TOC] | ||
|
||
## Rationale | ||
|
||
This dialect aims to provide a unified interface for expressing SMT problems | ||
directly within MLIR, enabling seamless integration with other MLIR dialects and | ||
optimization passes. It models the [SMT-LIB standard | ||
2.6](https://smtlib.cs.uiowa.edu/), but may also include features of commonly | ||
used SMT solvers that are not part of the standard. In particular, the IR | ||
constructs are designed to enable more interactive communication with the solver | ||
(allowing if-statements, etc. to react on solver feedback). | ||
|
||
The SMT dialect is motivated by the following advantages over directly printing | ||
SMT-LIB or exporting to a solver: | ||
* Reuse MLIR's infrastructure: passes and pass managers to select different SMT | ||
encodings, operation builders and rewrite patterns to build SMT formulae (and | ||
hide the solver's API in a provided lowering pass), common textual format to | ||
share rather than dumping the solver's state, etc. | ||
* No need to add a link-dependency on SMT solvers to CIRCT (just provide the | ||
path to the library as an argument to the JIT runner or manually link the | ||
produced binary against it). | ||
* Using an SMT dialect as intermediary allows it to be mixed with concrete | ||
computations, the debug dialect, etc. This is complicated to achieve and | ||
reason about when building the external solver's state directly. | ||
* Enable easy addition of new backends | ||
* Have a common representation and infrastructure for all SMT related efforts, | ||
such that people don't have to build their own isolated tools. | ||
|
||
The dialect follows these design principles: | ||
* **Solver-independent**: don't model one particular solver's API | ||
* **Seemless interoperability** with other dialects. E.g., to allow using the | ||
debug dialect to back-propagate counter examples | ||
* **Small surface for errors**: try to keep the SMT dialect and its lowerings | ||
simple to avoid mistakes, implement optimizations defensively or prove them | ||
formally; since higher-level dialects will be lowered through the SMT dialect | ||
to construct formal proofs it is essential that this dialect does not | ||
introduce bugs | ||
* **Interactive**: the IR should be designed such that it can be interleaved | ||
with operations (from other dialects) that take the current feedback of the | ||
solver to steer the execution of further SMT operations. It shouldn't just | ||
model the very rigid SMT-LIB. | ||
* Don't heavily integrate the dialect itself with CIRCT to make potential | ||
upstreaming easy | ||
|
||
|
||
## Dialect Structure | ||
|
||
The SMT dialect is structured into multiple "sub-dialects", one for each of the | ||
following theories (this separation is also made clear in the prefix of | ||
operation and type names as indicated in parentheses): | ||
* Core boolean logic including quantifiers and solver interaction (`smt.*`) | ||
* Bit-vectors (`smt.bv.*`) | ||
* Arbitrary-precision integers (`smt.int.*`) | ||
* Arrays (`smt.array.*`) | ||
* Floating-point numbers (`smt.real.*`) | ||
|
||
Several operations in the core part (e.g., quantifiers, equality, etc.) allow | ||
operands of any SMT type (including bit-vectors, arrays, etc.). Therefore, all | ||
type and attribute declarations are part of the core. | ||
|
||
Certain arithmetic, bitwise, and comparison operations exist for multiple | ||
theories. For example, there exists an AND operation for booleans and one for | ||
bit-vectors, or there exists and ADD operation for integers and bit-vectors. In | ||
such cases, each "sub-dialect" defines its own operation specific to its | ||
datatypes. This simplifies the operations such that an optimization or | ||
conversion pass only using bit-vectors does not have to take care of other | ||
potentially supported datatypes. | ||
|
||
## Optimizations | ||
|
||
The primary purpose of the SMT dialect is not to optimize SMT formulae. However, | ||
SMT solvers can exhibit significant differences in runtime, even with slight | ||
changes in input. Improving solver performance by implementing rewrite patterns | ||
that slightly restructure the SMT formulae may be possible. | ||
|
||
Moreover, SMT solvers may differ in terms of built-in operators. If a solver | ||
lacks support for advanced operators, the problem can be simplified before | ||
passing it to the solver. | ||
|
||
## Backends | ||
|
||
Having an SMT dialect instead of directly interpreting the IR and building an | ||
SMT expression enables multiple different backends to be used in addition to the | ||
application of SMT dialect level optimizations that rewrite the formulae for | ||
faster and more predictable runtime performance of the solver backend (e.g., | ||
Z3). | ||
|
||
In the following, we outline the different backend lowerings and their | ||
advantages and disadvantages. | ||
|
||
### LLVM IR | ||
|
||
Lowering to LLVM IR that calls the C API of a given SMT solver is practical for | ||
a few reasons: | ||
* enables using LLVM JIT or compilation to a standalone binary | ||
* easy to mix with Debug dialect to report back nice counter examples | ||
* allows mixing concrete and symbolic executions (e.g., for dynamic BMC upper | ||
bounds, or more dynamic interaction with the solver such as extraction of | ||
multiple/all possible models) | ||
|
||
However, it is solver-dependent and more complicated to implement than an | ||
SMT-LIB printer. | ||
|
||
### SMT-LIB | ||
|
||
The SMT-LIB format is a standardized format supported by practically all SMT | ||
solvers and thus an ideal target to support as many solver backends as possible. | ||
However, | ||
* this format is quite static and does not allow easy interaction with the | ||
solver, in particular, it is not easily possible to make future asserts | ||
dependent on current solver outputs, | ||
* providing good counter-examples to the user would mean parsing the textual | ||
model output of the solver and mapping it to an CIRCT-internal datastructure. | ||
* it is impossible to mix symbolic and concrete executions, as well as debug | ||
constructs (see Debug Dialect). | ||
* it is impossible to just use the LLVM JIT compiler to directly get a result, | ||
but instead the external solver has to be called directly, either by adding a | ||
compile-time dependency, or using a shell call. | ||
|
||
### C/C++ | ||
|
||
A C/C++ exporter that produces code which calls the C/C++ API of a given solver | ||
could allow for easier debugging and allows to support solvers without C API | ||
without the restrictions of SMT-LIB. However, this means the JIT compilation | ||
flow would not be available. | ||
|
||
## Handling counter-examples | ||
|
||
SMT solvers check formulae for satisfiability. Typically, there are three kinds | ||
of output a solver may give: | ||
* **Satisfiable**: In this case a model (counter-example) can be provided by the | ||
solver. However, it may not be feasible to evaluate/interpret this model for a | ||
given SMT constant to get a constant value. This is, in particular, the case | ||
when the SMT encoding contains quantifiers which can lead to the model | ||
containing quantifiers as well. Solvers (e.g., Z3) usually don't evaluate | ||
quantifiers in models (even if they are closed). If constants can be | ||
evaluated, a counter example can be provided and back-propagated to the | ||
source-code, e.g., using the debug dialect. | ||
* **Unsatisfiable**: formal verification problems are typically encoded such | ||
that this output indicates correctness; a proof can be provided by the solver, | ||
but is often not needed | ||
* **Unknown**: there can be various reasons why the result is unknown; a common | ||
one is the use of symbolic functions to represent operators and encode, e.g., | ||
a LEC problem as a rewrite task of patterns of function applications. This is | ||
a frequent application and the unknown result is just treated like a | ||
satisfiable result without counter example. | ||
|
||
## Non-Goals | ||
|
||
* The SMT Dialect does not aim to include any operations or types that model | ||
verification constructs not specific to SMT, i.e., things that could also be | ||
lowered to other kind of verification systems such as inductive theorem | ||
provers (e.g., Lean4). | ||
|
||
## Operations | ||
|
||
[include "Dialects/SMTOps.md"] | ||
|
||
## Types | ||
|
||
[include "Dialects/SMTTypes.md"] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
# Formal Verification Tooling | ||
|
||
Formally verifying hardware designs is a crucial step during the development | ||
process. Various techniques exist, such as logical equivalence checking, model | ||
checking, symbolic execution, etc. The preferred technique depends on the level | ||
of abstraction, the kind of properties to be verified, runtime limitations, etc. | ||
As a hardware compiler collection, CIRCT provides infrastructure to implement | ||
formal verification tooling and already comes with a few tools for common | ||
use-cases. This document provides an introduction to those tools and gives and | ||
overview over the underlying infrastructure for compiler engineers who want to | ||
use CIRCT to implement their custom verification tool. | ||
|
||
[TOC] | ||
|
||
## Logical Equivalence Checking | ||
|
||
The `circt-lec` tool takes one or two MLIR input files with operations of the | ||
HW, Comb, and Seq dialects and the names of two modules to check for | ||
equivalence. To get to know the exact CLI command type `circt-lec --help`. | ||
|
||
For example, the below IR contains two modules `@mulByTwo` and `@add`. Calling | ||
`circt-lec input.mlir -c1=mulByTwo -c2=add` will output `c1 == c2` since they | ||
are semantically equivalent. | ||
|
||
```mlir | ||
// input.mlir | ||
hw.module @mulByTwo(in %in: i32, out out: i32) { | ||
dbg.variable "in", %in : i32 | ||
%two = hw.constant 2 : i32 | ||
dbg.variable "two", %two : i32 | ||
%res = comb.mul %in, %two : i32 | ||
hw.output %res : i32 | ||
} | ||
hw.module @add(in %in: i32, out out: i32) { | ||
dbg.variable "in", %in : i32 | ||
%res = comb.add %in, %in : i32 | ||
hw.output %res : i32 | ||
} | ||
``` | ||
|
||
In the above example, the MLIR file is compiled to LLVM IR which is then passed | ||
to the LLVM JIT compiler to directly output the LEC result. Alternatively, there | ||
are command line flags to print the LEC problem in SMT-LIB, LLVM IR, or an | ||
object file that can be linked against the Z3 SMT solver to produce a standalone | ||
binary. | ||
|
||
## Bounded Model Checking | ||
|
||
TODO | ||
|
||
## Infrastructure Overview | ||
|
||
This section provides and overview over the relevant dialects and passes for | ||
building formal verification tools based on CIRCT. The | ||
[below diagram](/includes/img/smt_based_formal_verification_infra.svg) provides | ||
a visual overview. | ||
|
||
<p align="center"><img src="/includes/img/smt_based_formal_verification_infra.svg"/></p> |
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
add_circt_dialect(SMT smt) | ||
add_circt_doc(SMTOps Dialects/SMTOps -gen-op-doc) | ||
add_circt_doc(SMTTypes Dialects/SMTTypes -gen-typedef-doc -dialect smt) | ||
|
||
set(LLVM_TARGET_DEFINITIONS SMT.td) | ||
mlir_tablegen(SMTEnums.h.inc -gen-enum-decls) | ||
mlir_tablegen(SMTEnums.cpp.inc -gen-enum-defs) | ||
add_public_tablegen_target(CIRCTSMTEnumsIncGen) | ||
add_dependencies(circt-headers CIRCTSMTEnumsIncGen) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
//===- SMT.td - SMT dialect definition ---------------------*- tablegen -*-===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#ifndef CIRCT_DIALECT_SMT_SMT_TD | ||
#define CIRCT_DIALECT_SMT_SMT_TD | ||
|
||
include "mlir/IR/OpBase.td" | ||
|
||
include "circt/Dialect/SMT/SMTDialect.td" | ||
include "circt/Dialect/SMT/SMTTypes.td" | ||
include "circt/Dialect/SMT/SMTOps.td" | ||
|
||
#endif // CIRCT_DIALECT_SMT_SMT_TD |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
//===- SMTDialect.h - SMT dialect definition --------------------*- C++ -*-===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#ifndef CIRCT_DIALECT_SMT_SMTDIALECT_H | ||
#define CIRCT_DIALECT_SMT_SMTDIALECT_H | ||
|
||
#include "circt/Support/LLVM.h" | ||
#include "mlir/IR/BuiltinOps.h" | ||
#include "mlir/IR/Dialect.h" | ||
|
||
// Pull in the dialect definition. | ||
#include "circt/Dialect/SMT/SMTDialect.h.inc" | ||
#include "circt/Dialect/SMT/SMTEnums.h.inc" | ||
|
||
#endif // CIRCT_DIALECT_SMT_SMTDIALECT_H |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
//===- SMTDialect.td - SMT dialect definition --------------*- tablegen -*-===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#ifndef CIRCT_DIALECT_SMT_SMTDIALECT_TD | ||
#define CIRCT_DIALECT_SMT_SMTDIALECT_TD | ||
|
||
include "mlir/IR/DialectBase.td" | ||
|
||
def SMTDialect : Dialect { | ||
let name = "smt"; | ||
let summary = "a dialect that models satisfiability modulo theories"; | ||
let cppNamespace = "circt::smt"; | ||
|
||
let useDefaultTypePrinterParser = 1; | ||
|
||
let extraClassDeclaration = [{ | ||
void registerTypes(); | ||
}]; | ||
} | ||
|
||
#endif // CIRCT_DIALECT_SMT_SMTDIALECT_TD |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
//===- SMTOps.h - SMT dialect operations ------------------------*- C++ -*-===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#ifndef CIRCT_DIALECT_SMT_SMTOPS_H | ||
#define CIRCT_DIALECT_SMT_SMTOPS_H | ||
|
||
#include "mlir/IR/OpImplementation.h" | ||
#include "mlir/IR/SymbolTable.h" | ||
#include "mlir/Interfaces/InferTypeOpInterface.h" | ||
#include "mlir/Interfaces/SideEffectInterfaces.h" | ||
|
||
#include "circt/Dialect/SMT/SMTDialect.h" | ||
#include "circt/Dialect/SMT/SMTTypes.h" | ||
|
||
#define GET_OP_CLASSES | ||
#include "circt/Dialect/SMT/SMT.h.inc" | ||
|
||
#endif // CIRCT_DIALECT_SMT_SMTOPS_H |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
//===- SMTOps.td - SMT dialect operations ------------------*- tablegen -*-===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#ifndef CIRCT_DIALECT_SMT_SMTOPS_TD | ||
#define CIRCT_DIALECT_SMT_SMTOPS_TD | ||
|
||
include "circt/Dialect/SMT/SMTDialect.td" | ||
include "circt/Dialect/SMT/SMTTypes.td" | ||
include "mlir/IR/EnumAttr.td" | ||
include "mlir/IR/OpAsmInterface.td" | ||
include "mlir/Interfaces/InferTypeOpInterface.td" | ||
include "mlir/Interfaces/SideEffectInterfaces.td" | ||
|
||
class SMTOp<string mnemonic, list<Trait> traits = []> : | ||
Op<SMTDialect, mnemonic, traits>; | ||
|
||
|
||
#endif // CIRCT_DIALECT_SMT_SMTOPS_TD |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
//===- SMTTypes.h - SMT dialect types ---------------------------*- C++ -*-===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#ifndef CIRCT_DIALECT_SMT_SMTTYPES_H | ||
#define CIRCT_DIALECT_SMT_SMTTYPES_H | ||
|
||
#include "mlir/IR/BuiltinTypes.h" | ||
#include "mlir/IR/Types.h" | ||
|
||
#define GET_TYPEDEF_CLASSES | ||
#include "circt/Dialect/SMT/SMTTypes.h.inc" | ||
|
||
#endif // CIRCT_DIALECT_SMT_SMTTYPES_H |
Oops, something went wrong.