Skip to content

Commit

Permalink
[SMT] Add Dialect rationale and boilerplate
Browse files Browse the repository at this point in the history
  • Loading branch information
maerhart committed Mar 12, 2024
1 parent c12c68d commit 8b4e95d
Show file tree
Hide file tree
Showing 21 changed files with 490 additions and 0 deletions.
167 changes: 167 additions & 0 deletions docs/Dialects/SMT.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,167 @@
# SMT Dialect

This dialect provides types and operations modeling the SMT (Satisfiability
Modulo Theories) operations and datatypes commonly found in SMT-LIB and SMT
solvers.

[TOC]

## Rationale

This dialect aims to provide a unified interface for expressing SMT problems
directly within MLIR, enabling seamless integration with other MLIR dialects and
optimization passes. It models the [SMT-LIB standard
2.6](https://smtlib.cs.uiowa.edu/), but may also include features of commonly
used SMT solvers that are not part of the standard. In particular, the IR
constructs are designed to enable more interactive communication with the solver
(allowing if-statements, etc. to react on solver feedback).

The SMT dialect is motivated by the following advantages over directly printing
SMT-LIB or exporting to a solver:
* Reuse MLIR's infrastructure: passes and pass managers to select different SMT
encodings, operation builders and rewrite patterns to build SMT formulae (and
hide the solver's API in a provided lowering pass), common textual format to
share rather than dumping the solver's state, etc.
* No need to add a link-dependency on SMT solvers to CIRCT (just provide the
path to the library as an argument to the JIT runner or manually link the
produced binary against it).
* Using an SMT dialect as intermediary allows it to be mixed with concrete
computations, the debug dialect, etc. This is complicated to achieve and
reason about when building the external solver's state directly.
* Enable easy addition of new backends
* Have a common representation and infrastructure for all SMT related efforts,
such that people don't have to build their own isolated tools.

The dialect follows these design principles:
* **Solver-independent**: don't model one particular solver's API
* **Seemless interoperability** with other dialects. E.g., to allow using the
debug dialect to back-propagate counter examples
* **Small surface for errors**: try to keep the SMT dialect and its lowerings
simple to avoid mistakes, implement optimizations defensively or prove them
formally; since higher-level dialects will be lowered through the SMT dialect
to construct formal proofs it is essential that this dialect does not
introduce bugs
* **Interactive**: the IR should be designed such that it can be interleaved
with operations (from other dialects) that take the current feedback of the
solver to steer the execution of further SMT operations. It shouldn't just
model the very rigid SMT-LIB.
* Don't heavily integrate the dialect itself with CIRCT to make potential
upstreaming easy


## Dialect Structure

The SMT dialect is structured into multiple "sub-dialects", one for each of the
following theories (this separation is also made clear in the prefix of
operation and type names as indicated in parentheses):
* Core boolean logic including quantifiers and solver interaction (`smt.*`)
* Bit-vectors (`smt.bv.*`)
* Arbitrary-precision integers (`smt.int.*`)
* Arrays (`smt.array.*`)
* Floating-point numbers (`smt.real.*`)

Several operations in the core part (e.g., quantifiers, equality, etc.) allow
operands of any SMT type (including bit-vectors, arrays, etc.). Therefore, all
type and attribute declarations are part of the core.

Certain arithmetic, bitwise, and comparison operations exist for multiple
theories. For example, there exists an AND operation for booleans and one for
bit-vectors, or there exists and ADD operation for integers and bit-vectors. In
such cases, each "sub-dialect" defines its own operation specific to its
datatypes. This simplifies the operations such that an optimization or
conversion pass only using bit-vectors does not have to take care of other
potentially supported datatypes.

## Optimizations

The primary purpose of the SMT dialect is not to optimize SMT formulae. However,
SMT solvers can exhibit significant differences in runtime, even with slight
changes in input. Improving solver performance by implementing rewrite patterns
that slightly restructure the SMT formulae may be possible.

Moreover, SMT solvers may differ in terms of built-in operators. If a solver
lacks support for advanced operators, the problem can be simplified before
passing it to the solver.

## Backends

Having an SMT dialect instead of directly interpreting the IR and building an
SMT expression enables multiple different backends to be used in addition to the
application of SMT dialect level optimizations that rewrite the formulae for
faster and more predictable runtime performance of the solver backend (e.g.,
Z3).

In the following, we outline the different backend lowerings and their
advantages and disadvantages.

### LLVM IR

Lowering to LLVM IR that calls the C API of a given SMT solver is practical for
a few reasons:
* enables using LLVM JIT or compilation to a standalone binary
* easy to mix with Debug dialect to report back nice counter examples
* allows mixing concrete and symbolic executions (e.g., for dynamic BMC upper
bounds, or more dynamic interaction with the solver such as extraction of
multiple/all possible models)

However, it is solver-dependent and more complicated to implement than an
SMT-LIB printer.

### SMT-LIB

The SMT-LIB format is a standardized format supported by practically all SMT
solvers and thus an ideal target to support as many solver backends as possible.
However,
* this format is quite static and does not allow easy interaction with the
solver, in particular, it is not easily possible to make future asserts
dependent on current solver outputs,
* providing good counter-examples to the user would mean parsing the textual
model output of the solver and mapping it to an CIRCT-internal datastructure.
* it is impossible to mix symbolic and concrete executions, as well as debug
constructs (see Debug Dialect).
* it is impossible to just use the LLVM JIT compiler to directly get a result,
but instead the external solver has to be called directly, either by adding a
compile-time dependency, or using a shell call.

### C/C++

A C/C++ exporter that produces code which calls the C/C++ API of a given solver
could allow for easier debugging and allows to support solvers without C API
without the restrictions of SMT-LIB. However, this means the JIT compilation
flow would not be available.

## Handling counter-examples

SMT solvers check formulae for satisfiability. Typically, there are three kinds
of output a solver may give:
* **Satisfiable**: In this case a model (counter-example) can be provided by the
solver. However, it may not be feasible to evaluate/interpret this model for a
given SMT constant to get a constant value. This is, in particular, the case
when the SMT encoding contains quantifiers which can lead to the model
containing quantifiers as well. Solvers (e.g., Z3) usually don't evaluate
quantifiers in models (even if they are closed). If constants can be
evaluated, a counter example can be provided and back-propagated to the
source-code, e.g., using the debug dialect.
* **Unsatisfiable**: formal verification problems are typically encoded such
that this output indicates correctness; a proof can be provided by the solver,
but is often not needed
* **Unknown**: there can be various reasons why the result is unknown; a common
one is the use of symbolic functions to represent operators and encode, e.g.,
a LEC problem as a rewrite task of patterns of function applications. This is
a frequent application and the unknown result is just treated like a
satisfiable result without counter example.

## Non-Goals

* The SMT Dialect does not aim to include any operations or types that model
verification constructs not specific to SMT, i.e., things that could also be
lowered to other kind of verification systems such as inductive theorem
provers (e.g., Lean4).

## Operations

[include "Dialects/SMTOps.md"]

## Types

[include "Dialects/SMTTypes.md"]
58 changes: 58 additions & 0 deletions docs/FormalVerification.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
# Formal Verification Tooling

Formally verifying hardware designs is a crucial step during the development
process. Various techniques exist, such as logical equivalence checking, model
checking, symbolic execution, etc. The preferred technique depends on the level
of abstraction, the kind of properties to be verified, runtime limitations, etc.
As a hardware compiler collection, CIRCT provides infrastructure to implement
formal verification tooling and already comes with a few tools for common
use-cases. This document provides an introduction to those tools and gives and
overview over the underlying infrastructure for compiler engineers who want to
use CIRCT to implement their custom verification tool.

[TOC]

## Logical Equivalence Checking

The `circt-lec` tool takes one or two MLIR input files with operations of the
HW, Comb, and Seq dialects and the names of two modules to check for
equivalence. To get to know the exact CLI command type `circt-lec --help`.

For example, the below IR contains two modules `@mulByTwo` and `@add`. Calling
`circt-lec input.mlir -c1=mulByTwo -c2=add` will output `c1 == c2` since they
are semantically equivalent.

```mlir
// input.mlir
hw.module @mulByTwo(in %in: i32, out out: i32) {
dbg.variable "in", %in : i32
%two = hw.constant 2 : i32
dbg.variable "two", %two : i32
%res = comb.mul %in, %two : i32
hw.output %res : i32
}
hw.module @add(in %in: i32, out out: i32) {
dbg.variable "in", %in : i32
%res = comb.add %in, %in : i32
hw.output %res : i32
}
```

In the above example, the MLIR file is compiled to LLVM IR which is then passed
to the LLVM JIT compiler to directly output the LEC result. Alternatively, there
are command line flags to print the LEC problem in SMT-LIB, LLVM IR, or an
object file that can be linked against the Z3 SMT solver to produce a standalone
binary.

## Bounded Model Checking

TODO

## Infrastructure Overview

This section provides and overview over the relevant dialects and passes for
building formal verification tools based on CIRCT. The
[below diagram](/includes/img/smt_based_formal_verification_infra.svg) provides
a visual overview.

<p align="center"><img src="/includes/img/smt_based_formal_verification_infra.svg"/></p>
1 change: 1 addition & 0 deletions docs/includes/img/smt_based_formal_verification_infa.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
1 change: 1 addition & 0 deletions include/circt/Dialect/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ add_subdirectory(Pipeline)
add_subdirectory(Ibis)
add_subdirectory(Seq)
add_subdirectory(Sim)
add_subdirectory(SMT)
add_subdirectory(SSP)
add_subdirectory(SV)
add_subdirectory(SystemC)
Expand Down
9 changes: 9 additions & 0 deletions include/circt/Dialect/SMT/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
add_circt_dialect(SMT smt)
add_circt_doc(SMTOps Dialects/SMTOps -gen-op-doc)
add_circt_doc(SMTTypes Dialects/SMTTypes -gen-typedef-doc -dialect smt)

set(LLVM_TARGET_DEFINITIONS SMT.td)
mlir_tablegen(SMTEnums.h.inc -gen-enum-decls)
mlir_tablegen(SMTEnums.cpp.inc -gen-enum-defs)
add_public_tablegen_target(CIRCTSMTEnumsIncGen)
add_dependencies(circt-headers CIRCTSMTEnumsIncGen)
18 changes: 18 additions & 0 deletions include/circt/Dialect/SMT/SMT.td
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
//===- SMT.td - SMT dialect definition ---------------------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef CIRCT_DIALECT_SMT_SMT_TD
#define CIRCT_DIALECT_SMT_SMT_TD

include "mlir/IR/OpBase.td"

include "circt/Dialect/SMT/SMTDialect.td"
include "circt/Dialect/SMT/SMTTypes.td"
include "circt/Dialect/SMT/SMTOps.td"

#endif // CIRCT_DIALECT_SMT_SMT_TD
20 changes: 20 additions & 0 deletions include/circt/Dialect/SMT/SMTDialect.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
//===- SMTDialect.h - SMT dialect definition --------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef CIRCT_DIALECT_SMT_SMTDIALECT_H
#define CIRCT_DIALECT_SMT_SMTDIALECT_H

#include "circt/Support/LLVM.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/IR/Dialect.h"

// Pull in the dialect definition.
#include "circt/Dialect/SMT/SMTDialect.h.inc"
#include "circt/Dialect/SMT/SMTEnums.h.inc"

#endif // CIRCT_DIALECT_SMT_SMTDIALECT_H
26 changes: 26 additions & 0 deletions include/circt/Dialect/SMT/SMTDialect.td
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
//===- SMTDialect.td - SMT dialect definition --------------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef CIRCT_DIALECT_SMT_SMTDIALECT_TD
#define CIRCT_DIALECT_SMT_SMTDIALECT_TD

include "mlir/IR/DialectBase.td"

def SMTDialect : Dialect {
let name = "smt";
let summary = "a dialect that models satisfiability modulo theories";
let cppNamespace = "circt::smt";

let useDefaultTypePrinterParser = 1;

let extraClassDeclaration = [{
void registerTypes();
}];
}

#endif // CIRCT_DIALECT_SMT_SMTDIALECT_TD
23 changes: 23 additions & 0 deletions include/circt/Dialect/SMT/SMTOps.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
//===- SMTOps.h - SMT dialect operations ------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef CIRCT_DIALECT_SMT_SMTOPS_H
#define CIRCT_DIALECT_SMT_SMTOPS_H

#include "mlir/IR/OpImplementation.h"
#include "mlir/IR/SymbolTable.h"
#include "mlir/Interfaces/InferTypeOpInterface.h"
#include "mlir/Interfaces/SideEffectInterfaces.h"

#include "circt/Dialect/SMT/SMTDialect.h"
#include "circt/Dialect/SMT/SMTTypes.h"

#define GET_OP_CLASSES
#include "circt/Dialect/SMT/SMT.h.inc"

#endif // CIRCT_DIALECT_SMT_SMTOPS_H
23 changes: 23 additions & 0 deletions include/circt/Dialect/SMT/SMTOps.td
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
//===- SMTOps.td - SMT dialect operations ------------------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef CIRCT_DIALECT_SMT_SMTOPS_TD
#define CIRCT_DIALECT_SMT_SMTOPS_TD

include "circt/Dialect/SMT/SMTDialect.td"
include "circt/Dialect/SMT/SMTTypes.td"
include "mlir/IR/EnumAttr.td"
include "mlir/IR/OpAsmInterface.td"
include "mlir/Interfaces/InferTypeOpInterface.td"
include "mlir/Interfaces/SideEffectInterfaces.td"

class SMTOp<string mnemonic, list<Trait> traits = []> :
Op<SMTDialect, mnemonic, traits>;


#endif // CIRCT_DIALECT_SMT_SMTOPS_TD
18 changes: 18 additions & 0 deletions include/circt/Dialect/SMT/SMTTypes.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
//===- SMTTypes.h - SMT dialect types ---------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef CIRCT_DIALECT_SMT_SMTTYPES_H
#define CIRCT_DIALECT_SMT_SMTTYPES_H

#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/Types.h"

#define GET_TYPEDEF_CLASSES
#include "circt/Dialect/SMT/SMTTypes.h.inc"

#endif // CIRCT_DIALECT_SMT_SMTTYPES_H
Loading

0 comments on commit 8b4e95d

Please sign in to comment.