adds a Primus Lisp fronted #1268

ivg · 2021-01-28T21:52:11Z

Introduction

This PR enables analysis of programs written in Primus Lisp, essentially making Primus Lisp a concrete syntax for Core Theory. For example, the following lisp program could be easily reified into BIL (thus Primus Lisp could be seen as a concrete syntax for BIL, or BIR, or any Core Theory implementation)

(defun example1 (x)
  (set R0 1)
  (set R1 2)
  (set R3 (+ R1 R2 (* R1 R2 3)))
  (set F1:1 (< R0 R1))
  (set F2:1 (< R0 R1 R2 R3 R4))
  (set F3:1 (< R0 (+ R0 R3)))
  (set R4 (memory-read (+ R1 (* 8 (/ (word-width) 4)))))
  (memory-write R4 (+ R3 R1))
  (malloc R0)
  (if (> R0 (* R0 R0))
      (exec-addr 0xDEADBEEF)
    (set R0 (* R0 R2 R3))
    (malloc (* R3 R0))))

It could be translated to the following BIL program,

example1:
{
  R0 := 1
  R1 := 2
  R3 := R1 + R2 + R1 * R2 * 3
  F1 := R0 < R1
  F2 := R0 < R1 & R1 < R2 & R2 < R3 & R3 < R4
  F3 := R0 < R0 + R3
  R4 := pad:32[mem[R1 + 0x40]]
  mem := mem with [R4, be]:u32 <- R3 + R1
  bap:call(malloc)
  #1 := R0 * R0 < R0
  if (#1) {
    jmp 0xDEADBEEF
  }
  else {
    R0 := R0 * R2 * R3
    bap:call(malloc)
  }
}

See this gist for the full example.

What else it could be used for?

Right now it enables us to write BIL and BIR programs (basically define Insn.t values) using a concrete Primus Lisp syntax and relying on a big library of macros and primitives. In the future, we plan to use Primus Lisp to write lifters (so that a new instruction could be added without recompilation or having to learn OCaml).

Also, having the static semantics for stubs written in Primus Lisp will make these stubs available for static analysis.

We also plan to eventually switch to Primus Lisp as our main representation as it is fully extensible (and can easily support the full extent of the Core Theory).

Why Primus Lisp, not BAP Lisp?

We had plans to introduce BAP Lisp instead of Primus Lisp as our main representation. However, we decided to abolish these plans and instead take Primus Lisp as it exists. Primus Lisp features interesting properties that (together with the planned staged compilation) will enable us to use this language for writing not only efficient lifters but static and dynamic analyses. A non-deterministic staged language fits nicely both as a language for both machine semantics and security policy specifications.

Plans

This PR brings a minimum viable product that enables reification of the subset of the Core Theory that is materializable into BIL programs. In the next step, we will introduce staged compilation so that operations on statically known values (values obtained from the knowledge base) will be performed by the compiler itself instead of being reified into Core Theory terms. Later we will update the Primus Interpreter to enable the same staging (that will be the solution for issue #1124) so that meta-operations on the knowledge base, dictionaries, taint-tracker state and so on won't be reflected on the Primus Interpreter machine observations and became totally transparent. We will also implement the complete set of Core Theory operations, and implement reification to Primus Lisp, to enable translation in the other direction (BIL -> Lisp) which will finally make it possible to get rid of BIL and use Primus Lisp as the main representation.

How it is implemented?

The idea is simple and applies to any other language, so this PR could also be seen as a blueprint for adding a frontend for a programming language in BAP. We store the Primus Lisp program in the knowledge base, packed as a unit, using the existing source and language properties. The semantics of operations is provided the same way it is provided for machine instructions, or BIL itself, via a knowledge base rule. The main interesting point is that Primus Lisp has an arbitrary and extensible number of primitives, which are also represented as promises. This PR basically introduces zero abstractions (other than the abstract type[e for Primus primitive which is just a pair of a name and a list of arguments) and fully relies on the existing Core Theory abstractions.

primitives are still not published

The new implementation is more tightly integrated with the knowledge base and will totally rely on promises to provide semantics of lisp definitions as well as lisp primitives. There will not be anymore the special type for the semantics of a lisp program, instead a KB value of Theory.Semantics.cls will be used. The class was extended with the value slot, so no effects can have values.

right now the equality is purely textual equality.

and implements a few primitives, far from complete set though.

By removing a request to compute a name when the unit is not a lisp unit. It helps since the name computation involves the stub resolver, which in turn requires the semantics of the nearby instructions, and so on. In general, the rel-symbolizer shall be rewritten as it is so easy to break it.

ivg added 19 commits January 15, 2021 14:55

implements some basic operations

2d5cec3

implements primus semantics primitives

59821e7

publishes the semantics module

79b9762

primitives are still not published

passes the target to the program and uses it in reification

f2a4e96

adds the top value and the experimental match module

b81b37d

drops the no longer necessary target from the lisp program

467b1f5

implements equality on primus lisp programs and sources

f4f65fc

right now the equality is purely textual equality.

implements the semantics promise that provides reification of lisp

b0e5690

publishes the minimal interface

b95b09c

publishes a KB.Value.resort function

36f350d

prevents bir reifier from generating empty graphs.

0bf3c89

more fixes to the lisp lifter

2f79814

integrates semantic primitives with the main lisp module

e77dcdd

and implements a few primitives, far from complete set though.

adds more primitives

1a5c097

we have an MVP Lisp compiler

c23d285

rectifies the interface a little bit, introduces a few helpers

e620d94

adds documentation and prettifies the interface a bit

c5ea235

ivg merged commit 155a3d1 into BinaryAnalysisPlatform:master Jan 29, 2021

ivg deleted the primus-lisp-theory branch December 1, 2021 19:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

adds a Primus Lisp fronted #1268

adds a Primus Lisp fronted #1268

Uh oh!

ivg commented Jan 28, 2021

Uh oh!

Uh oh!

adds a Primus Lisp fronted #1268

adds a Primus Lisp fronted #1268

Uh oh!

Conversation

ivg commented Jan 28, 2021

Introduction

What else it could be used for?

Why Primus Lisp, not BAP Lisp?

Plans

How it is implemented?

Uh oh!

Uh oh!