implements ABI description eDSL #1290

ivg · 2021-04-02T18:31:42Z

This PR introduces a powerful but convenient way of describing complex argument passing semantics. Describing even simple facets of argument passing mechanics
it tedious as it requires taking into consideration a lot of factors, such as the availability of registers, their width, the types of the passed data, its alignment requirements and so on. It is extremely hard to write a generic algorithm that will be able to accommodate the large variety of options and be at the same time useful and easy to use. Instead, we decided to pick a parser combinator approach as naturally we want to describe how to parse (or recognize) data that is passed to us by the caller. The proposed DSL is a choice monad (i.e., it has [choice] and [reject] operators that enable backtracking) so that we can try various options in order. For example, the riscv integer calling convention prescribes that double words (words twice as big as the register size) are passed in an aligned (the first register's number should be even) pair of registers, or, if there is only one register, then the first half of the word is passed in the register and the second in the memory, and if no registers is available at all, then it should be passed via stack. This complex rule is easily expressible in our DSL,

Arg.choice [
 Arg.sequence [
   Arg.align_even regs;
   Arg.registers ~limit:2 regs t;
 ];
 Arg.split_with_memory regs t;
 Arg.memory t;
]

and the whole calling convention of RISCV (which is non-trivial) is described in several lines of code,

(* integer calling convention *)
let integer regs t =
  Arg.count regs t >>= function
  | None -> Arg.reject ()
  | Some 1 -> Arg.choice [
      Arg.register regs t;
      Arg.memory t;
    ]
  | Some 2 -> Arg.choice [
      Arg.sequence [
        Arg.align_even regs;
        Arg.registers ~limit:2 regs t;
      ];
      Arg.split_with_memory regs t;
      Arg.memory t;
    ]
  | Some _ -> Arg.reference regs t

(* floating-point calling convention *)
let float iregs fregs t =
  Arg.count fregs t >>= function
  | Some 1 -> Arg.choice [
      Arg.register fregs t;
      Arg.register iregs t;
      Arg.memory t;
    ]
  | _ -> integer iregs t

The eDSL is written with the considerations of all major calling conventions and ABI but right now we use it only to describe RISCV (and the coming in the next PR aarch64). We have plans to gradually use the new DSL to describe more completely and precisely the semantics of argument passing for the existing architectures (volunteers are welcome, the language is thoroughly documented and comes with examples). One of the best things with eDSL approach is that it is naturally extensible, so that if we need some non-trivial argument-passing operator we can easily extend the language. So if you hit any problems in expressing your ABI using it, don't hesitate to add a new operator or create an issue requesting it.

Additional Changes

adds Theory.Target.var function for easy lookup of a register by name, it is necessary for the next PR but lands here because of rebasing conflicts :)
adds [data_alignment] and [code_alignment] to the target description.

The function looks up a variable by name.

ivg added 3 commits April 2, 2021 14:07

adds a Theory.Target.var function

778785d

The function looks up a variable by name.

implements a new ABI description DSL

be8a5db

uses the new ABI to implement riscv abi

f9b028f

ivg merged commit 94b8571 into BinaryAnalysisPlatform:master Apr 2, 2021

ivg deleted the abi-edsl branch December 1, 2021 19:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

implements ABI description eDSL #1290

implements ABI description eDSL #1290

Uh oh!

ivg commented Apr 2, 2021

Uh oh!

Uh oh!

implements ABI description eDSL #1290

implements ABI description eDSL #1290

Uh oh!

Conversation

ivg commented Apr 2, 2021

Additional Changes

Uh oh!

Uh oh!