Skip to content

Conversation

ivg
Copy link
Member

@ivg ivg commented Apr 2, 2021

This PR introduces a powerful but convenient way of describing complex argument passing semantics. Describing even simple facets of argument passing mechanics
it tedious as it requires taking into consideration a lot of factors, such as the availability of registers, their width, the types of the passed data, its alignment requirements and so on. It is extremely hard to write a generic algorithm that will be able to accommodate the large variety of options and be at the same time useful and easy to use. Instead, we decided to pick a parser combinator approach as naturally we want to describe how to parse (or recognize) data that is passed to us by the caller. The proposed DSL is a choice monad (i.e., it has [choice] and [reject] operators that enable backtracking) so that we can try various options in order. For example, the riscv integer calling convention prescribes that double words (words twice as big as the register size) are passed in an aligned (the first register's number should be even) pair of registers, or, if there is only one register, then the first half of the word is passed in the register and the second in the memory, and if no registers is available at all, then it should be passed via stack. This complex rule is easily expressible in our DSL,

Arg.choice [
 Arg.sequence [
   Arg.align_even regs;
   Arg.registers ~limit:2 regs t;
 ];
 Arg.split_with_memory regs t;
 Arg.memory t;
]

and the whole calling convention of RISCV (which is non-trivial) is described in several lines of code,

(* integer calling convention *)
let integer regs t =
  Arg.count regs t >>= function
  | None -> Arg.reject ()
  | Some 1 -> Arg.choice [
      Arg.register regs t;
      Arg.memory t;
    ]
  | Some 2 -> Arg.choice [
      Arg.sequence [
        Arg.align_even regs;
        Arg.registers ~limit:2 regs t;
      ];
      Arg.split_with_memory regs t;
      Arg.memory t;
    ]
  | Some _ -> Arg.reference regs t

(* floating-point calling convention *)
let float iregs fregs t =
  Arg.count fregs t >>= function
  | Some 1 -> Arg.choice [
      Arg.register fregs t;
      Arg.register iregs t;
      Arg.memory t;
    ]
  | _ -> integer iregs t

The eDSL is written with the considerations of all major calling conventions and ABI but right now we use it only to describe RISCV (and the coming in the next PR aarch64). We have plans to gradually use the new DSL to describe more completely and precisely the semantics of argument passing for the existing architectures (volunteers are welcome, the language is thoroughly documented and comes with examples). One of the best things with eDSL approach is that it is naturally extensible, so that if we need some non-trivial argument-passing operator we can easily extend the language. So if you hit any problems in expressing your ABI using it, don't hesitate to add a new operator or create an issue requesting it.

Additional Changes

  • adds Theory.Target.var function for easy lookup of a register by name, it is necessary for the next PR but lands here because of rebasing conflicts :)

  • adds [data_alignment] and [code_alignment] to the target description.

@ivg ivg merged commit 94b8571 into BinaryAnalysisPlatform:master Apr 2, 2021
@ivg ivg deleted the abi-edsl branch December 1, 2021 19:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant