@ivg ivg commented Mar 26, 2020

BIL and Core Theory variable literals can contain only a specific
set of valid characters. This approach doesn't play well with various
manglers and non-C languages, and it contradicts our own tradition,
as we generally allow any character in variables; see our Knowledge
variables, for example (which are used under the hood of CT and BIL
variables).

One real-world example where bap fails is new C++ code with lambdas,
which uses `#` for anonymous variables.

In this proposal we allow any string (including the empty one) to be
used as a variable name. First, we escape any non-printable or
whitespace characters. We also escape `.`, as we use it for variable
versioning. Next, if a prospective variable name starts with a digit,
is empty, or starts with `$` or `#` (which we use to encode de Bruijn
style variables), we prefix it with `_`.
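
The rules above can be illustrated with a minimal, language-neutral sketch in Python. This is not the code from this PR: the function name `normalize_var_name` is hypothetical, and the `\xNN` escape notation (including escaping the backslash itself so that escapes stay unambiguous) is an assumption, since the description does not specify the escape syntax.

```python
def normalize_var_name(name: str) -> str:
    """Hypothetical sketch of the normalization rules described above."""
    out = []
    for ch in name:
        # Escape '.', whitespace, and non-printable characters.
        # Assumption: hex escapes of the form \xNN; the backslash is
        # escaped too, so an escaped name cannot be confused with a raw one.
        if ch in ".\\" or ch.isspace() or not ch.isprintable():
            out.append(f"\\x{ord(ch):02x}")
        else:
            out.append(ch)
    escaped = "".join(out)
    # Prefix names that are empty or start with a digit, '$', or '#'.
    if escaped == "" or escaped[0] in "0123456789$#":
        return "_" + escaped
    return escaped


print(normalize_var_name("#1"))   # _#1
print(normalize_var_name(""))     # _
print(normalize_var_name("a.b"))  # a\x2eb
```

Under these assumptions, a lambda-generated name such as `#1` becomes `_#1`, while an ordinary name such as `RAX` is left unchanged.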

Note that escaping whitespace and non-printable characters is not
strictly necessary, but it helps with the textual representation of
the variable name, should we hit one.

@ivg ivg requested a review from gitoleg March 26, 2020 16:56
@ivg ivg merged commit cfeacbf into BinaryAnalysisPlatform:master Mar 26, 2020
@ivg ivg deleted the harden-variable-names branch June 10, 2020 12:32