allows any character in the variable name #1086
Merged
Currently, BIL and Core Theory variable literals can contain only a
specific set of valid characters. This approach doesn't play well with
various name manglers and non-C languages, and it contradicts our own
tradition, as we generally allow any character in variables; see our
Knowledge variables, for example (which are used under the hood of CT
and BIL variables).
One real-world example where bap fails is new C++ code with lambdas,
which uses `#` for anonymous variables.

In this proposal we allow any string (including the empty one) to be
used as a variable name. First of all, we escape any non-printable or
whitespace characters. We also escape '.' as we use it for variable
versioning. Next, if a prospective variable name starts with a digit,
is empty, or starts with `$` or `#` (which we use to encode
de Bruijn-style variables), we prefix it with `_`.

Note that escaping whitespace and non-printable characters is not
strictly necessary, but it helps with the textual representation of
the variable name if we ever encounter such a character.
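
The rules above can be illustrated with a minimal sketch. This is not the code from this pull request: the function name `mangle_variable_name` is hypothetical, and the `\xNN` escape syntax is an assumption, since the description does not say how escaped characters are spelled.

```python
def mangle_variable_name(name: str) -> str:
    """Hypothetical sketch of the naming rules described above."""
    escaped = []
    for ch in name:
        # Escape whitespace and non-printable characters, plus '.',
        # which is reserved for variable versioning.
        # The \xNN spelling is an assumption made for this sketch.
        if ch == "." or ch.isspace() or not ch.isprintable():
            escaped.append("\\x{:02x}".format(ord(ch)))
        else:
            escaped.append(ch)
    result = "".join(escaped)

    # Prefix names that are empty or that start with an ASCII digit,
    # '$', or '#', so they cannot clash with reserved encodings.
    if result == "" or result[0] in "0123456789$#":
        result = "_" + result
    return result
```

Under these assumptions, `mangle_variable_name("#1")` gives `_#1`, the empty string becomes `_`, and a name such as `x#y` is left unchanged because only a leading `#` triggers the prefix.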