-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Fix potential use-after-free in HasBannedPrefixPlusDigits #4191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Clang 10 reports that the copy of the string returned by `siter.Symbol()` will be freed at the end of the line, so its c_str pointer will become invalid: ``` src/fstext/pre-determinize-inl.h:239:23: error: object backing the pointer will be destroyed at the end of the full-expression [-Werror,-Wdangling-gsl] const char *sym = siter.Symbol().c_str(); ^~~~~~~~~~~~~~ ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that's a good vigilance on the compiler's part (although we know the string is not going anywhere, still worth fixing).
This requires a couple more changes so we won't get a warning from the next version of the compiler. :)
const char *sym = siter.Symbol().c_str(); | ||
if (!strncmp(prefix_ptr, sym, prefix_len)) { // has prefix. | ||
const string &sym = siter.Symbol(); | ||
if (!strncmp(prefix_ptr, sym.c_str(), prefix_len)) { // has prefix. | ||
if (isdigit(sym[prefix_len])) { // we don't allow prefix followed by a digit, as a symbol. | ||
// Has at least one digit. | ||
size_t pos; | ||
for (pos = prefix_len;sym[pos] != '\0'; pos++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also change this to
for (pos = prefix_len;sym[pos] != '\0'; pos++) | |
for (pos = prefix_len; pos < sym.length(); ++pos) |
because accessing the string beyond the last character using the operator[]()
is technically an UB. length()
is guaranteed O(1) by the C++11 and beyond (and the compiler knows that the string is const in any case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
siter.Symbol()
returns a copy of the string, and this temporary string may be freed at the end of the expression.
C++11 specifies that str[str.length()]
returns '\0': see operator[]. Should I still change this as you suggest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I did not know that C++11 defined this! No, no changes required then, LGTM.
const char *sym = siter.Symbol().c_str(); | ||
if (!strncmp(prefix_ptr, sym, prefix_len)) { // has prefix. | ||
const string &sym = siter.Symbol(); | ||
if (!strncmp(prefix_ptr, sym.c_str(), prefix_len)) { // has prefix. | ||
if (isdigit(sym[prefix_len])) { // we don't allow prefix followed by a digit, as a symbol. | ||
// Has at least one digit. | ||
size_t pos; | ||
for (pos = prefix_len;sym[pos] != '\0'; pos++) | ||
if (!isdigit(sym[pos])) break; | ||
if (sym[pos] == '\0') { // All remaining characters were digits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here:
if (sym[pos] == '\0') { // All remaining characters were digits. | |
if (pos == sym.length()) { // All remaining characters were digits. |
Clang 10 reports that the copy of the string returned by
siter.Symbol()
will be freed at the end of the line, so its c_str pointer will become invalid: