More unescaping cleanups #118897

Merged: 5 commits, Dec 16, 2023
Rename some unescaping functions.
`unescape_raw_str_or_raw_byte_str` only does checking, no unescaping,
and it now also handles C string literals. It becomes `check_raw_common`.

`unescape_str_common` is used for all the non-raw strings. It becomes
`unescape_non_raw_common`.
nnethercote committed Dec 13, 2023
commit b900eb73173f416dba5619ebfc10e6e0438a4753
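
To make the "checking, not unescaping" distinction concrete, here is a minimal standalone sketch. It is not the `rustc_lexer` code: `RawError`, `check_raw`, and `collect_raw` are hypothetical names, and the ASCII rule is an assumption based on the `chars_should_be_ascii` flag visible in the diff. The point is that a raw literal's characters are passed through unchanged, and the callback only ever receives the original character or an error.

```rust
use std::ops::Range;

// Hypothetical error type for this sketch (not rustc's `EscapeError`).
#[derive(Debug, PartialEq)]
enum RawError {
    BareCarriageReturn,
    NonAscii,
}

/// Reports every char of a raw literal body as-is, or an error.
/// Nothing is ever translated, which is why "check" is the better name.
fn check_raw(
    src: &str,
    ascii_only: bool, // assumption: true for byte-string-like modes
    callback: &mut impl FnMut(Range<usize>, Result<char, RawError>),
) {
    for (start, c) in src.char_indices() {
        let end = start + c.len_utf8();
        let res = match c {
            '\r' => Err(RawError::BareCarriageReturn),
            _ if ascii_only && !c.is_ascii() => Err(RawError::NonAscii),
            _ => Ok(c),
        };
        callback(start..end, res);
    }
}

/// Convenience wrapper that collects the callback results into a vector.
fn collect_raw(src: &str, ascii_only: bool) -> Vec<(Range<usize>, Result<char, RawError>)> {
    let mut out = Vec::new();
    check_raw(src, ascii_only, &mut |range, res| out.push((range, res)));
    out
}
```

Calling `collect_raw("a\\n", false)` in this sketch yields three `Ok` entries (`a`, `\`, `n`), since a backslash has no special meaning in a raw literal.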
16 changes: 7 additions & 9 deletions compiler/rustc_lexer/src/unescape.rs
@@ -92,8 +92,8 @@ where
let res = unescape_char_or_byte(&mut chars, mode);
callback(0..(src.len() - chars.as_str().len()), res);
}
- Str | ByteStr => unescape_str_common(src, mode, callback),
- RawStr | RawByteStr => unescape_raw_str_or_raw_byte_str(src, mode, callback),
+ Str | ByteStr => unescape_non_raw_common(src, mode, callback),
+ RawStr | RawByteStr => check_raw_common(src, mode, callback),
CStr | RawCStr => unreachable!(),
}
}
@@ -122,12 +122,10 @@ where
{
match mode {
CStr => {
- unescape_str_common(src, mode, callback);
+ unescape_non_raw_common(src, mode, callback);
}
RawCStr => {
- unescape_raw_str_or_raw_byte_str(src, mode, &mut |r, result| {
- callback(r, result.map(CStrUnit::Char))
- });
+ check_raw_common(src, mode, &mut |r, result| callback(r, result.map(CStrUnit::Char)));
}
Char | Byte | Str | RawStr | ByteStr | RawByteStr => unreachable!(),
}
@@ -325,7 +323,7 @@ fn unescape_char_or_byte(chars: &mut Chars<'_>, mode: Mode) -> Result<char, Esca

/// Takes a contents of a string literal (without quotes) and produces a
/// sequence of escaped characters or errors.
- fn unescape_str_common<F, T: From<u8> + From<char>>(src: &str, mode: Mode, callback: &mut F)
+ fn unescape_non_raw_common<F, T: From<u8> + From<char>>(src: &str, mode: Mode, callback: &mut F)
where
F: FnMut(Range<usize>, Result<T, EscapeError>),
{
@@ -392,15 +390,15 @@ where
/// sequence of characters or errors.
/// NOTE: Raw strings do not perform any explicit character escaping, here we
/// only produce errors on bare CR.
- fn unescape_raw_str_or_raw_byte_str<F>(src: &str, mode: Mode, callback: &mut F)
+ fn check_raw_common<F>(src: &str, mode: Mode, callback: &mut F)
where
F: FnMut(Range<usize>, Result<char, EscapeError>),
{
let mut chars = src.chars();
let chars_should_be_ascii = mode.chars_should_be_ascii(); // get this outside the loop

// The `start` and `end` computation here matches the one in
- // `unescape_str_common` for consistency, even though this function
+ // `unescape_non_raw_common` for consistency, even though this function
// doesn't have to worry about skipping any chars.
while let Some(c) = chars.next() {
let start = src.len() - chars.as_str().len() - c.len_utf8();
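
The `start` computation in the last hunk can be tried in isolation. The sketch below is a hedged illustration, not the compiler's code: `char_ranges` is a hypothetical helper, and the `end` expression is an assumption, since the diff is cut off before that line. It relies on `Chars::as_str()` returning the not-yet-consumed tail of the string, so subtracting the tail length from `src.len()` gives the byte offset just past the current char.

```rust
use std::ops::Range;

/// Byte range of each char in `src`, derived from the remaining tail of the
/// iterator rather than from a running counter.
fn char_ranges(src: &str) -> Vec<Range<usize>> {
    let mut chars = src.chars();
    let mut ranges = Vec::new();
    while let Some(c) = chars.next() {
        // After `next()`, `chars.as_str()` is everything after `c`.
        let start = src.len() - chars.as_str().len() - c.len_utf8();
        // Assumption: the end offset is computed the same way, without
        // subtracting the current char's width.
        let end = src.len() - chars.as_str().len();
        ranges.push(start..end);
    }
    ranges
}
```

When no chars are skipped, this gives the same offsets as `char_indices()`. The tail-based form also stays correct in a loop that consumes extra chars between computing `start` and `end`, which fits the comment's remark that the raw-string function "doesn't have to worry about skipping any chars".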