@@ -24,14 +24,16 @@ Literals are tokens used in [literal expressions].
2424
2525#### Characters and strings
2626
27- | | Example | `#` sets\* | Characters | Escapes |
28- |----------------------------------------------|-----------------|------------|-------------|---------------------|
29- | [Character](#character-literals) | `'H'` | 0 | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
30- | [String](#string-literals) | `"hello"` | 0 | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
31- | [Raw string](#raw-string-literals) | `r#"hello"#` | <256 | All Unicode | `N/A` |
32- | [Byte](#byte-literals) | `b'H'` | 0 | All ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
33- | [Byte string](#byte-string-literals) | `b"hello"` | 0 | All ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
34- | [Raw byte string](#raw-byte-string-literals) | `br#"hello"#` | <256 | All ASCII | `N/A` |
27+ | | Example | `#` sets\* | Characters | Escapes |
28+ |----------------------------------------------|-----------------|------------|-----------------|---------------------|
29+ | [Character](#character-literals) | `'H'` | 0 | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
30+ | [String](#string-literals) | `"hello"` | 0 | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
31+ | [Raw string](#raw-string-literals) | `r#"hello"#` | <256 | All Unicode | `N/A` |
32+ | [Byte](#byte-literals) | `b'H'` | 0 | All ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
33+ | [Byte string](#byte-string-literals) | `b"hello"` | 0 | All ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
34+ | [Raw byte string](#raw-byte-string-literals) | `br#"hello"#` | <256 | All ASCII | `N/A` |
35+ | [C string](#c-string-literals) | `c"hello"` | 0 | non-`NUL` ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
36+ | [Raw C string](#raw-c-string-literals) | `cr#"hello"#` | <256 | non-`NUL` ASCII | `N/A` |
3537
3638\* The number of `#`s on each side of the same literal must be equivalent.
3739
@@ -328,6 +330,76 @@ b"\x52"; b"R"; br"R"; // R
328330b"\\x52"; br"\x52";                // \x52
329331```
330332
333+ ### C string and raw C string literals
334+
335+ #### C string literals
336+
337+ > **<sup>Lexer</sup>**\
338+ > C_STRING_LITERAL :\
339+ > &nbsp;&nbsp; `c"` ( ASCII_FOR_C_STRING | BYTE_ESCAPE | STRING_CONTINUE )<sup>\*</sup> `"` SUFFIX<sup>?</sup>
340+ >
341+ > ASCII_FOR_C_STRING :\
342+ > &nbsp;&nbsp; _any non-NUL ASCII (i.e. 0x01 to 0x7F), except_ `"`, `\` _and IsolatedCR_
343+
344+ A non-raw _C string literal_ is a sequence of ASCII characters and _escapes_,
345+ preceded by the characters `U+0063` (`c`) and `U+0022` (double-quote), and
346+ followed by the character `U+0022`. If the character `U+0022` is present within
347+ the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character.
348+ Alternatively, a C string literal can be a _raw C string literal_, defined
349+ below. The type of a C string literal is `&core::ffi::CStr`.
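As a rough illustration of the type mentioned above, the sketch below builds a `&CStr` by hand from a NUL-terminated byte string. The helper name `hello_c_str` is hypothetical, and the sketch assumes that `c"hello"` evaluates to the same value as this manual construction; it uses `std::ffi::CStr`, which is the same type as `core::ffi::CStr`.

```rust
use std::ffi::CStr;

/// Hypothetical helper: constructs by hand the value that the literal
/// `c"hello"` is assumed to denote (the bytes `hello` plus a trailing NUL).
fn hello_c_str() -> &'static CStr {
    // `from_bytes_with_nul` requires exactly one NUL, at the very end.
    CStr::from_bytes_with_nul(b"hello\0").expect("one trailing NUL, no interior NUL")
}
```

Calling `hello_c_str().to_bytes()` yields the five content bytes, while `to_bytes_with_nul()` also includes the terminator.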
350+
351+ Some additional _escapes_ are available in non-raw C string
352+ literals. An escape starts with a `U+005C` (`\`) and continues with one of the
353+ following forms:
354+
355+ * A _byte escape_ starts with `U+0078` (`x`) and is followed by exactly
356+ two _hex digits_. It denotes the byte equal to the provided hex value. The
357+ byte escape sequence `\x00` is forbidden, as C strings may not contain `NUL`.
358+ * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
359+ (`r`), or `U+0074` (`t`), denoting the byte values `0x0A` (ASCII LF),
360+ `0x0D` (ASCII CR) or `0x09` (ASCII HT) respectively.
361+ * The _backslash escape_ is the character `U+005C` (`\`) which must be
362+ escaped in order to denote its ASCII encoding `0x5C`.
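The escape rules listed above can be sketched as a small decoder. This is not how the compiler implements them; `unescape_c_string` is a hypothetical helper that takes the text between `c"` and the closing `"` and returns the bytes it would denote. It also accepts the quote escape `\"`, and it deliberately ignores string continuation (a backslash followed by a newline).

```rust
/// Hypothetical helper: decodes the body of a non-raw C string literal
/// (the text between `c"` and the closing `"`) into the bytes it denotes.
/// String continuation is not handled in this sketch.
fn unescape_c_string(body: &str) -> Result<Vec<u8>, String> {
    let mut out = Vec::new();
    let mut chars = body.chars();
    while let Some(ch) = chars.next() {
        if ch != '\\' {
            // Unescaped characters must be non-NUL ASCII other than `"`.
            if !ch.is_ascii() || ch == '\0' || ch == '"' {
                return Err(format!("invalid unescaped character {:?}", ch));
            }
            out.push(ch as u8);
            continue;
        }
        match chars.next() {
            Some('x') => {
                // Byte escape: exactly two hex digits, and not `\x00`.
                let hi = chars.next().and_then(|c| c.to_digit(16));
                let lo = chars.next().and_then(|c| c.to_digit(16));
                match (hi, lo) {
                    (Some(0), Some(0)) => return Err("\\x00 is forbidden".to_string()),
                    (Some(h), Some(l)) => out.push((h * 16 + l) as u8),
                    _ => return Err("expected two hex digits after \\x".to_string()),
                }
            }
            // Whitespace escapes.
            Some('n') => out.push(0x0A),
            Some('r') => out.push(0x0D),
            Some('t') => out.push(0x09),
            // Backslash and quote escapes.
            Some('\\') => out.push(0x5C),
            Some('"') => out.push(0x22),
            other => return Err(format!("unknown escape {:?}", other)),
        }
    }
    Ok(out)
}
```

For instance, the four-character body `\x52` decodes to the single byte `R`, whereas `\\x52` decodes to the four bytes `\x52`.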
363+
364+ #### Raw C string literals
365+
366+ > **<sup>Lexer</sup>**\
367+ > RAW_C_STRING_LITERAL :\
368+ > &nbsp;&nbsp; `cr` RAW_C_STRING_CONTENT SUFFIX<sup>?</sup>
369+ >
370+ > RAW_C_STRING_CONTENT :\
371+ > &nbsp;&nbsp; &nbsp;&nbsp; `"` ASCII_EXCEPT_NUL<sup>* (non-greedy)</sup> `"`\
372+ > &nbsp;&nbsp; | `#` RAW_C_STRING_CONTENT `#`
373+ >
374+ > ASCII_EXCEPT_NUL :\
375+ > &nbsp;&nbsp; _any non-NUL ASCII (i.e. 0x01 to 0x7F)_
376+
377+ Raw C string literals do not process any escapes. They start with the
378+ character `U+0063` (`c`), followed by `U+0072` (`r`), followed by fewer than 256
379+ of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The
380+ _raw string body_ can contain any sequence of non-`NUL` ASCII characters and is terminated
381+ only by another `U+0022` (double-quote) character, followed by the same number of
382+ `U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote)
383+ character. A raw C string literal cannot contain any non-ASCII byte.
384+
385+ All characters contained in the raw string body represent their ASCII encoding;
386+ the characters `U+0022` (double-quote) (except when followed by at least as
387+ many `U+0023` (`#`) characters as were used to start the raw string literal) and
388+ `U+005C` (`\`) do not have any special meaning.
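The delimiter matching described above can be sketched as follows. `raw_c_string_body` is a hypothetical helper, not part of any library: given the full text of a raw C string literal without a suffix, it returns the raw string body, or `None` if the text does not follow the rules in this section.

```rust
/// Hypothetical helper: extracts the body of a raw C string literal such as
/// `cr#"..."#`. Suffixes are not handled in this sketch.
fn raw_c_string_body(literal: &str) -> Option<&str> {
    let rest = literal.strip_prefix("cr")?;
    // Count the `#` characters before the opening quote (fewer than 256).
    let hashes = rest.bytes().take_while(|&b| b == b'#').count();
    if hashes >= 256 {
        return None;
    }
    let rest = rest[hashes..].strip_prefix('"')?;
    // The body ends at the first `"` followed by the same number of `#`.
    let terminator = format!("\"{}", "#".repeat(hashes));
    let end = rest.find(terminator.as_str())?;
    // Nothing may follow the terminator in this sketch.
    if end + terminator.len() != rest.len() {
        return None;
    }
    let body = &rest[..end];
    // The body may only contain non-NUL ASCII.
    if body.bytes().all(|b| b != 0 && b.is_ascii()) {
        Some(body)
    } else {
        None
    }
}
```

Because no escapes are processed, the body is returned exactly as written, so the text `cr"\x52"` yields the four characters `\x52`.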
389+
390+ Examples for C string literals:
391+
392+ ```rust
393+ c"foo"; cr"foo";                     // foo
394+ c"\"foo\""; cr#""foo""#;             // "foo"
395+
396+ c"foo #\"# bar";
397+ cr##"foo #"# bar"##;                 // foo #"# bar
398+
399+ c"\x52"; c"R"; cr"R";                // R
400+ c"\\x52"; cr"\x52";                  // \x52
401+ ```
402+
331403### Number literals
332404
333405A _number literal_ is either an _integer literal_ or a _floating-point