@@ -29,7 +29,7 @@ You may also be interested in the [grammar].

 # Notation

-Rust's grammar is defined over Unicode codepoints, each conventionally denoted
+Rust's grammar is defined over Unicode code points, each conventionally denoted
 `U+XXXX`, for 4 or more hexadecimal digits `X`. _Most_ of Rust's grammar is
 confined to the ASCII range of Unicode, and is described in this document by a
 dialect of Extended Backus-Naur Form (EBNF), specifically a dialect of EBNF
@@ -53,7 +53,7 @@
 - Square brackets are used to group rules.
 - `LITERAL` is a single printable ASCII character, or an escaped hexadecimal
   ASCII code of the form `\xQQ`, in single quotes, denoting the corresponding
-  Unicode codepoint `U+00QQ`.
+  Unicode code point `U+00QQ`.
 - `IDENTIFIER` is a nonempty string of ASCII letters and underscores.
 - The `repeat` forms apply to the adjacent `element`, and are as follows:
   - `?` means zero or one repetition
@@ -66,9 +66,9 @@ This EBNF dialect should hopefully be familiar to many readers.

 ## Unicode productions

-A few productions in Rust's grammar permit Unicode codepoints outside the ASCII
+A few productions in Rust's grammar permit Unicode code points outside the ASCII
 range. We define these productions in terms of character properties specified
-in the Unicode standard, rather than in terms of ASCII-range codepoints. The
+in the Unicode standard, rather than in terms of ASCII-range code points. The
 section [Special Unicode Productions](#special-unicode-productions) lists these
 productions.

@@ -91,10 +91,10 @@ production. See [tokens](#tokens) for more information.

 ## Input format

-Rust input is interpreted as a sequence of Unicode codepoints encoded in UTF-8.
+Rust input is interpreted as a sequence of Unicode code points encoded in UTF-8.
 Most Rust grammar rules are defined in terms of printable ASCII-range
-codepoints, but a small number are defined in terms of Unicode properties or
-explicit codepoint lists. [^inputformat]
+code points, but a small number are defined in terms of Unicode properties or
+explicit code point lists. [^inputformat]

 [^inputformat]: Substitute definitions for the special Unicode productions are
 provided to the grammar verifier, restricted to ASCII range, when verifying the
@@ -147,7 +147,7 @@ comments beginning with exactly one repeated asterisk in the block-open
 sequence (`/**`), are interpreted as a special syntax for `doc`
 [attributes](#attributes). That is, they are equivalent to writing
 `#[doc="..."]` around the body of the comment (this includes the comment
-characters themselves, ie `/// Foo` turns into `#[doc="/// Foo"]`).
+characters themselves, i.e. `/// Foo` turns into `#[doc="/// Foo"]`).

 `//!` comments apply to the parent of the comment, rather than the item that
 follows. `//!` comments are usually used to display information on the crate
@@ -330,14 +330,14 @@ Some additional _escapes_ are available in either character or non-raw string
 literals. An escape starts with a `U+005C` (`\`) and continues with one of the
 following forms:

-* An _8-bit codepoint escape_ escape starts with `U+0078` (`x`) and is
-  followed by exactly two _hex digits_. It denotes the Unicode codepoint
+* An _8-bit code point escape_ starts with `U+0078` (`x`) and is
+  followed by exactly two _hex digits_. It denotes the Unicode code point
   equal to the provided hex value.
-* A _24-bit codepoint escape_ starts with `U+0075` (`u`) and is followed
+* A _24-bit code point escape_ starts with `U+0075` (`u`) and is followed
   by up to six _hex digits_ surrounded by braces `U+007B` (`{`) and `U+007D`
-  (`}`). It denotes the Unicode codepoint equal to the provided hex value.
+  (`}`). It denotes the Unicode code point equal to the provided hex value.
 * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
-  (`r`), or `U+0074` (`t`), denoting the unicode values `U+000A` (LF),
+  (`r`), or `U+0074` (`t`), denoting the Unicode values `U+000A` (LF),
   `U+000D` (CR) or `U+0009` (HT) respectively.
 * The _backslash escape_ is the character `U+005C` (`\`) which must be
   escaped in order to denote *itself*.
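
The escape forms touched by this hunk can be checked directly in Rust. The following is a minimal sketch and not part of the change itself; the function name `escape_samples` is hypothetical. The `\x` values are kept at or below `7F` because current Rust compilers reject larger values in character and string literals.

```rust
/// Pairs of (character written with an escape, expected code point).
/// `escape_samples` is a hypothetical name used only for illustration.
fn escape_samples() -> Vec<(char, u32)> {
    vec![
        ('\x41', 0x41),         // `\x` escape: exactly two hex digits
        ('\u{41}', 0x41),       // `\u` escape: hex digits inside braces
        ('\u{1F600}', 0x1F600), // `\u` escape beyond the ASCII range
        ('\n', 0x0A),           // whitespace escape: LF
        ('\r', 0x0D),           // whitespace escape: CR
        ('\t', 0x09),           // whitespace escape: HT
        ('\\', 0x5C),           // backslash escape denotes the backslash itself
    ]
}
```

Casting each character with `as u32` yields the code point named in the second element of the pair.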
@@ -407,7 +407,7 @@ Some additional _escapes_ are available in either byte or non-raw byte string
 literals. An escape starts with a `U+005C` (`\`) and continues with one of the
 following forms:

-* An _byte escape_ escape starts with `U+0078` (`x`) and is
+* A _byte escape_ starts with `U+0078` (`x`) and is
   followed by exactly two _hex digits_. It denotes the byte
   equal to the provided hex value.
 * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
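
For comparison with the character escapes, here is a small sketch of the same forms in byte and byte string literals. It is illustrative only; `byte_escape_samples` and `byte_string_sample` are hypothetical names.

```rust
/// Pairs of (byte written with an escape, expected numeric value).
/// `byte_escape_samples` is a hypothetical name used only for illustration.
fn byte_escape_samples() -> Vec<(u8, u8)> {
    vec![
        (b'\x41', 0x41), // byte escape: exactly two hex digits
        (b'\xFF', 0xFF), // byte literals accept the full 8-bit range
        (b'\n', 0x0A),   // whitespace escape: LF
        (b'\r', 0x0D),   // whitespace escape: CR
        (b'\t', 0x09),   // whitespace escape: HT
        (b'\\', 0x5C),   // backslash escape
    ]
}

/// A byte string literal using the same escapes, returned as a byte slice.
fn byte_string_sample() -> &'static [u8] {
    b"A\x00\xFF\n"
}
```

Unlike `\x` in character literals, a byte escape denotes a raw byte and not a Unicode code point, which is why `\xFF` is accepted here.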
@@ -697,9 +697,9 @@ in macro rules). In the transcriber, the designator is already known, and so
 only the name of a matched nonterminal comes after the dollar sign.

 In both the matcher and transcriber, the Kleene star-like operator indicates
-repetition. The Kleene star operator consists of `$` and parens, optionally
+repetition. The Kleene star operator consists of `$` and parentheses, optionally
 followed by a separator token, followed by `*` or `+`. `*` means zero or more
-repetitions, `+` means at least one repetition. The parens are not matched or
+repetitions, `+` means at least one repetition. The parentheses are not matched or
 transcribed. On the matcher side, a name is bound to _all_ of the names it
 matches, in a structure that mimics the structure of the repetition encountered
 on a successful match. The job of the transcriber is to sort that structure
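
The repetition syntax described in this hunk can be sketched with two small macros. This is an illustration under assumed names (`sum_all` and `vec_of` are hypothetical), not text from the reference.

```rust
// `$( ... ),*` in the matcher accepts zero or more comma-separated expressions.
// The parentheses after `$` only delimit the repetition; they are not matched.
macro_rules! sum_all {
    ( $( $x:expr ),* ) => {
        // In the transcriber, `$( + $x )*` is emitted once per matched `$x`.
        0i32 $( + $x )*
    };
}

// `$( ... ),+` requires at least one repetition.
macro_rules! vec_of {
    ( $( $x:expr ),+ ) => {{
        let mut v = Vec::new();
        $( v.push($x); )+
        v
    }};
}
```

With these definitions, `sum_all!()` is accepted because `*` allows zero repetitions, while `vec_of!()` with no arguments fails to match because `+` requires at least one.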
@@ -1209,9 +1209,9 @@ the guarantee that these issues are never caused by safe code.

 [noalias]: http://llvm.org/docs/LangRef.html#noalias

-##### Behaviour not considered unsafe
+##### Behavior not considered unsafe

-This is a list of behaviour not considered *unsafe* in Rust terms, but that may
+This is a list of behavior not considered *unsafe* in Rust terms, but that may
 be undesired.

 * Deadlocks
@@ -1304,7 +1304,7 @@ specific type, but may implement several different traits, or be compatible with
 several different type constraints.

 For example, the following defines the type `Point` as a synonym for the type
-`(u8, u8)`, the type of pairs of unsigned 8 bit integers.:
+`(u8, u8)`, the type of pairs of unsigned 8 bit integers:

 ```
 type Point = (u8, u8);
@@ -1958,7 +1958,7 @@ type int8_t = i8;

 ### Crate-only attributes

-- `crate_name` - specify the this crate's crate name.
+- `crate_name` - specify the crate's crate name.
 - `crate_type` - see [linkage](#linkage).
 - `feature` - see [compiler features](#compiler-features).
 - `no_builtins` - disable optimizing certain code patterns to invocations of
@@ -3432,7 +3432,7 @@ is not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to
 UTF-32 string.

 A value of type `str` is a Unicode string, represented as an array of 8-bit
-unsigned bytes holding a sequence of UTF-8 codepoints. Since `str` is of
+unsigned bytes holding a sequence of UTF-8 code points. Since `str` is of
 unknown size, it is not a _first-class_ type, but can only be instantiated
 through a pointer type, such as `&str` or `String`.

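
To make the distinction between bytes and code points in a `str` concrete, here is a brief sketch. The helper names `byte_and_code_point_counts` and `utf8_bytes` are hypothetical and exist only for this illustration.

```rust
/// Returns (number of UTF-8 bytes, number of code points) for `s`.
/// Hypothetical helper for illustration only.
fn byte_and_code_point_counts(s: &str) -> (usize, usize) {
    // `len` counts bytes; `chars` iterates over Unicode scalar values.
    (s.len(), s.chars().count())
}

/// Returns a copy of the UTF-8 bytes that back `s`.
fn utf8_bytes(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}
```

For example, `"h\u{E9}llo"` holds five code points in six bytes, because `U+00E9` is encoded as two bytes in UTF-8.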