Consider linting against 00B7 aka interpunct aka middle dot #120797
Open
opened on Feb 8, 2024
Code
#![allow(dead_code)]
#![deny(uncommon_codepoints)]
const COL·LECCIÓ: () = (); // This is Catalan
// The below is not allowed by the lexer today...
// const ·START: () = ();
// ... but this is allowed today ...
const MID·DLE: () = ();
// ... and this is also allowed today
const END·: () = ();
fn main() {
    println!("{}", r#"
COL·LECCIÓ
·START
MID·DLE
END·
"#)
}
Current output
COL·LECCIÓ
·START
MID·DLE
END·
Note that the visual appearance of the first line is font-dependent, in terms of how the columns of a fixed-width font line up: the playpen collapses the L·L into a single glyph that occupies one character width.
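Since the rendering can hide the dot, it may help to look at the code points rather than the glyphs. The following is a minimal sketch, not part of the original report; `code_points` is a hypothetical helper, and the Ó is assumed to be the precomposed U+00D3.

```rust
/// Returns the Unicode scalar values of `s`, one entry per `char`.
/// (Hypothetical helper, introduced only for this illustration.)
fn code_points(s: &str) -> Vec<u32> {
    s.chars().map(|c| c as u32).collect()
}

fn main() {
    // Escapes spell out the two non-ASCII characters:
    // U+00B7 MIDDLE DOT and U+00D3 LATIN CAPITAL LETTER O WITH ACUTE
    // (assumed to be precomposed here).
    let ident = "COL\u{00B7}LECCI\u{00D3}";
    for cp in code_points(ident) {
        println!("U+{:04X}", cp);
    }
    println!("{} code points", code_points(ident).len());
}
```

However the font draws L·L, the middle dot remains its own code point (U+00B7) between the two L's, so the identifier has ten code points under the assumption above.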
Desired output
I'm not certain. I just want to make sure we follow up on PR #120695.
The options I see are either:
- Leave things as they are (00B7 is hard-rejected as an initial character, and silently accepted in all other contexts)
- Adopt something like what was proposed in PR #120695 ("uncommon_codepoints: lint against 00B7 MIDDLE DOT in final position"): continue hard-rejecting 00B7 as an initial character, lint against its occurrence as a final character, and silently accept it as a "medial" character
- Something more aggressive than PR #120695, like linting against 00B7 in all contexts (except perhaps when it occurs in between two L's, to accommodate Catalan, as suggested by Manish)
- Other options? (We probably don't get any benefit from deviating far from Unicode committee recommendations, so we probably do not want to start accepting 00B7 as an initial character)
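To make the distinctions in these options concrete, here is a rough sketch of how the positions of 00B7 within an identifier could be classified. It is not how rustc implements anything; `DotPosition`, `middle_dot_positions`, and `is_l` are hypothetical names, and treating only ASCII `l`/`L` as "an L" is an assumption made for the illustration.

```rust
/// U+00B7 MIDDLE DOT, the code point under discussion.
const MIDDLE_DOT: char = '\u{00B7}';

/// Where a middle dot sits inside an identifier.
/// (Hypothetical classification, for illustration only.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum DotPosition {
    /// First character: hard-rejected by the lexer today.
    Initial,
    /// Between two L's, as in Catalan "COL·LECCIÓ".
    MedialBetweenLs,
    /// Any other interior position.
    Medial,
    /// Last character: what PR #120695 proposed linting against.
    Final,
}

/// Assumption: only ASCII `l` / `L` count as "an L" in this sketch.
fn is_l(c: char) -> bool {
    c == 'l' || c == 'L'
}

/// Returns the position class of every middle dot in `ident`, in order.
fn middle_dot_positions(ident: &str) -> Vec<DotPosition> {
    let chars: Vec<char> = ident.chars().collect();
    let mut out = Vec::new();
    for (i, &c) in chars.iter().enumerate() {
        if c != MIDDLE_DOT {
            continue;
        }
        let pos = if i == 0 {
            DotPosition::Initial
        } else if i + 1 == chars.len() {
            DotPosition::Final
        } else if is_l(chars[i - 1]) && is_l(chars[i + 1]) {
            DotPosition::MedialBetweenLs
        } else {
            DotPosition::Medial
        };
        out.push(pos);
    }
    out
}

fn main() {
    // The four identifiers from the report, as plain strings.
    for ident in ["COL·LECCIÓ", "·START", "MID·DLE", "END·"] {
        println!("{ident}: {:?}", middle_dot_positions(ident));
    }
}
```

Under this sketch, the first option corresponds to rejecting only `Initial`; the second adds a lint on `Final`; the third would also lint on `Medial`, possibly leaving `MedialBetweenLs` alone.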
Rust Version
Stable channel
Build using the Stable version: 1.76.0