Since first writing this issue, all the following have come to our attention. Originally only the first point was written here:
1. `&[char]` doesn't provide O(1) indexing on characters, but on codepoints. A single character can be multiple codepoints (see also `char.to_uppercase`, which can output one or more chars!). In this case `&[&str]` (or equivalent) is more appropriate.
2. `&[char]` also doesn't provide O(1) indexing on columns, for some definition of "column". A single column can be multiple codepoints.
3. There is confusion about what `&[char]` actually means in a `Pattern`. As an example, `"___abc___".split(&['a', 'b', 'c'][..]).collect::<Vec<_>>()` makes `["___", "", "", "___"]` but some may expect it to make `["___", "___"]`.
   - One way someone may expect this is as Pattern for `&[char]` being UTF-32. It is not. It is well-documented that Rust doesn't support UTF-32 in std.
   - Another way is "any combination of these chars". It is also not, as that would break `str.strip_prefix`.
4. Even if there wasn't such confusion, `&[char]` may be slower than a hypothetical naive `&[&str]` for a small enough n, as the former requires encoding/decoding UTF-8 whereas the latter can rely on UTF-8 being a self-synchronizing prefix-free code and just match bytes.
5. If you actually wanted to use an `&[char]` as a `Pattern`, but you wanted to uppercase the chars for some reason... well, see (1).
6. For large n, `&[char]: Pattern` is slow. You should under no circumstances use `('a'..='z').collect::<Vec<_>>().as_slice()` as a `Pattern`. Use `|c| matches!(c, 'a'..='z')` instead.
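
For anyone who wants to check the `Pattern` behaviour described above, here is a minimal sketch using only std. The helper names (`split_on_any`, `strip_one`, `uppercase_len`, `split_on_lowercase`) are made up for illustration and are not part of any proposed lint or API.

```rust
// Hypothetical helpers, named here only for illustration.

/// Splits `s` using a `&[char]` pattern. The pattern matches any ONE of the
/// listed chars per match, so adjacent matches leave empty strings between them.
fn split_on_any<'a>(s: &'a str, set: &[char]) -> Vec<&'a str> {
    s.split(set).collect()
}

/// Strips a prefix using a `&[char]` pattern. At most one char is removed,
/// not "any combination of these chars".
fn strip_one<'a>(s: &'a str, set: &[char]) -> Option<&'a str> {
    s.strip_prefix(set)
}

/// Counts how many chars `char::to_uppercase` yields for `c`.
fn uppercase_len(c: char) -> usize {
    c.to_uppercase().count()
}

/// Splits on ASCII lowercase letters with a closure pattern instead of
/// collecting `'a'..='z'` into a `Vec<char>` first.
fn split_on_lowercase(s: &str) -> Vec<&str> {
    s.split(|c: char| matches!(c, 'a'..='z')).collect()
}

fn main() {
    // The example from the issue text: three adjacent matches, two empty strings.
    assert_eq!(
        split_on_any("___abc___", &['a', 'b', 'c']),
        ["___", "", "", "___"]
    );

    // Only the leading 'a' is stripped.
    assert_eq!(strip_one("abc___", &['a', 'b', 'c']), Some("bc___"));

    // One char in, two chars out: uppercasing does not keep a 1:1 mapping.
    assert_eq!(uppercase_len('ß'), 2);

    // The closure pattern behaves like the slice pattern, one char per match.
    assert_eq!(split_on_lowercase("12ab34cd"), ["12", "", "34", "", ""]);
}
```

The closure version gives the same one-char-per-match semantics as the slice, so swapping one for the other should not change results, only how each char is tested.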
As such, there are a whole lot of issues with `&[char]` in practice. We think it's a good idea to lint against it.