Warn that Chars iterator does not iterate "characters" #26689

Closed
@kornelski

Description

The Chars iterator iterates over Unicode scalar values, but when people think about "characters" they usually mean something closer to what Unicode calls "grapheme clusters".

This leads to surprising results and thus potential errors, for example:

"éé".chars().count() == 3
"🇺🇸".chars().count() == 2

I suggest adding a warning to the documentation that this iterator does not iterate over "characters" in that sense, and that users should consider using the UnicodeSegmentation::graphemes iterator from the unicode-segmentation crate instead.
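
To make the counts in the example reproducible regardless of how the text was normalized when copied, here is a minimal std-only sketch with explicit escapes. It assumes the first string mixes a precomposed é (U+00E9) with a decomposed one (e followed by U+0301), since that is the only combination of two é that yields 3. The helper name scalar_count is hypothetical and exists only for illustration.

```rust
/// Counts Unicode scalar values, which is what `str::chars` yields.
/// (Hypothetical helper, not part of the standard library.)
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    // Assumption: precomposed é (U+00E9) followed by decomposed é (e + U+0301).
    // Both render as "é", so a reader sees two "characters".
    let mixed = "\u{e9}e\u{301}";

    // Two regional indicator symbols (U+1F1FA, U+1F1F8) that render as one flag.
    let flag = "\u{1F1FA}\u{1F1F8}";

    // Scalar value counts match neither the visible "characters" ...
    assert_eq!(scalar_count(mixed), 3);
    assert_eq!(scalar_count(flag), 2);

    // ... nor the UTF-8 byte lengths: 2 + 1 + 2 bytes and 4 + 4 bytes.
    assert_eq!(mixed.len(), 5);
    assert_eq!(flag.len(), 8);

    // List the individual scalar values that `chars` yields for the first string.
    for c in mixed.chars() {
        println!("U+{:04X}", c as u32);
    }
}
```

The standard library has no grapheme segmentation, which is why the sketch stops at scalar values; counting what a reader perceives as characters would need something like the graphemes iterator mentioned above.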
