Description
Hey man, I'm the author of alive-progress. I'm struggling to correctly support emojis in rsalmei/alive-progress#19, and I think this project could help me.
- Do you intend to keep this project updated for every new Unicode version?
- Performance doesn't really matter to me, since I've implemented a spinner compiler just for this, but yours seems to be fast anyway. It doesn't use any binary extension, does it? I'm asking because there's a `cython` folder with a few .c files in my `site-packages`...
- I'm only interested in correctness, and this one seems very nice. I've created a brute-force test using the `emoji-test.txt` from unicode.org, and while testing several combinations of emojis, yours has only failed on the Fitzpatrick skin tone modifiers when used alone (but the Unicode spec states that they should be treated as a normal emoji when used alone):
My brute-force validation ensures all chars listed in that file are detected, even when concatenated with other chars. You can see in the image that it fails where: 1. two skin tones are used one after the other (I expected two graphemes, not one); 2. an ASCII char is followed by a skin tone and another ASCII char (I expected three graphemes, not the skin tone attached to the first ASCII char); and 3. two ASCII chars are followed by a skin tone (same problem as 2).
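For reference, the validation goes roughly along these lines. This is only a minimal sketch, not the actual test: `SAMPLE`, `parse_emoji_test`, `build_cases` and `failures` are illustrative names, the sample holds three lines in the `emoji-test.txt` format instead of the full file, and `count_graphemes` is a stand-in for whatever grapheme counter is being checked. The expected counts encode my expectations described above.

```python
from typing import Callable, Iterator, List, Tuple

# A few lines in the emoji-test.txt format: "code points ; status # comment".
# The real file from unicode.org is much longer; this sample is an assumption.
SAMPLE = """\
# group: Smileys & Emotion
1F600 ; fully-qualified # E1.0 grinning face
1F44B 1F3FD ; fully-qualified # E1.0 waving hand: medium skin tone
1F3FB ; component # E1.0 light skin tone
"""


def parse_emoji_test(text: str) -> Iterator[str]:
    """Yield one string per data line, built from its hex code points."""
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop comments
        if not data:
            continue  # blank line or comment-only line
        codepoints = data.split(";", 1)[0].split()
        yield "".join(chr(int(cp, 16)) for cp in codepoints)


def build_cases(emoji: str) -> List[Tuple[str, int]]:
    """Return (text, expected grapheme count) pairs for one emoji."""
    return [
        (emoji, 1),              # alone
        (emoji + emoji, 2),      # repeated
        ("a" + emoji + "b", 3),  # between two ASCII chars
        ("ab" + emoji, 3),       # after two ASCII chars
    ]


def failures(count_graphemes: Callable[[str], int],
             text: str = SAMPLE) -> List[Tuple[str, int, int]]:
    """Collect (text, expected, got) for every case the counter gets wrong."""
    failed = []
    for emoji in parse_emoji_test(text):
        for case, expected in build_cases(emoji):
            got = count_graphemes(case)
            if got != expected:
                failed.append((case, expected, got))
    return failed


if __name__ == "__main__":
    # Plain len() counts code points, so it is only a placeholder here.
    for case, expected, got in failures(len):
        print(repr(case), "expected", expected, "got", got)
```

Passing the library's grapheme-counting function instead of `len` gives the list of failing combinations.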
But that's OK, it works in the vast majority of cases (and the regex dependency showed the same results).
So, I'm thinking now about how to continue my wide chars/emoji support:
- include your project as a dependency;
- include regex as a dependency (but it does have a binary extension, so I'd rather not);
- implement my own regexp to detect graphemes (here I would not need actual sequence validation, just the few formats, but it's not that easy anyway).
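To make the third option concrete, here is a rough sketch of what I mean by "just the few formats", using only the stdlib `re` module. Everything in it is an assumption for illustration: the names (`BASE`, `ELEMENT`, `GRAPHEME`, `split_graphemes`) are made up, the `BASE` ranges are a coarse approximation of "could be an emoji" rather than the real Unicode property, and tag sequences (subdivision flags) and general combining marks are not handled. It is not a UAX #29 implementation.

```python
import re
from typing import List

VS16 = "\uFE0F"                          # emoji presentation selector
ZWJ = "\u200D"                           # zero width joiner
MODIFIER = "[\U0001F3FB-\U0001F3FF]"     # Fitzpatrick skin tone modifiers
REGIONAL = "[\U0001F1E6-\U0001F1FF]"     # regional indicators (flags)
# Coarse approximation of an emoji base; deliberately excludes the modifiers.
BASE = "[\u00A9\u00AE\u203C-\u3299\U0001F000-\U0001F3FA\U0001F400-\U0001FAFF]"

# One element: base, optional presentation selector, optional skin tone.
ELEMENT = f"{BASE}{VS16}?{MODIFIER}?"

GRAPHEME = re.compile(
    f"{REGIONAL}{{2}}"                   # flag: a pair of regional indicators
    f"|[0-9#*]{VS16}?\u20E3"             # keycap sequence
    f"|{ELEMENT}(?:{ZWJ}{ELEMENT})*"     # single emoji or ZWJ sequence
    "|.",                                # anything else: one char, one grapheme
    re.DOTALL,
)


def split_graphemes(text: str) -> List[str]:
    """Split text into the (approximate) graphemes matched above."""
    return [m.group() for m in GRAPHEME.finditer(text)]


if __name__ == "__main__":
    samples = [
        "a\U0001F3FBb",                  # ASCII, lone skin tone, ASCII
        "\U0001F44B\U0001F3FD",          # waving hand + skin tone
        "\U0001F468\u200D\U0001F469\u200D\U0001F467",  # family ZWJ sequence
    ]
    for sample in samples:
        print(len(split_graphemes(sample)), split_graphemes(sample))
```

Since the modifiers are excluded from `BASE`, a skin tone that does not follow an emoji falls through to the last alternative and counts as its own grapheme, which is the behavior I expected in the failing cases above.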
Thank you man!