Some emoji have incorrect width #57
Comments
Walrus operators and f-strings mean it can't run under Python 2 (or even Python 3.7). I have cleaned up its Python 2 compat code in #58. In my opinion, Python 2 compat is useless for |
That's correct: bin/update-tables.py is not meant to be Python 2 compatible, and it is not distributed as part of the package.
Closes #88 The implementation of wcswidth is taken from the corresponding Python library (https://github.com/jquast/wcwidth) which seems to have the most updated list of wide characters. However, note that there are still some emoji that aren't recognised correctly; see e.g. jquast/wcwidth#57. At some point in time, the wcswidth implementation should be refactored into its own library.
Major
-----

Bugfix zero-width characters, closes #57, #47, #45, #39, #26, #25, #24, #22, #8, wow!

This is mostly achieved by replacing `ZERO_WIDTH_CF` with dynamic parsing by Category codes in bin/update-tables.py and putting those in the zero-wide tables.

Tests
-----

- `verify-table-integrity.py` exercises a "bug" of duplicated tables that has no effect, because wcswidth() first checks for zero-width, and that is preferred in cases of conflict. This PR also resolves that error of duplication.
- new automatic tests for Balinese, Korean jamo, zero-width emoji, Devanagari, Tamil, Kannada.
- added pytest-benchmark plugin, example use:

      # baseline
      tox -epy312 -- --verbose --benchmark-save=original
      # compare
      tox -epy312 -- --verbose --benchmark-compare=.benchmarks/Linux-CPython-3.12-64bit/0001_original.json
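To illustrate why a code point duplicated across tables has no visible effect when the zero-width check runs first, here is a minimal sketch. The tables, `in_table`, and `cell_width` are hypothetical names with hand-written ranges invented for this example; they are not the generated tables or the actual lookup code of the library.

```python
# Hypothetical, hand-written tables of inclusive (start, end) code point ranges.
# They are NOT the library's generated tables.
ZERO_WIDTH = [(0x0300, 0x036F)]
# The first range is deliberately duplicated from ZERO_WIDTH to mimic a conflict.
WIDE = [(0x0300, 0x0301), (0x4E00, 0x9FFF)]


def in_table(ucs, table):
    """Binary search over sorted, non-overlapping inclusive ranges."""
    lo, hi = 0, len(table) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start, end = table[mid]
        if ucs < start:
            hi = mid - 1
        elif ucs > end:
            lo = mid + 1
        else:
            return True
    return False


def cell_width(ch):
    """Return 0, 2, or 1, checking the zero-width table before the wide table."""
    ucs = ord(ch)
    if in_table(ucs, ZERO_WIDTH):
        return 0  # zero-width wins any conflict because it is checked first
    if in_table(ucs, WIDE):
        return 2
    return 1
```

With this ordering, `cell_width("\u0300")` resolves to the zero-width answer even though the same code point also appears in the wide table, which is why the duplication is harmless but still worth removing.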
Hi, thanks for your work on this project. It's been invaluable!
According to this document emoji presentation sequences should be treated as "East Asian Wide".
When `wcwidth` reads in the `EastAsianWide.txt` file, it discards all the emoji presentation sequences it finds, rather than treating them as being wide (since it discards everything without `W` or `F` properties).

The full list of 353 emojis affected is available at:
https://unicode.org/emoji/charts/emoji-variants.html

`wcwidth` will report all of the emoji in the above list as having width 1 instead of width 2.

I would be happy to PR this, but I'm not sure the `master` branch is clean - I noticed some walrus operators etc. despite my understanding being that this project is 2.7 compatible.
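The behaviour described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: `is_wide_by_property` and `width_with_vs16` are hypothetical helpers, and the sketch widens any narrow character followed by U+FE0F (VARIATION SELECTOR-16) without checking that the pair is one of the listed emoji variation sequences.

```python
import unicodedata

VS16 = "\ufe0f"  # VARIATION SELECTOR-16, requests emoji presentation


def is_wide_by_property(ch):
    """True when the East Asian Width property of ch is W (Wide) or F (Fullwidth)."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


def width_with_vs16(text):
    """Sum cell widths, widening a narrow character to 2 when VS16 follows it.

    Simplification (assumption): combining marks and other zero-width
    characters are not handled, and the base character is not validated
    against the emoji variation sequence list.
    """
    total = 0
    last = 0  # width contributed by the previous base character
    for ch in text:
        if ch == VS16:
            if last == 1:
                total += 1  # promote the preceding narrow character to wide
                last = 2
            continue  # the selector itself occupies no cell
        last = 2 if is_wide_by_property(ch) else 1
        total += last
    return total


heart = "\u2764"  # HEAVY BLACK HEART, narrow by its East Asian Width property
print(is_wide_by_property(heart))
print(width_with_vs16(heart), width_with_vs16(heart + VS16))
```

A filter that keeps only `W` and `F` entries never sees the heart as wide, so the extra width has to come from recognising the trailing selector, which is the gap this issue reports.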