Bugfix accounting for Variation Selector 16 #97
Merged
Closes #96

- Add a new table, `VS16_NARROW_TO_WIDE`. It has only one version, "9.0.0". It defines the set of characters that are otherwise Narrow, like '0', but become Wide when combined with `U+FE0F`, "VARIATION SELECTOR 16".
- The `wcwidth.wcswidth()` function now tracks the "last measured character". On `U+FE0F`, it looks up that character in the table `VS16_NARROW_TO_WIDE` and, if it matches, adds 1 to the measured width.
- Add `verify-table-integrity.py`. This is an unrelated file from previous work in #91 that should have been included there.
- The latest 'emoji-zwj-sequences.txt' and 'emoji-variation-sequences.txt' files are fetched by update-tables.py, placed in the 'tests/' folder, and now used by the automatic tests in test_emoji_zwj.py. This helps ensure 100% compatibility with all of the latest known emoji sequences.

Note: A single "9.0.0" version is used because of ambiguity in legacy releases of the emoji variation sequences files. They are so ambiguous that very few terminals get it right! Details are documented in update-tables.py, and I will share results from the 'ucs-detect' project shortly.

I believe that U+FE0F is something of a "fixup" for early emojis. I don't expect any new U+FE0F sequences to be published; there have been no changes since release 10.0.
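The "last measured character" rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this branch: `VS16_NARROW_TO_WIDE_SKETCH` is a hypothetical, heavily abbreviated stand-in for the generated table, and `base_width()` is a simplified substitute for `wcwidth.wcwidth()` built on the standard `unicodedata` module.

```python
import unicodedata

# Hypothetical, abbreviated stand-in for VS16_NARROW_TO_WIDE["9.0.0"]:
# inclusive (first, last) code point ranges. The real table is generated
# by update-tables.py and is much longer.
VS16_NARROW_TO_WIDE_SKETCH = (
    (0x0023, 0x0023),  # NUMBER SIGN
    (0x002A, 0x002A),  # ASTERISK
    (0x0030, 0x0039),  # DIGIT ZERO..DIGIT NINE
    (0x00A9, 0x00A9),  # COPYRIGHT SIGN
    (0x00AE, 0x00AE),  # REGISTERED SIGN
    (0x2764, 0x2764),  # HEAVY BLACK HEART
)

VS16 = "\uFE0F"  # VARIATION SELECTOR 16


def in_table(ucs, table):
    """Return True if code point `ucs` falls inside any (first, last) range."""
    return any(first <= ucs <= last for first, last in table)


def base_width(char):
    """Simplified per-character width (assumption: not the library's logic)."""
    if unicodedata.category(char) in ("Mn", "Me", "Cf"):
        return 0  # combining marks and format characters occupy no cells
    if unicodedata.east_asian_width(char) in ("W", "F"):
        return 2  # Wide and Fullwidth characters occupy two cells
    return 1


def sketch_wcswidth(text):
    """Measure `text`, widening a narrow character that is followed by VS16."""
    width = 0
    last_measured = None  # most recent character that contributed width
    for char in text:
        if char == VS16:
            if last_measured is not None:
                if in_table(ord(last_measured), VS16_NARROW_TO_WIDE_SKETCH):
                    width += 1  # narrow (1 cell) becomes wide (2 cells)
                last_measured = None  # a second VS16 must not widen again
            continue
        char_width = base_width(char)
        width += char_width
        if char_width > 0:
            last_measured = char
    return width


print(sketch_wcswidth("0"))        # 1
print(sketch_wcswidth("0\uFE0F"))  # 2
print(sketch_wcswidth("a\uFE0F"))  # 1, because 'a' is not in the table
```

Resetting `last_measured` after each `U+FE0F` is a choice made in this sketch so that repeated selectors cannot inflate the width more than once.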
This was tested with an experimental branch of ucs-detect. Draft results show 100% support for VS-16, as implemented in this branch, in ExtraTermQt, kitty, and zoc, compared with 91% support in iTerm2 and 88% in cool-retro-term. I think those two fall into the area of ambiguity that I have documented.
Testing results for this branch are published at https://ucs-detect.readthedocs.io/results.html. 7 of 25 terminals have VS-16 support: Konsole, iTerm2, Kovid Goyal's kitty, Terminal.exe, Zoc, ExtratermQt, and cool-retro-term. 2 have partial support: cmd.exe and ConsoleZ.
This pull request also fixes an issue with the codecov.io token.