-
Notifications
You must be signed in to change notification settings - Fork 79
Bugfix: Variation Selector-16/ZWJ and starting sequences in wrap() #338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cd8348c to
8f981af
Compare
8f981af to
ae83d7c
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #338 +/- ##
==========================================
+ Coverage 98.42% 98.43% +0.01%
==========================================
Files 11 11
Lines 2604 2627 +23
Branches 463 468 +5
==========================================
+ Hits 2563 2586 +23
Misses 31 31
Partials 10 10 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I found a bug in our implementation I will try to address in this branch, sequences are sometimes lost in rare cases. Found while developing a blessed-free version of this to be included in the wcwidth library, jquast/wcwidth#169 |
|
I have tested a "non-blessed" version of our wrap(), ljust(), rjust(), and center() methods for inclusion in upstream library, wcwidth
The above version is approximately 4-5x faster than blessed in my testing (on UDHR data), because it is not instantiating so many Sequence() temporarily and pre-processing with .padd(), but, instead running iterators for graphemes and sequences and tracking lots of indicies. Though alike, they are also a bit different in approaches. The use of grapheme clustering should fix line breaking on some languages that we may not be familiar with enough to test properly. If they are accepted to wcwidth library, we should probably just bump our wcwidth requirement in blessed and outsource the logic to the wcwidth library and maintain it there, it will probably get more attention and help there. Some key differences,
|
Closes #267.
Support emojis sequences, we already had automatic tests for them, but they have been adjusted to match that they are now "correct".
There was a long-standing bug, that if the first line of text sent to textwrap() begins with a sequence, that the sequence was lost!
Also match the "ST"-terminated variant of OSC 8 (hyperlink), rarely used, minor.
For "truncate" method, we just update it to the more complex technique used in wcwidth.wcswidth() to account for ZWJ and VS-16 and add covering tests.
For the other, we now use a translation table to remove any possible C0 or C1 control characters before calling
wcwidth.wcswidth()directly to prevent defense against erroneous -1 return values, so we can call wcswidth directly, which fixes measurement of ZWJ and VS16.