support variation selectors, ZWJ sequences and surrogate pairs in length calculation#2082
Conversation
|
@microsoft-github-policy-service agree |
|
Sorry, but I have a bit difficulty understanding these changes. I appreciate the PR but all logic for Unicode calculation should go into the wcwidth library. Why use UnicodeCalculator on net6 and above and a complete different approach otherwise? |
|
the problem is the length counting in char by char. the easy fix would have been to fix the one-by-one counting to detect the "wide" and change the result but it would have been a single fix only. this approach now fixes more cases. |
|
The string overload in UnicodeCalculator does this already afaik. |
|
I appreciate it, but I would rather get the calculation correct in wcwidth to be honest since it's used by other projects as well (the reason for it existing). I'll take a look at this PR as soon as I can. |
…airs in length calculation ( fixes spectreconsole#1847 )
8ae634d to
fbeae03
Compare
| public static int GetCellLength(ReadOnlySpan<char> text) | ||
| public static int GetCellLength(string text) | ||
| { | ||
| #if !NETSTANDARD2_0 |
There was a problem hiding this comment.
this obviosly is now not fixed for .net standard 2.0. UnicodeCalculator.GetWidth(string) is not available for .net standard?
There was a problem hiding this comment.
You are completely correct. I look forward to drop netstandard2.0 at some point (around the 2.0 release of Spectre.Console). For now, I'm OK with having two different implementations (and potentially erroneous rendering on .NET Framework). What do you think?
|
i reworked it to use the UnicodeCalculator.getstring more, this makes it a bit easier now! |
|
Ok, I've looked through it, and I think I understand the changes now. Will take it for a spin after I've merged another emoji-related bug tonight. |
patriksvensson
left a comment
There was a problem hiding this comment.
I think this looks good. No tests are failing and it solves the issue. Some minor nit-pick thing that needs to be fixed though.
|
Hi i am not sure what to do right now. i am fine with the change of extra brackets but i do not see an accept button anywhere? do i need to put this change in my branch? i see no simple accept button anywhere? |
|
I already took care of it 🙂 |
|
Merged! Thank you for your contribution. Much appreciated! 👍 |

Fixes #1847
AI
I am working with unicode, windows-1252 and codepages as part of my day job for quite some time. I used Claude Code here as a tool to write faster. I checked every little change myself.
Changes
src/Spectre.Console/Internal/Cell.csGetCellLength(string): NET6+ → delegates toUnicodeCalculator.GetWidth(string)which already handles grapheme clusters correctly viaEnumerateRunes();netstandard2.0 → iterates withStringInfo.GetTextElementEnumeratorGetCellLength(ReadOnlySpan<char>): now delegates to the string path instead of iterating char-by-charprivate GetClusterCellLength(string): FE0F → 2, ZWJ → 2, surrogate pair → code point width lookupsrc/Spectre.Console/Rendering/Segment.csSplit(): char-foreach replaced with StringInfo enumerator — prevents splits in the middle of a grapheme clusterTruncate(): same changeSplitSegment(): same changesrc/Spectre.Console.Tests/Unit/CellTests.cs(new)src/Spectre.Console.Tests/Unit/Widgets/Table/TableTests.csShould_Render_Table_With_Wide_Emoji_CorrectlyPlease upvote 👍 this pull request if you are interested in it.