Investigate performance impact of rearranging "can combine backwards" bit #4967

Open
hsivonen opened this issue May 30, 2024 · 1 comment
Labels
2.0-breaking: Changes that are breaking API changes
A-performance: Area: Performance (CPU, Memory)
C-collator: Component: Collation, normalization

Comments

@hsivonen
Member

For characters that are their own decomposition, the least significant bit signifies "can combine backwards". As of Unicode 16, this information is also needed for complex decompositions, but the least significant bit was already taken there, so the second-least-significant bit is used instead (by #4860).

Investigate the performance impact of flipping around the two bit allocations for complex decompositions and unifying the "can combine backwards" bit check.
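
To make the proposal concrete, here is a minimal sketch of the two schemes. It is an illustration under assumptions, not the actual ICU4X value format: the constant names, the `u32` value type, the `is_complex` parameter, and all function names are hypothetical. Only the bit positions (bit 0 for self-decomposing characters, bit 1 for complex decompositions) come from the description above.

```rust
// Hypothetical layout for illustration only; not the real ICU4X data format.

/// Bit 0: "can combine backwards" for characters that are their own decomposition.
const SELF_BACKWARD_COMBINING: u32 = 0b01;
/// Bit 1: "can combine backwards" for complex decompositions,
/// where bit 0 already carries a different meaning.
const COMPLEX_BACKWARD_COMBINING: u32 = 0b10;

/// Current scheme: the bit to test depends on the kind of value.
/// `is_complex` stands in for however the caller tells the two kinds apart.
fn can_combine_backwards_current(value: u32, is_complex: bool) -> bool {
    let mask = if is_complex {
        COMPLEX_BACKWARD_COMBINING
    } else {
        SELF_BACKWARD_COMBINING
    };
    (value & mask) != 0
}

/// Proposed scheme: after flipping the two low bits of complex values,
/// bit 0 means "can combine backwards" for both kinds, so one check suffices.
fn can_combine_backwards_unified(value: u32) -> bool {
    (value & 1) != 0
}

/// Converts a complex value from the current layout to the flipped layout
/// by swapping bits 0 and 1 and leaving all higher bits untouched.
fn flip_low_bits(value: u32) -> u32 {
    let b0 = value & 1;
    let b1 = (value >> 1) & 1;
    (value & !0b11) | (b0 << 1) | b1
}
```

Whether the unified check is measurably faster than the branch on value kind is exactly what this issue asks to investigate; the sketch only shows that the two schemes carry the same information.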

@hsivonen hsivonen added A-performance Area: Performance (CPU, Memory) C-collator Component: Collation, normalization 2.0-breaking Changes that are breaking API changes labels May 30, 2024
@sffc sffc added this to the ICU4X 2.0 milestone May 30, 2024
@sffc
Member

sffc commented May 30, 2024

This seems beneficial to do in 2.0. Anyone can take this, and @hsivonen has left enough of a trail. Perhaps @echeran?

Projects
Status: Investigate
Development

No branches or pull requests

2 participants