Skip to content

Conversation

@clipperhouse
Copy link
Owner

@clipperhouse clipperhouse commented Oct 11, 2025

After working to have displaywidth always return identical widths compared to go-runewidth, I’ve come to believe it’s better to choose simplicity, consistency and (IMHO) more correctness.

Copilot AI review requested due to automatic review settings October 11, 2025 22:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR removes the compatibility layer with go-runewidth and adopts a more consistent approach to character width calculation. The change simplifies the codebase by removing complex compatibility tests and exceptions that were previously needed to match go-runewidth behavior.

Key changes:

  • Removed extensive comparison tests between displaywidth and go-runewidth implementations
  • Simplified the test suite by replacing compatibility tests with a focused compatibility test
  • Updated property naming conventions to be more descriptive and consistent

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
width_test.go Removed comprehensive compatibility tests and replaced with simplified compatibility test focusing on known differences
width.go Updated property names from _EAW_* to _East_Asian_* format and added surrogate character handling
internal/gen/unicode.go Refactored property definitions to use a centralized approach and updated naming conventions
internal/gen/triegen/triegen.go Fixed return type generation to use proper integer types instead of hardcoded property type
internal/gen/trie.go Updated trie generation to use new property definitions and improved type handling
README.md Updated benchmarks and added compatibility section documenting differences with go-runewidth
AGENTS.md Updated development guidelines and removed references to compatibility maintenance
.changeset.md Added documentation of changes related to Unicode Cf category usage

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Repository owner deleted a comment from cursor bot Oct 11, 2025
@clipperhouse
Copy link
Owner Author

Review Summary

This PR makes a well-reasoned architectural decision to prioritize Unicode correctness over strict compatibility. The changes are generally well-executed with good documentation.

✅ Strengths

  1. Clear rationale: Using complete Unicode categories (Cf, Mn, Zl, Zp) aligns better with Unicode standards
  2. Improved code organization: Centralized reduces duplication
  3. Better documentation: README clearly documents compatibility differences
  4. Good additions: Surrogate handling (U+D800-U+DFFF) and explicit zero-width categories

⚠️ Key Considerations

  1. Breaking change: This diverges from go-runewidth behavior. Consider:

    • Adding migration guide to README
    • Appropriate semantic versioning (v2.0.0?)
  2. Performance impact: The trie expanded significantly. Recommend running benchmarks to verify no regression.

  3. Mc (Spacing Mark) handling: The exclusion of Mc from combining marks (giving them width 1) could use more documentation/justification.

  4. Noncharacters: Currently only handles U+FFFE and U+FFFF, but Unicode defines 66 total noncharacters. Should either handle all or document why only these two.

  5. Test coverage: The new is good, but could add specific test cases for each Unicode category mentioned in the README (Mn, Cf, Mc, Cs, Zl, Zp).

Minor Issues

  • Typo in AGENTS.md line 25: "attemped" → "attempted"

Recommendation

Approve with minor changes. The decision to prioritize correctness is defensible and well-documented. Main risks are breaking changes for existing users and potential performance impact—both manageable with proper communication and verification.

@clipperhouse clipperhouse merged commit 9ea1c98 into main Oct 11, 2025
8 checks passed
@clipperhouse clipperhouse deleted the drop-the-compat branch October 11, 2025 23:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant