-
Notifications
You must be signed in to change notification settings - Fork 136
Fix UTF-16 autocompletion #1129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix UTF-16 autocompletion #1129
Conversation
1c50bc8 to
2759a0a
Compare
0550394 to
795b975
Compare
|
Thank you so much for your valuable advice @tompng 🙏 |
6ed23be to
a5d05eb
Compare
|
It seems that I can't re-run the build? |
|
I schedule a rerun of that job, it's a known issue and I'm working on a fix for it. |
a5d05eb to
b648d18
Compare
|
Looking at the code, looks like the change is addressing the issue with utf16? Is that's the case, can you update both commit message and the PR title? |
b648d18 to
84663b1
Compare
It is indeed 😆 |
lib/irb/completion.rb
Outdated
| # Remove BOM (U+FEFF) which may be preserved when converting from UTF-16 | ||
| converted.delete("\uFEFF") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need this. The original issue is about utf-16, but another realistic case is other ascii-compatible encoding method defined by code with magic comments.
#encoding: sjis
def a【codepoint=825C letter here】; endSo we don't need to consider UTF-16 specific case.
And It looks like BOM is not preserved.
'😄'.encode(Encoding::UTF_16)
# => "\xFE\xFF\xD8\x3D\xDE\x04" (has BOM)
'😄'.encode(Encoding::UTF_16).encode(Encoding::UTF_8).chars
# => ["😄"] (BOM is removed)Other part of this pull request looks good 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is true for all tested Ruby implementations except TruffleRuby.
However, TruffleRuby adds the BOM in the beginning of the string after a conversion.
Benoit would rather see that Ruby would generally prevent such method definitions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, but I don't think this should be in lib code.
Can you remove this and omit truffleruby in the added test, just like test_regexp_completor_handles_encoding_errors_gracefully does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. That makes sense. 🙂
Thank you, I changed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to use Encoding::UTF_16LE/BE for testing, Encoding::UTF_16 is basically deprecated and a dummy encoding.
Even better would be to use something more realistic than UTF-16, since that's not even valid source encoding: #52 (comment)
So I think testing with SJIS as in @tompng's example is much more realistic and useful, using UTF-16 is basically a user mistake in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a realistic test case looks good. I'd like to check for some existing test that use UTF16LE, encoding-invalid method and add/change in a followup pull request.
84663b1 to
fed1886
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍
Thank you
fixes #52
PS: I'm looking for a new adventure in case anybody is looking to hire or work with a Ruby/Rails/Crystal dev