Fix UTF-16 autocompletion #1129

alexanderadam · 2025-10-21T10:14:20Z

fixes #52

PS: I'm looking for a new adventure in case anybody is looking to hire or work with a Ruby/Rails/Crystal dev

alexanderadam · 2025-10-26T23:04:32Z

Thank you so much for your valuable advice @tompng 🙏

alexanderadam · 2025-10-27T10:46:39Z

It seems that I can't re-run the build?
Or at least I can't find how.
TruffleRuby failed on the CI while bundling and I created an issue for that on the TruffleRuby repo.

eregon · 2025-10-27T10:54:21Z

I scheduled a rerun of that job. It's a known issue and I'm working on a fix for it.

st0012 · 2025-10-27T11:33:55Z

Looking at the code, it looks like the change is addressing the issue with UTF-16? If that's the case, can you update both the commit message and the PR title?

alexanderadam · 2025-10-27T12:01:38Z

Looking at the code, it looks like the change is addressing the issue with UTF-16? If that's the case, can you update both the commit message and the PR title?

It is indeed 😆
I changed both now to be correct.

tompng · 2025-10-27T15:29:18Z

lib/irb/completion.rb

+            # Remove BOM (U+FEFF) which may be preserved when converting from UTF-16
+            converted.delete("\uFEFF")


I think we don't need this. The original issue is about UTF-16, but another realistic case is a method whose name is in some other ASCII-compatible encoding, defined by code with a magic comment.

#encoding: sjis
def a【codepoint=825C letter here】; end

So we don't need to handle the UTF-16-specific case.
And it looks like the BOM is not preserved.

'😄'.encode(Encoding::UTF_16) # => "\xFE\xFF\xD8\x3D\xDE\x04" (has BOM)
'😄'.encode(Encoding::UTF_16).encode(Encoding::UTF_8).chars # => ["😄"] (BOM is removed)

The rest of this pull request looks good 👍
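For illustration, here is a minimal sketch of the situation described above: a method whose name is in an ASCII-compatible non-UTF-8 encoding, and completion candidates converted to UTF-8 with unconvertible names dropped rather than raising. This is not the code from this pull request. The helper `utf8_candidate` is a hypothetical name, and the method is defined at runtime instead of through a magic comment, so the example stays in one UTF-8 file.

```ruby
# Hypothetical helper (an assumption, not IRB's actual code): convert a
# candidate name to UTF-8 and return nil when the conversion is impossible.
def utf8_candidate(name)
  name.to_s.encode(Encoding::UTF_8)
rescue EncodingError
  # EncodingError is the superclass of Encoding::UndefinedConversionError,
  # Encoding::InvalidByteSequenceError and Encoding::CompatibilityError.
  nil
end

# Define a singleton method whose name is Shift_JIS-encoded, similar to what
# a source file with an `# encoding: sjis` magic comment would produce.
sjis_name = "aあ".encode(Encoding::Shift_JIS)
receiver = Object.new
receiver.define_singleton_method(sjis_name.to_sym) { :ok }

# Convert every candidate, dropping the ones that cannot be represented,
# then match against the (UTF-8) text typed so far.
candidates = receiver.singleton_methods.filter_map { |m| utf8_candidate(m) }
matches = candidates.grep(/\Aa/)

# A binary string with a non-ASCII byte has no UTF-8 conversion,
# so the helper returns nil for it instead of raising.
unconvertible = utf8_candidate("\xFF".b)
```

Because the conversion is applied to whatever encoding the name carries, nothing in this sketch is specific to UTF-16, which is the point made in the comment above.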

alexanderadam · 2025-10-27

This is true for all tested Ruby implementations except TruffleRuby.

However, TruffleRuby adds the BOM at the beginning of the string after a conversion.

Benoit would rather see Ruby prevent such method definitions in general.

tompng · 2025-10-27

I see, but I don't think this should be in lib code.
Can you remove this and omit TruffleRuby in the added test, just like test_regexp_completor_handles_encoding_errors_gracefully does?

alexanderadam · 2025-10-27

I see. That makes sense. 🙂

Thank you, I changed it.

eregon · 2025-10-28

It would be better to use Encoding::UTF_16LE/BE for testing; Encoding::UTF_16 is basically deprecated and a dummy encoding.
Even better would be to use something more realistic than UTF-16, since that's not even a valid source encoding: #52 (comment)
So I think testing with SJIS as in @tompng's example is much more realistic and useful. Using UTF-16 is basically a user mistake in the first place.

tompng · 2025-10-28

Adding a realistic test case sounds good. I'd like to check the existing tests that use UTF16LE or an encoding-invalid method name, and add or change them in a follow-up pull request.

tompng

Looks good 👍
Thank you

alexanderadam force-pushed the fix/dont_crash_on_utf16_method_autocompletion branch from 1c50bc8 to 2759a0a October 22, 2025 22:22

tompng reviewed Oct 24, 2025

tompng added the hacktoberfest-accepted label Oct 25, 2025

alexanderadam force-pushed the fix/dont_crash_on_utf16_method_autocompletion branch 2 times, most recently from 0550394 to 795b975 October 26, 2025 22:58

alexanderadam requested a review from tompng October 26, 2025 23:04

alexanderadam force-pushed the fix/dont_crash_on_utf16_method_autocompletion branch 2 times, most recently from 6ed23be to a5d05eb October 27, 2025 09:40

alexanderadam force-pushed the fix/dont_crash_on_utf16_method_autocompletion branch from a5d05eb to b648d18 October 27, 2025 11:22

st0012 added the bug label Oct 27, 2025

alexanderadam changed the title ~~fix: don't crash on utf18 autocompletion~~ fix: don't crash on utf16 autocompletion Oct 27, 2025

alexanderadam force-pushed the fix/dont_crash_on_utf16_method_autocompletion branch from b648d18 to 84663b1 October 27, 2025 12:00

alexanderadam changed the title ~~fix: don't crash on utf16 autocompletion~~ Fix UTF-16 autocompletion Oct 27, 2025

tompng reviewed Oct 27, 2025

Fix UTF-16 autocompletion

fed1886

fixes ruby#52

alexanderadam force-pushed the fix/dont_crash_on_utf16_method_autocompletion branch from 84663b1 to fed1886 October 27, 2025 21:45

tompng approved these changes Oct 28, 2025

tompng merged commit 18d152b into ruby:master Oct 28, 2025
33 checks passed

alexanderadam deleted the fix/dont_crash_on_utf16_method_autocompletion branch October 28, 2025 17:36

alexanderadam mentioned this pull request Oct 28, 2025

update test to check for UTF16LE/BE #1132

Open
