Ruby chokes on Windows/Russian #348
Comments
or this, in
hence
and
Now, the key thing probably is that UNUSED variable
It says a few interesting things that i can not quite comprehend.
and
I wonder if it can be made to "just work" by switching it to UTF-8 or UTF-16
I can not know what this "theory" would mean in practice given all the legacy code...
Well, "-E" option is as good as not existing
I was thinking about just modifying the "gem.cmd" and calling it a Hail Mary day, but no luck. Feels like a dead-end on my part (short of removing that loop altogether). The docs seem to suggest that overriding the global part is possible in the sources, but WHERE to do it safely, if that is even possible at all, is above my level. From abstract common sense it should be OK for Ruby internals just to run full Unicode inside the "OS API" perimeter, but who knows.
Since ruby-3.0 usually all strings from the Windows-API are returned as UTF-8 strings. This is a leftover from 2.x times. Fixes: oneclick/rubyinstaller2#348
... to ruby-3.3 and -head for now Fixes: oneclick/rubyinstaller2#348
Since ruby-3.0 usually all strings from the Windows-API are returned as UTF-8 strings. Win32::Registry so far returned OEM encoding. This was a leftover from 2.x times. This commit changes it to UTF-8. Fixes: oneclick/rubyinstaller2#348
Since ruby-3.0 usually all strings from the Windows-API are returned as UTF-8 strings. Win32::Registry partly returned OEM encoding. This was a leftover from 2.x times. This commit changes it to UTF-8. Fixes: oneclick/rubyinstaller2#348
I wanted to get my feet wet in AsciiDoc, not sure if i would need Ruby at all, maybe a VSCode extension would be enough. But i thought, better safe than sorry, did
winget install "ruby 3.2"
and tried
gem install asciidoc
....actually, i just tried
gem
from powershell prompt.

Some background: being a "classic" desktop dev i know zilch about Ruby, but can speak of Win32 API on "flat C API" level.
So, here we go:
I have Git on my pc, which works like a charm being built with the said MSYS2 runtime, so the problem is not there.
U+2014 is Em Dash and of course can be reduced to the DOS codepage as a simple ASCII7 "minus" U+002D, as it always was in pre-IBM-PC times. That said, i am not sure it is ever needed.

Well, i tried to read the code...
msys2_installation.rb
If i read the diagnostic correctly, this is where it chokes.
There is
if subreg['DisplayName'] =~ /^MSYS2 /
later, but it feels like it never gets there.

For example i have VSCode installed (HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\{771FD6B0-FA20-440A-A002-3B3BAC16DC50}_is1) and i have Python (HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\{3d45edf4-44bb-483f-9e08-43c38c81e118}) with DisplayName set as Python 3.11.4 (64-bit), and even Ruby itself has dashes in the name: HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\RubyInstaller-3.2-x64-mingw-ucrt_is1
Here is the access log, but nothing feels wrong there. Probably the Ruby RTL first caches the dataset from the registry, then iterates over (and converts) that dataset to strings.
Now...
ibm866 is GetOEMCP or CP_OEMCP in Windows terms, a TUI (text user interface) charset intended for non-graphic Windows apps. So the idea to convert it is generally wise, but in this specific place it feels misplaced.

Most of the Windows API is UTF-16 based.
Like i said, i know zilch about Ruby but quick googling suggests Ruby string variables can have any charset at will: https://ruby-doc.org/core-2.5.3/String.html
Then WHY would anyone convert it there rather than keeping them UTF-16LE ???
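To make the point about per-string encodings concrete, here is a minimal sketch. The display name is made up for illustration (it is not taken from any real registry entry), and the helper names are hypothetical. It assumes only Ruby's standard String#encode: a UTF-16LE string converts to UTF-8 without loss, a strict conversion to IBM866 raises on U+2014, and an explicit fallback can substitute the ASCII hyphen-minus.

```ruby
# Hypothetical DisplayName, tagged as UTF-16LE the way a wide-char
# Windows API would hand it out. "\u2014" is the em dash.
def sample_display_name
  "Some App \u2014 64-bit".encode("UTF-16LE")
end

# Strict transcoding: raises if a character has no counterpart in the target.
def strict(str, target)
  str.encode(target)
end

# Transcoding with an explicit substitute for the em dash.
# Going through UTF-8 first keeps the fallback key in a known encoding.
def with_dash_fallback(str, target)
  str.encode("UTF-8").encode(target, fallback: { "\u2014" => "-" })
end

raw = sample_display_name
puts raw.encoding            # every String carries its own encoding tag
puts strict(raw, "UTF-8")    # lossless: UTF-8 covers all of Unicode

begin
  strict(raw, "IBM866")
rescue Encoding::UndefinedConversionError => e
  puts "IBM866 conversion failed: #{e.message}"
end

puts with_dash_fallback(raw, "IBM866")
```

Whether the installer should substitute characters at all is a separate question; the sketch only shows that the failure comes from the choice of target encoding, not from the string itself.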
First, you sorta-kinda can switch the user interface to UTF-8, albeit with caveats:
However, further reading the code suggests you do not care about user interaction there at all, you only need il=subreg['InstallLocation']. Now, indeed, folder paths CAN be full unicode and thus be inaccessible from classic, pre-unicode applications. It is bad style, but it IS possible, technically.

So, the proper question, i guess, would be WHY leave the UTF16 realm and reduce the strings to windows-866 instead? The next step you would most probably do would be back-converting it to UTF16 so you can call file I/O API, like opening files, enumerating folders, etc.
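The round trip described above can be sketched as follows. The folder name is invented for the example and round_trip_via is a hypothetical helper, not anything from the installer; it assumes only String#encode with the undef: :replace option. Characters that IBM866 knows (Cyrillic) survive, anything else comes back as "?", so the resulting path no longer names the same folder.

```ruby
# Simulates: wide API result -> OEM codepage -> back to wide for file I/O.
# Returns the result as UTF-8 for easy comparison.
def round_trip_via(path, codepage)
  wide   = path.encode("UTF-16LE")                   # what the API returns
  narrow = wide.encode(codepage, undef: :replace)    # lossy step
  narrow.encode("UTF-16LE").encode("UTF-8")          # what file I/O would get
end

# Hypothetical InstallLocation: Cyrillic part plus two CJK characters.
install_location = "C:\\Tools\\\u0420\u0443\u0431\u0438_\u65E5\u672C"

via_oem  = round_trip_via(install_location, "IBM866")
via_utf8 = round_trip_via(install_location, "UTF-8")

puts via_oem  == install_location   # the OEM detour changed the path
puts via_utf8 == install_location   # the UTF-8 detour did not
```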
So...
P.S. i did a RegEdit search and it appears i do not have "MSYS2" anywhere in my registry. Guess it is only different for MSYS2 developers themselves. So, basically, Ruby fails fatally while attempting a search that is guaranteed to return an empty set on 99% of computers... :-/
P.P.S. i tried to guesstimate what on Earth coerces Ruby there to do the unneeded string conversions, and my eye stumbled on the obvious typo-error there (the sword swing is "slaSHing" not "slaCHing"):
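One way to picture a search that tolerates unrelated entries is the sketch below. It is an assumption about how such a loop could behave, not the installer's actual code: a plain Hash stands in for the Uninstall registry keys, the entry names are made up, and find_msys2 is a hypothetical helper. Entries whose DisplayName cannot be converted are skipped, so a machine without MSYS2 simply gets nil.

```ruby
# Hypothetical stand-in for the Uninstall entries: key => raw UTF-16LE DisplayName.
ENTRIES = {
  "python" => "Python 3.11.4 (64-bit)".encode("UTF-16LE"),
  "dashed" => "Some App \u2014 Tools".encode("UTF-16LE")  # contains U+2014
}.freeze

# Returns the key of the first entry whose DisplayName starts with "MSYS2 ",
# or nil. Names that cannot be represented in target_encoding are skipped
# instead of aborting the whole search.
def find_msys2(entries, target_encoding)
  entries.each do |key, raw_name|
    begin
      name = raw_name.encode(target_encoding)
    rescue Encoding::UndefinedConversionError
      next
    end
    return key if name =~ /^MSYS2 /
  end
  nil
end

p find_msys2(ENTRIES, "IBM866")
p find_msys2(ENTRIES.merge("msys" => "MSYS2 64bit".encode("UTF-16LE")), "IBM866")
```

Converting to UTF-8 instead of an OEM codepage would make the rescue branch unnecessary for valid input, which is the direction the linked commits describe.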
If i apprehend it, then it is https://ruby-doc.org/core-2.5.3/String.html#method-i-gsub
Well, again, nothing there hints at any pre-configured and fixed string charset, so i still fail to grasp why that fragile and redundant conversion ever gets kicked in in the first place...
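A quick check of that reading of gsub, as a sketch: the path is invented and to_forward_slashes is a hypothetical helper. It assumes only String#gsub with plain string arguments, and shows that the result keeps whatever encoding the receiver already had, so gsub by itself does not force a charset.

```ruby
# Replaces backslashes with forward slashes; string pattern, not a regexp.
def to_forward_slashes(path)
  path.gsub("\\", "/")
end

# Hypothetical path, once tagged as IBM866 and once as UTF-8.
utf8_path = "C:\\\u0420\u0443\u0431\u0438\\bin"
oem_path  = utf8_path.encode("IBM866")

puts to_forward_slashes(utf8_path).encoding   # same encoding as utf8_path
puts to_forward_slashes(oem_path).encoding    # same encoding as oem_path
```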