-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Fix -gsource-map with UTF-8 chars in directory and file names in the path. #24560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
with open(options.output, 'w') as outfile: | ||
json.dump(map, outfile, separators=(',', ':')) | ||
with open(options.output, 'w', encoding='utf-8') as outfile: | ||
json.dump(map, outfile, separators=(',', ':'), ensure_ascii=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So ensure_ascii=False
means that non-ascii chars are not escaped as they normally would be. Is that what we want for source maps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is probably the simplest. JSON readily supports UTF-8 characters in content.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you verified that the resulting source maps actually work correctly in browsers? (as opposed to just having the test pass).
This one seems to be failing on our windows emscripten-releases bot: https://logs.chromium.org/logs/emscripten-releases/buildbucket/cr-buildbucket/8712146499624631425/+/u/Emscripten_testsuite__other_/stdout
Looks like it's trying to print, expecting a cp1252 encoding but it's actually UTF-8? |
That's interesting. I tried to reproduce with the following setups:
and also on another Windows 10 PC, repeating the same three cases. I thought for sure setting codepage to 1252 and/or disabling Unicode support would reproduce the problem, but that doesn't seem to help. Reading the line
in the log suggests that Python version might be different on the bot vs what I am running with from Emsdk. I updated Emsdk recently to have Python 3.13.3. So I wonder if this might be a Python version thing that older version had problems printing Unicode characters? If an older Python version is used there, I wonder if it would be feasible to update? Otherwise I think skipping that one test on the bot for now could be a simple approach as well. |
Thanks for digging into this. It wouldn't surprise me if there's an issue with the particular version and configuration of Python and/or the underlying OS on these builders, as they are fairly customized for Chromium's CI, and we are just along for the ride. That also means that it's probably not worth trying to get them to update it; but since it's working on GitHub, and if you're pretty confident that everything is WAI after your investigation here, then I'm comfortable just disabling this test on the Chromium CI. |
@dschuff Would you be able to run |
According to the top line of the log in https://logs.chromium.org/logs/emscripten-releases/buildbucket/cr-buildbucket/8712146499624631425/+/u/Emscripten_testsuite__other_/stdout |
which... actually I thought they were planning on updating all the builders to 3.11, so maybe ours got left behind somehow. I'll look into that. But maybe it's worth trying to reproduce on whatever the oldest version of Python we want to support in emscripten is. The reason Chrome infra is updating is because 3.8 is EOL, so maybe we want to update our min supported version as well. |
Fix -gsource-map with UTF-8 chars in directory and file names in the path. Fixes #24557.