util: optimize toASCIILower function using V8's native toLowerCase#61107
Conversation
bf24628 to
b391b73
Compare
b391b73 to
6b08d1a
Compare
6b08d1a to
57b817f
Compare
There was a problem hiding this comment.
This is not identical (and there is a reason why the function checks character ranges)
This function is supposed to do ASCII lowercasing, not Unicode lowercasing
TextDecoder has the same issue for label normalization btw, as it incorrectly uses .toLowerCase at
Line 332 in b1c01fc
See spec: https://tc39.es/ecma262/#sec-string.prototype.tolowercase and notes there
As an example, '\u212A'.toLowerCase() === 'k'
toLowerCase can convert non-ascii characters into ascii characters (which makes e.g. current TextDecoder label handling incorrect, among all its other bugs)
text/html charset=gb\u212A should not turn into text/html charset=gbk
|
What would be helpful is to move |
Thanks I will try this idea |
Co-authored-by: Nikita Skovoroda <chalkerx@gmail.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #61107 +/- ##
==========================================
- Coverage 88.53% 88.52% -0.01%
==========================================
Files 703 703
Lines 208546 208548 +2
Branches 40217 40224 +7
==========================================
- Hits 184634 184620 -14
- Misses 15926 15958 +32
+ Partials 7986 7970 -16
🚀 New features to boost your workflow:
|
RafaelGSS
left a comment
There was a problem hiding this comment.
LGTM. With green CI and confirmed by benchmark CI.
|
Landed in ed47077 |
PR-URL: #61107 Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
PR-URL: #61107 Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
PR-URL: #61107 Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
PR-URL: #61107 Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
PR-URL: nodejs#61107 Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
benchmark results (mimetype-instantiation.js):
application/ecmascript +54%
text/html charset=gbk +24%
text/html long...=x charset=gbk +21%