-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Normative: Revert U+2212 (Unicode minus sign) as a valid character in timezone offsets #3334
Conversation
This PR is the Temporal counterpart to tc39/ecma262#3334. See that PR for details. This PR is dependent on that PR's approval.
FWIW as far as I understand it, this change would not be observable if made in ECMA-262 today, without Temporal. (But it should probably be discussed in plenary anyway.) |
Clarifying @ptomato's comment above: this U+2212 character is currently only exposed via Temporal, because the only way an offset can get into So the current plan is to present a normative Temporal PR in Helsinki to remove U+2212 from the Temporal grammar, and mention as part of this discussion that consensus on that PR also means percolating this grammar change back to 262. Sound OK? |
It's normative for ECMA-402, where |
concrete example: const dtf = new Intl.DateTimeFormat("pt", {
dateStyle: "short",
timeStyle: "full",
timeZone: "\u{2212}01:00",
});
dtf.format(new Date("2024-01-01T00:00Z"))
// => "31/12/2023, 23:00:00 GMT-01:00" |
This normative change reached consensus at the TC39 meeting of 2024-06-12. |
This PR is the Temporal counterpart to tc39/ecma262#3334. See that PR for details. This PR is dependent on that PR's approval.
… IDs These are the test adjustments corresponding to the normative PR tc39/ecma262#3334 which reached consensus at the June 2024 TC39 meeting.
These are the test adjustments corresponding to the normative PR tc39/ecma262#3334 which reached consensus at the June 2024 TC39 meeting.
Tests are in tc39/test262#4120. (The first commit specifically contains tests for this PR. The second commit for the companion Temporal PR.) |
This comment was marked as spam.
This comment was marked as spam.
c44edd4
to
7c3ae1d
Compare
I rebased to latest main. Hopefully ready to merge? |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
We typically require 2 editor approvals for anything that's not completely trivial. |
… IDs These are the test adjustments corresponding to the normative PR tc39/ecma262#3334 which reached consensus at the June 2024 TC39 meeting.
These are the test adjustments corresponding to the normative PR tc39/ecma262#3334 which reached consensus at the June 2024 TC39 meeting.
… IDs These are the test adjustments corresponding to the normative PR tc39/ecma262#3334 which reached consensus at the June 2024 TC39 meeting.
These are the test adjustments corresponding to the normative PR tc39/ecma262#3334 which reached consensus at the June 2024 TC39 meeting.
This PR is the Temporal counterpart to tc39/ecma262#3334. See that PR for details. This PR is dependent on that PR's approval.
This PR is the Temporal counterpart to tc39/ecma262#3334. See that PR for details. This PR is dependent on that PR's approval.
This comment was marked as spam.
This comment was marked as spam.
…ffsets. r=spidermonkey-reviewers,mgaudet Also simplify the implementation to use normal string comparison instead of using character codes. Changes from: tc39/ecma262#3334 Differential Revision: https://phabricator.services.mozilla.com/D216275
Following ISO-8601, tc39#2781 introduced U+2212 (Unicode minus) as an alias for the regular ASCII minus sign for use in time zone offsets. There's two new data that lead me to believe that this was a mistake, and that we should revert this change. The first is that the newly-released RFC 9557 (the string format standard that Temporal uses) disallows non-ASCII characters. Its predecessor RFC 3339 also disallows non-ASCII characters. So strings that follow the current (since 2022) ECMAScript spec could be rejected by RFC 9557 clients. The second new data is feedback from implementers of a Rust version of Temporal that this single obscure character in the grammar will incur a performance cost because they must now use Rust strings instead of plain U8 ASCII data. See tc39/proposal-temporal#2843 (comment) This performance issue doesn't seem to be limited to Rust. Any native implementation would likely benefit from being able to know that valid date/time input (both Date and Temporal) is always ASCII-only. I don't know whether all engines have actually implemented this 2022 grammar change. But it's also a safe bet that real-world usage of this Unicode character is likely minimal. So the web-compat risk seems small. If this PR is accepted, then we'll follow up with a normative Temporal PR to remove this character from Temporal as well.
7c3ae1d
to
e7f1e5d
Compare
…ffsets. r=spidermonkey-reviewers,mgaudet Also simplify the implementation to use normal string comparison instead of using character codes. Changes from: tc39/ecma262#3334 Differential Revision: https://phabricator.services.mozilla.com/D216275
…ffsets. r=spidermonkey-reviewers,mgaudet Also simplify the implementation to use normal string comparison instead of using character codes. Changes from: tc39/ecma262#3334 Differential Revision: https://phabricator.services.mozilla.com/D216275
Following ISO-8601, #2781 introduced U+2212 (Unicode minus) as an alias for the regular ASCII minus sign for use in time zone offsets.
There's two new data that lead me to believe that this was a mistake, and that we should revert this change.
The first is that the newly-released RFC 9557 (the string format standard that Temporal uses) disallows non-ASCII characters. Its predecessor RFC 3339 also disallows non-ASCII characters. So strings that follow the current (since 2022) ECMAScript spec could be rejected by RFC 9557 clients.
The second new data is feedback from implementers of a Rust version of Temporal that this single obscure character in the grammar will incur a performance cost because they must now use Rust strings instead of plain U8 ASCII data. See
tc39/proposal-temporal#2843 (comment)
This performance issue doesn't seem to be limited to Rust. Any native implementation would likely benefit from being able to know that valid date/time input (both
Date
and Temporal) is always ASCII-only.I don't know whether all engines have actually implemented this 2022 grammar change. But it's also a safe bet that real-world usage of this Unicode character is likely minimal. So the web-compat risk seems small.
If this PR is accepted, then I'll follow up with a normative Temporal PR to remove this character from Temporal as well.
FYI @ptomato @gibson042.