-
-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Unicode support for identifier names #1003
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1003 +/- ##
==========================================
+ Coverage 58.30% 58.37% +0.07%
==========================================
Files 172 173 +1
Lines 12008 12050 +42
==========================================
+ Hits 7001 7034 +33
- Misses 5007 5016 +9
Continue to review full report at Codecov.
|
Test262 conformance changes:
Numbers look good! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking pretty good. I added a couple of comments :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks much, much cleaner now. I'm only missing a bit of documentation, but for the rest, it's good to go from my side :)
@Razican I'm thinking to check the Unicode version of Where do you think we can add this check? |
b3646a3
to
0e7bf9e
Compare
This could be checked either in a |
It won't work. |
You can also add it as a "build" dependency and check it, but it's true that those versions can get out of sync. Could we maybe add a test in |
0856405
to
407ebc9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes, this looks very good to me now :)
This PR implements the complete syntax (except UnicodeEscapeSequence) of "Names and Keywords", also known as an identifier, in the ECMAScript spec. All the identifier name related tests (except the escaped ones) in test262 are passed in this PR.
The implementation is based on Unicode® Standard Annex with Unicode version 13.0.0. The crate
unicode-general-category
v0.3.0 is added to lookup character categories, and of course it implements Unicode v13.0.0. Sad to say, there are no existing update-to-date crates that implement the identifier properties lookup functions we need, so I end up building tables in this PR.The definition of UnicodeEscapeSequence in identifier name similar to the Unicode escape in string literal. I will work on that with a minor refactor to see if we can reuse it in both parts after this PR is merged.