-
Notifications
You must be signed in to change notification settings - Fork 2k
Simplified number regexes #5430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7278b11 to
41afdba
Compare
GeoffreyBooth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
Looks good, and nice catch on the BigInt decimals. I still prefer my regex I wrote a small benchmark that confirms my intuition that my regex is a bit more performant, especially for larger strings. |
|
Does |
|
@GeoffreyBooth that regex works for most places but causes catastrophic backtracking when added to BigInt. I think since the @edemaine https://jsbench.me/j7l9c2s0zl/1 it seems to only be faster for the SIMPLENUM test so maybe I'll update again with just that. |
|
Fascinating! That I benchmarked among (I did double-check the numeric separator spec to confirm that double underscore isn't allowed, and these are indeed the correct regexes.) |
|
From my experience catastrophic backtracking can be avoided with a negative lookahead |
|
From my testing changing the regex here: https://github.com/jashkenas/coffeescript/pull/5430/files#diff-b1724eb7967924ebd01486bf963f8281f5fff1acdf6f9354124c9ee50f0addcdR1313 caused the tests to never finish (I gave up after a few minutes when usually they only take six seconds). Changing the regexps on lines 1314 and 1315 seemed fine though. Please test it out on your local and let me know what you find in case I may be gravely mistaken. |
|
Oh I see. It should probably be non-greedy: - \d(?:_?\d)*n
+ \d(?:_?\d)*?nThe difference here is that the first one starts testing from the end of the input, the second one from the start. |
|
I tried some more testing but non-greedy didn't seem to help with the BigInt backtracking changing to I'm happy with this as it is currently stands since it is close to the minimal amount of change and the performance seems in line with the existing compiler. |
|
@STRd6 I confirmed that changing that line to use But my suggestion was |
|
@edemaine Thanks for clarifying, I've updated the regexes to match and I think this is good for now. |
|
Okay, so can everyone on this thread please confirm that the regex in this PR is the one we want to go with? 👍 or 👎 |
#5429 (comment)
I updated the
SIMPLENUMregex insrc/nodes.coffeeand also the decimal parts of the number regex in lexer.coffee.Also added a fix for numeric separators in decimal big int literals while I was in there.