-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Swift: Port regex mode flag fix from Python to Swift #14209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Fixed the test failure (lots of previously not detected results for the bad tag filter query are now detected). |
| this.flagGroupStartNoModes(start, pos) | ||
| or | ||
| this.modeCharacter(start, pos - 1) and | ||
| this.getChar(pos) in ["i", "x", "s", "m", "w"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find any official list of parse modes, but Bing Chat says there's also a "u" for Unicode (citing https://developer.apple.com/documentation/swift/regex, which doesn't list any flags).
Where did the list in this PR come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately regex parsers vary slightly, and Bing Chat appears to be ignoring this and giving generic advice.
The list comes from the NSRegularExpression doc (search for (?ismwx-ismwx: ... )), though I made an assumption that Regex works exactly the same way despite, as you noted, the doc not saying.
I've also verified Swift behaviour with a Swift interpreter, where u is not supported and it appears w is not yet supported either - but in the case of w it is at least clear what it should / will mean in a Swift regex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The list comes from the
NSRegularExpressiondoc (search for(?ismwx-ismwx: ... )), though I made an assumption thatRegexworks exactly the same way despite, as you noted, the doc not saying.
I think this might be good to note in a comment for future maintainers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: upon further investigation it appears the situation is more complicated, there are some more flags and differences between Regex and NSRegularExpression. I'll add a documenting comment and expand the set of flags we accept.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the code.
sashabu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than the nits (which aren't blocking, but would still be good to have in this PR).
sashabu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does "w" going from "UNICODE" -> "UNICODEBOUNDARY" (and the corresponding MkUnicode -> MkUnicodeBoundary change) need a change note? If those names are for internal use only, the PR LGTM.
|
Yeah, that's part of the API - not a part I expect many people to use, but I should cover it in the change note... |
|
Updated change note. I appreciate the nit picking by the way, it's good to get this stuff right! |
sashabu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the diligence! LGTM.
Ports #13975 to Swift. This fixes an issue where mode flag group characters (e.g. the
iandsin(?is)) were considered as ordinary matches against the input string in addition to setting mode flags.The message for the test result on
ReDoS.swift:73demonstrates that we no longer conflate these.