Skip to content

Conversation

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Sep 13, 2023

Ports #13975 to Swift. This fixes an issue where mode flag group characters (e.g. the i and s in (?is)) were considered as ordinary matches against the input string in addition to setting mode flags.

The message for the test result on ReDoS.swift:73 demonstrates that we no longer conflate these.

@geoffw0 geoffw0 added the Swift label Sep 13, 2023
@geoffw0 geoffw0 requested a review from a team as a code owner September 13, 2023 17:27
@geoffw0
Copy link
Contributor Author

geoffw0 commented Sep 19, 2023

Fixed the test failure (lots of previously not detected results for the bad tag filter query are now detected).

this.flagGroupStartNoModes(start, pos)
or
this.modeCharacter(start, pos - 1) and
this.getChar(pos) in ["i", "x", "s", "m", "w"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find any official list of parse modes, but Bing Chat says there's also a "u" for Unicode (citing https://developer.apple.com/documentation/swift/regex, which doesn't list any flags).

Where did the list in this PR come from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately regex parsers vary slightly, and Bing Chat appears to be ignoring this and giving generic advice.

The list comes from the NSRegularExpression doc (search for (?ismwx-ismwx: ... )), though I made an assumption that Regex works exactly the same way despite, as you noted, the doc not saying.

I've also verified Swift behaviour with a Swift interpreter, where u is not supported and it appears w is not yet supported either - but in the case of w it is at least clear what it should / will mean in a Swift regex.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list comes from the NSRegularExpression doc (search for (?ismwx-ismwx: ... )), though I made an assumption that Regex works exactly the same way despite, as you noted, the doc not saying.

I think this might be good to note in a comment for future maintainers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: upon further investigation it appears the situation is more complicated, there are some more flags and differences between Regex and NSRegularExpression. I'll add a documenting comment and expand the set of flags we accept.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the code.

sashabu
sashabu previously approved these changes Sep 25, 2023
Copy link
Contributor

@sashabu sashabu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than the nits (which aren't blocking, but would still be good to have in this PR).

sashabu
sashabu previously approved these changes Sep 26, 2023
Copy link
Contributor

@sashabu sashabu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "w" going from "UNICODE" -> "UNICODEBOUNDARY" (and the corresponding MkUnicode -> MkUnicodeBoundary change) need a change note? If those names are for internal use only, the PR LGTM.

@geoffw0
Copy link
Contributor Author

geoffw0 commented Sep 26, 2023

Yeah, that's part of the API - not a part I expect many people to use, but I should cover it in the change note...

@geoffw0
Copy link
Contributor Author

geoffw0 commented Sep 26, 2023

Updated change note. I appreciate the nit picking by the way, it's good to get this stuff right!

Copy link
Contributor

@sashabu sashabu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the diligence! LGTM.

@geoffw0 geoffw0 merged commit 49d47a3 into github:main Sep 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants