Skip to content

Conversation

@yoff
Copy link
Contributor

@yoff yoff commented Aug 15, 2023

This is a PR on top of #13779 to aid in the discussion around parse mode characters.
I opted for no change-note here, as #13779 already contains one, and the other changes seem more internal.

re.compile(r'(?:.|\n)*b', re.DOTALL) # Has ReDoS.
re.compile(r'(?i)(?:.|\n)*b') # No ReDoS.
re.compile(r'(?s)(?:.|\n)*b') # Has ReDoS.
re.compile(r'(?is)(?:.|\n)*b') # Has ReDoS.
Copy link
Contributor

@erik-krogh erik-krogh Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add the following as a test:

re.compile(r'(?is)X(?:.|\n)*Y')

The message for that should ideally mention that the attack string should start with an X.
If the message doesn't say that, then the "start" of the regex is wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent suggestion, thanks! The message does not mention X, I will look into why..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that (?is) is still considered a group, but with no children.
It shouldn't be a group, it shouldn't appear as a RegExpTerm at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FirstItem does find the X..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe as a test revert this commit: e2de0e6
That allows you to view the tree structure of the parsed regular expression in the ast-viewer in VSCode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..but NfaUtils::prefix does not..

| unittests.py:9:16:9:24 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\n'. |
| unittests.py:11:20:11:28 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\n'. |
| unittests.py:12:21:12:29 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\n'. |
| unittests.py:13:22:13:30 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings starting with 'x' and containing many repetitions of '\\n'. |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love how the regex uses an upper-case X, but the alert-message uses a lower-case x (because it's case-insensitive).

Copy link
Contributor

@erik-krogh erik-krogh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test outcomes look good to me now 👍

I can't really comment on the parser architecture.

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM.

@yoff yoff force-pushed the python/parsemodechars-not-chars branch from ee70643 to d3c24ba Compare August 24, 2023 19:22
- make qldoc accurate
- fix ql4ql alert
@yoff yoff marked this pull request as ready for review August 24, 2023 19:29
@yoff yoff requested a review from a team as a code owner August 24, 2023 19:29
erik-krogh
erik-krogh previously approved these changes Aug 24, 2023
Copy link
Contributor

@erik-krogh erik-krogh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NfaUtils, the tests, and the test outcomes look good 👍

I'll leave the regex parser to someone that has more experience there.

@geoffw0
Copy link
Contributor

geoffw0 commented Aug 24, 2023

I'm happy with this - and I intend to port it back for Swift. I'm not on the Python team though, so I probably shouldn't have the final word here.

You do now have test failures that appear (at a glance) to be repairing the unintended side-effects of #13779 . 🎉

@yoff yoff added the no-change-note-required This PR does not need a change note label Aug 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no-change-note-required This PR does not need a change note Python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants