-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Python: parse mode chars should not be considered chars #13975
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| re.compile(r'(?:.|\n)*b', re.DOTALL) # Has ReDoS. | ||
| re.compile(r'(?i)(?:.|\n)*b') # No ReDoS. | ||
| re.compile(r'(?s)(?:.|\n)*b') # Has ReDoS. | ||
| re.compile(r'(?is)(?:.|\n)*b') # Has ReDoS. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add the following as a test:
re.compile(r'(?is)X(?:.|\n)*Y')The message for that should ideally mention that the attack string should start with an X.
If the message doesn't say that, then the "start" of the regex is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent suggestion, thanks! The message does not mention X, I will look into why..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that (?is) is still considered a group, but with no children.
It shouldn't be a group, it shouldn't appear as a RegExpTerm at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FirstItem does find the X..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe as a test revert this commit: e2de0e6
That allows you to view the tree structure of the parsed regular expression in the ast-viewer in VSCode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
..but NfaUtils::prefix does not..
| | unittests.py:9:16:9:24 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\n'. | | ||
| | unittests.py:11:20:11:28 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\n'. | | ||
| | unittests.py:12:21:12:29 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\n'. | | ||
| | unittests.py:13:22:13:30 | (?:.\|\\n)* | This part of the regular expression may cause exponential backtracking on strings starting with 'x' and containing many repetitions of '\\n'. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love how the regex uses an upper-case X, but the alert-message uses a lower-case x (because it's case-insensitive).
erik-krogh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test outcomes look good to me now 👍
I can't really comment on the parser architecture.
geoffw0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM.
They are simply considered part of the group start.
ee70643 to
d3c24ba
Compare
- make qldoc accurate - fix ql4ql alert
erik-krogh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NfaUtils, the tests, and the test outcomes look good 👍
I'll leave the regex parser to someone that has more experience there.
|
I'm happy with this - and I intend to port it back for Swift. I'm not on the Python team though, so I probably shouldn't have the final word here. You do now have test failures that appear (at a glance) to be repairing the unintended side-effects of #13779 . 🎉 |
This is a PR on top of #13779 to aid in the discussion around parse mode characters.
I opted for no change-note here, as #13779 already contains one, and the other changes seem more internal.