-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-30285: Optimize case-insensitive matching and searching #1482
Conversation
of regular expressions.
@serhiy-storchaka, thanks for your PR! By analyzing the history of the files in this pull request, we identified @rhettinger, @Yhg1s and @niemeyer to be potential reviewers. |
Lib/sre_compile.py
Outdated
@@ -268,7 +280,7 @@ def _optimize_charset(charset, fixup, fixes): | |||
else: | |||
charmap[av] = 1 | |||
elif op is RANGE: | |||
r = range(av[0], av[1]+1) | |||
r = r0 = range(av[0], av[1]+1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is r0
used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At line 296.
if fixup and not hascased:
hascased = any(map(iscased, r0))
This intentionally moved after the loops that set bits in charmap
because for large ranges like range(0x20000, 0x110000)
it is better to get an IndexError earlier (the size of charmap
is 0x100 or 0x10000) than burn CPU in long useless loop.
Lib/sre_compile.py
Outdated
# internal: optimize character set | ||
out = [] | ||
tail = [] | ||
charmap = bytearray(256) | ||
hascased = False | ||
if fixup: | ||
if flags & SRE_FLAG_UNICODE and not flags & SRE_FLAG_ASCII: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the if/else statement from 264-267 be replaced with:
iscased = _get_iscased(flags)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it can be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
of regular expressions.