Description
Discussed in #890
Originally posted by Guillermogsjc July 19, 2022
Hi, to efficiently match a large number of alternations, I guess it would be useful to trigger the aho_corasick variant here:
Line 91 in 9ca3099
/// This is only set when the entire regex is a simple unanchored
/// alternation of literals. We could probably use it more circumstances,
/// but this is already hacky enough in this architecture.
The question is: is there any way to use word boundaries so that an expression like this one is still highly optimized?
r"\b(a|... #massive amount of literal alternations here# ...|z)\b"
or with (?-u:\b) instead of \b.
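To make the shape of the pattern concrete, here is a minimal sketch of how such a regex string might be assembled from a word list. The helper names `escape_literal` and `build_pattern` are hypothetical, and the escaping is a simplified stand-in for the crate's `regex::escape`. The sketch only builds the pattern string and says nothing about which matching engine ends up being selected.

```rust
/// Hypothetical helper: backslash-escape regex metacharacters in a literal.
/// A simplified stand-in for `regex::escape` from the regex crate.
fn escape_literal(lit: &str) -> String {
    let mut out = String::with_capacity(lit.len());
    for c in lit.chars() {
        // Characters that carry special meaning in regex syntax.
        if r"\.+*?()|[]{}^$#&-~".contains(c) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Hypothetical helper: wrap an alternation of literals in word boundaries.
/// `ascii_boundary` selects `(?-u:\b)` instead of the Unicode-aware `\b`.
fn build_pattern(literals: &[&str], ascii_boundary: bool) -> String {
    let boundary = if ascii_boundary { r"(?-u:\b)" } else { r"\b" };
    let alternation = literals
        .iter()
        .map(|lit| escape_literal(lit))
        .collect::<Vec<_>>()
        .join("|");
    // Same shape as the pattern in the question: boundary, group, boundary.
    format!("{b}({alt}){b}", b = boundary, alt = alternation)
}
```

The returned string would then be passed to `Regex::new`; for example, `build_pattern(&["a", "z"], true)` yields the `(?-u:\b)` form of the pattern above.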
Also, regarding the PERFORMANCE documentation here:
there is no problem with using non-greedy matching or having lots of alternations in your regex
would the regex above fall into that "no problem" set?
Thanks