Description
What version of regex are you using?
1.12.x, and also older versions.
Describe the bug at a high level.
See title: "Question: Do PCRE2 leftmost-first semantics include capture groups?"
I've been testing regex implementations for differences in capture behavior because I'm trying to figure out how best to handle tie-breaking in a lockstep parallel NFA simulation. I'm seeing some strange differences between PCRE2 and the automata-driven crates, and I can't tell whether they would be considered bugs worth reporting. If not, I'd rather not drop a pile of supposed bugs here for no reason. I vaguely remember from working on my own regex implementation that handling quantified nullable groups was a headache even in backtracking land.
Random example: for the regex (|.)*(a+b) (yes, really) with the input axaaab, every engine successfully matches the entire input string, but rust/regex and RE2 report the capture groups as axa[a][ab], while PCRE2, C#, etc. report ax[][aaab].
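To make the backtracking side of the comparison concrete, here is a minimal sketch of how a backtracker could arrive at ax[][aaab]. It is hardcoded to this one pattern and is not PCRE2's implementation. It assumes the rule described in the PCRE2 pattern documentation: when an iteration of a repeated group matches no characters, matching moves on to the next item instead of looping again. The names `match_tail`, `match_star`, `find`, and `render` are hypothetical, and `.` is handled byte-wise, so the sketch is ASCII only.

```rust
/// Span of a capture group as byte offsets (start, end).
type Span = (usize, usize);

/// Group 2, `(a+b)`, anchored at `pos`.
/// `a+` is greedy; giving back an `a` never helps here, because the byte
/// after a shorter run of `a`s is another `a`, not `b`.
fn match_tail(input: &[u8], pos: usize) -> Option<Span> {
    let mut end = pos;
    while end < input.len() && input[end] == b'a' {
        end += 1;
    }
    if end > pos && end < input.len() && input[end] == b'b' {
        Some((pos, end + 1))
    } else {
        None
    }
}

/// Greedy `(|.)*` at `pos`, followed by group 2.
/// Returns (group 1, group 2) for the first match in priority order.
fn match_star(input: &[u8], pos: usize) -> Option<(Span, Span)> {
    // Alternative 1 (empty) has priority. The iteration consumes nothing,
    // so under the assumed rule the loop ends and the tail is tried next.
    if let Some(g2) = match_tail(input, pos) {
        return Some(((pos, pos), g2));
    }
    // Backtrack into alternative 2: `.` consumes one non-newline byte,
    // then the loop attempts another iteration.
    if pos < input.len() && input[pos] != b'\n' {
        return match_star(input, pos + 1);
    }
    // Leaving the loop here would retry the tail at `pos`, which already failed.
    None
}

/// Unanchored search: try each start position from left to right.
fn find(input: &str) -> Option<(Span, Span)> {
    let bytes = input.as_bytes();
    (0..=bytes.len()).find_map(|start| match_star(bytes, start))
}

/// Render captures in the bracket notation used above.
fn render(input: &str, g1: Span, g2: Span) -> String {
    format!(
        "{}[{}]{}[{}]{}",
        &input[..g1.0],
        &input[g1.0..g1.1],
        &input[g1.1..g2.0],
        &input[g2.0..g2.1],
        &input[g2.1..]
    )
}

fn main() {
    let input = "axaaab";
    let (g1, g2) = find(input).expect("pattern should match");
    println!("{}", render(input, g1, g2)); // prints "ax[][aaab]"
}
```

Under this rule, every successful path ends with an empty iteration of group 1 immediately before group 2, so group 1 is always empty. An engine that does not treat the empty iteration as the final one can let group 1 keep a later, non-empty iteration, which is consistent with the axa[a][ab] result.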