I suspect that '' instead of '\x01' should be returned for a negative lookaround.
A negative lookaround is a restrictive rule and does not add a possible match. We can expect that a URL matching a rule /\.foo(?!bar\.baz)bar\./ must match /\.foobar\./ and thus must contain foobar, and foobar doesn't need to be excluded during token selection.
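The implication above can be checked with a small sketch. The two regexes are taken from the issue text; the sample URLs are hypothetical and chosen only for illustration, and the code does not reproduce uBlock's actual token selection.

```javascript
// Rule from the issue, and the same rule with the negative lookahead removed.
const restricted = /\.foo(?!bar\.baz)bar\./;
const relaxed = /\.foobar\./;

// Hypothetical sample URLs (assumptions, not taken from any filter list).
const samples = [
    'https://example.com/a.foobar.js',
    'https://example.com/a.foobar.baz',
    'https://example.com/a.foobaz.js',
];

// Record, for each sample, whether each regex matches.
const results = samples.map(url => ({
    url,
    restricted: restricted.test(url),
    relaxed: relaxed.test(url),
}));

// Whenever the restricted rule matches, the relaxed rule matches too,
const impliesRelaxed = results.every(r => !r.restricted || r.relaxed);
// so the literal "foobar" is present in every URL the rule can match.
const containsToken = results.every(
    r => !r.restricted || r.url.includes('foobar')
);

console.log(results, impliesRelaxed, containsToken);
```

The negative lookahead only removes candidate matches (the second sample), so dropping it while normalizing can only widen the set of matched strings, never invalidate `foobar` as a token.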
danny0838 changed the title from "Allow token selection for a regex rule around a negative lookaround" to "Allow token selection around a negative lookaround for a regex rule" on Nov 16, 2022
gorhill added a commit to gorhill/uBlock that referenced this issue on Nov 17, 2022
Fixed flawed extraction of tokens with optional sequences, i.e.
when quantifier could be zero.
Related issue:
- uBlockOrigin/uBlock-issues#2367
Ignore look-around sequences as suggested when normalizing into
tokenizable string.
Related issue:
- uBlockOrigin/uBlock-issues#2368
Fix regex analyzer throwing with trailing `-` in character
class sequence.
Related issue:
- AdguardTeam/AdguardFilters#134630
I think positive lookarounds may (maybe "should") be treated differently, as they sometimes work like \b to mark a word separator, and may be used with a back reference to mimic an atomic group, as in (?=(regex))\1.
Actually, I used to think the original treatment was intentional, as it works well for something like abc(?=def). It is flawed for something like abc(?=def)(?=ghi), but that is not likely to appear in real-world regexes.
Maybe we can do something similar to the handling of quantifiers and insert an optional \x00 or \x01.
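A short sketch of why stacked positive lookaheads are awkward, under the assumption (my reading of the comment, not confirmed by the source) that the earlier treatment handled lookahead content as if it were consumed text. The third regex, `abc(?=de)(?=d)`, is a hypothetical variant added for illustration.

```javascript
// A single positive lookahead: every match is followed by "def",
// so "abcdef" appears in any matching string.
const single = /abc(?=def)/;
const singleMatches = single.test('xabcdefx');

// Stacked lookaheads are both evaluated at the same position, so
// concatenating their contents ("abcdefghi") does not describe a match.
const stacked = /abc(?=def)(?=ghi)/;
const stackedMatchesConcatenation = stacked.test('abcdefghi');

// Hypothetical variant: this one can match, yet the concatenated
// string "abcded" is not part of the matched input.
const overlapping = /abc(?=de)(?=d)/;
const overlappingMatches = overlapping.test('abcde');
const overlappingHasConcatenation = 'abcde'.includes('abcded');

console.log(
    singleMatches,
    stackedMatchesConcatenation,
    overlappingMatches,
    overlappingHasConcatenation
);
```

In other words, treating one lookahead as trailing literal text is safe, but chaining that treatment across several lookaheads at the same position can produce a string that no matching input contains.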
gorhill added a commit to gorhill/uBlock that referenced this issue on Nov 17, 2022
Related code: https://github.com/gorhill/uBlock/blob/2204451514f1a894a7627793a253a89ebbc6d845/src/js/static-filtering-parser.js#L3044-L3048