Skip to content

Regex match hangs on non-matching string #793

Open
@bgreenlee

Description

@bgreenlee

I ran across this while working on Advent of Code 2024, Day 19: https://adventofcode.com/2024/day/19

Here is code that demonstrates the problem:

let pattern = try! Regex("^(?:ggrru|ugu|gwgg|bwrw|bww|brg|brwu|ruugb|grggr|wrgbuug|bbbrbr|rgrrbrbw|gbwg|wuruug|gbgwbg|rgw|buu|ggbgb|rwg|gr|ggurggr|wruuwgrr|wbgg|gggrb|rgwuu|uuwww|bgrw|uuguubw|bbbrwu|ugurb|uwbggg|rurg|ubb|wrr|rbbbbg|gguuug|gbur|wb|bubbu|gbwru|bgg|ugg|bbrrg|wubr|bgwgbwgg|rguurb|bugu|wuww|urugr|bwb|wug|brr|u|rru|wwgbw|gwu|bw|ugrwggr|rgubuw|bbg|bwru|uwgbwu|gbrugg|rgub|rgbbuwwg|wwr|grw|rwggwwrw|bbbu|wr|wbwu|wwrbuu|rbbgwru|gur|buurr|ggbrg|gwg|wrg|urw|uubub|gwrgb|bbw|rrw|ugrurw|rubrw|bgb|bwgbwbw|guw|ur|wgrbu|bgu|rrrrbrw|uww|uuu|wuugbw|wwbugw|rwbr|ruwbr|uwu|wgrb|b|rrwugru|gwb|burw|rurb|rbrrbgu|uwgw|brubr|bwu|rbw|ugbu|gww|wwrb|wbgbrww|brrwgrg|rugug|grgrrb|wuubbgu|brub|rrwwwb|ugr|wbw|ruwgguu|wgw|rrwrg|bwbbrwbg|rggg|gbgrguw|rwgw|rbbgwbr|gub|rgrrg|wbgggu|bbbgww|ugb|rbbgr|wru|rubbuu|bggrbu|gbg|bgrgbb|wwrwugbg|rrgu|wrrubwu|wrbuu|rgug|bbu|wrww|wbb|wgrwu|bbrurru|wgrugwu|uuw|uggwg|rrbuwu|gruw|ubr|urgug|www|wgrwrrw|rruw|rbg|bbwu|brww|rbwbw|grgr|bgr|wgwwu|wur|gubu|rrubgg|wbrurbb|ugub|wrrr|gbbr|wwubu|uwwbuw|wuu|rgb|bbr|rbrbuwg|urwb|gg|grug|br|wwguuwb|wu|ruu|guuwrw|wgb|gbr|wgggug|rw|brggww|wrgrw|guggub|gbbrug|gbbrb|bugrg|bwr|gwwg|wwbbrggw|urwu|rgwr|rwb|rrub|ggb|wbwrrbw|wrub|wwg|ww|ggg|brugwrr|wbur|ubbbrw|uwugrg|buw|grg|rrr|bgwu|rbggg|rgr|wuwub|rg|guuwu|rrwwwrbr|rrg|gurwg|wburrwug|rwr|wbg|grwbr|bgwbwug|bwg|bru|rbwuug|gggg|gurb|bbb|ubwu|gugbr|buru|gbbg|brb|wgr|guwg|gurgrbug|wgwugwwu|uw|wrbbr|wgwugurr|uwww|urr|bgubug|bbgu|bbuugb|rwug|gurbb|bguw|ubbbub|bbwb|gugbgwb|bb|wrrg|ggr|wrbub|uu|wwbw|ub|uggw|ugbggrw|bur|uuuguru|bgwuu|gwr|uurrrw|rb|buwugrr|brwrrrgb|guu|bgur|wggur|wub|gbb|rr|wubw|uuwgww|bwurwur|gubgg|ubwg|grbr|rwgr|wrubw|grgwb|uuguu|rwgwb|bwrrgg|ubu|bwbwggrw|uwg|bgguu|wwu|grwgw|bwrr|uur|rwuww|bwbb|gwrr|gwgrww|rbgw|grub|wugu|grrwuwg|bburbb|wbgb|ubbr|gggu|wbu|rrbugrbu|gbbubrwg|gwwurw|grbb|gbugu|wguwrw|ubggbb|rbgbu|rwub|gb|wbrgr|wubwwb|brwg|bwwbur|gugwg|gru|rbrwurr|wrb|gwubug|ggbguu|bubg|uruub|wuw|gubbr|gu|uwrubg|wggrbug|uub|ggrb|rwu|ug|ugrggw|wgg|rrb|bbug|wrw|uubbbur|uwgb|bwgr|uru|bu|guuuguw|gwwu|uwb|uwwwb|w|rur|wguuw|guru|gwrrwwb|wwrww|rww|bbur|ruuwr|rbb|bugubgrr|ru|rbubwr|grr|bbrbu|gwww|uwwgw|wwwu|uwr|uguw|rbwrr|bguwg|rguw|guwgbbg|rub|rrur|bwwbgw|rrrg|bgggru|ggu|wrrggu|rug|uuburg|rbu|wggrbgb|brguurw|r|guwrr|bwbgrb|urwgbb|rwgur|urg|gbwrw|grb|gwburg|wbr|bwww|brwr|ugw|uwgu|uggwr|brw|wgu|gug|gbu|gwrur|ggbrb|rbubr|rrggurbw|rwrru|uug|urb|grbu|gbgug|bugg|gbw|rgu|rgurrw|gwbw|ggw|bgub|wwgugug|ugubwg|wuggr|ruw|rbr|bbwuwwgb|wg|ubw|rwwg|wggbr|urubb|wwb|bubr)+$")

let testStrings = [
    "ggrbbwbbwbuguwbuguwbbuwrbbbrwgurgwggbbwbguurb", // match
    "gruurruwgbwwrbbggwuwrwugwrguuwwrurugbbgubrurubrgwubg", // non-match
]

for testString in testStrings {
    print("\(testString):")
    print("  \(testString.wholeMatch(of: pattern) != nil)")
}

The first string, which is a match, returns quickly. The second hangs.

This is admittedly a pathological case, but for comparison, command-line grep can processes the entire input file of 400 strings in about 300 ms on my machine.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions