
Commit 1bb1ef6

no partial match if trailing data is invalid utf (PCRE2Project#238)
Avoid returning a partial match if one was found but is followed by invalid UTF. This makes the result consistent with JIT, unlike the previous behavior:

    PCRE2 version 10.34 2019-11-21
      re> /.a/match_invalid_utf,allvector,jit
    data> b\xb1\=ph,ovector=1
    No match
     0: <unchanged>
    data> b\xb1\=ph,ovector=1,no_jit
    Partial match: b\x{b1}
    ** ovector[1] is not equal to the subject length: 1 != 2
     0: 0 1

1 parent 15a11d1 commit 1bb1ef6

6 files changed, +110 -0 lines changed


src/pcre2_match.c

+1
@@ -7454,6 +7454,7 @@ if (utf && end_subject != true_end_subject &&
 if (start_match >= true_end_subject)
   {
   rc = MATCH_NOMATCH;  /* In case it was partial */
+  match_partial = NULL;
   break;
   }
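
The effect of the added line can be illustrated with a small standalone model. This is only a sketch under a stated assumption: that the code following the match loop reports a partial match whenever `match_partial` is still non-NULL. The names `sketch_result` and `finish_after_invalid_tail` are hypothetical and do not appear in `pcre2_match.c`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch; not PCRE2's real constants. */
enum sketch_result { SKETCH_NOMATCH, SKETCH_PARTIAL };

/* Models the exit taken when the next start position lies beyond the valid
   UTF fragment. 'match_partial' points at a partial match recorded earlier,
   or is NULL. 'apply_fix' selects whether the pointer is cleared first. */
static enum sketch_result
finish_after_invalid_tail(const char *match_partial, int apply_fix)
{
  if (apply_fix)
    match_partial = NULL;   /* corresponds to the line added in this commit */

  /* ASSUMPTION: the post-loop logic prefers a recorded partial match. */
  return (match_partial != NULL) ? SKETCH_PARTIAL : SKETCH_NOMATCH;
}
```

In this model, calling `finish_after_invalid_tail("b", 0)` yields the stale partial result seen with `no_jit` in the commit message, while `finish_after_invalid_tail("b", 1)` yields no match, as with JIT.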

testdata/testinput10

+19
@@ -506,6 +506,25 @@
 \= Expect no match
 ab\x80cdef\=ph
 
+/.a/match_invalid_utf
+ab\=ph
+ab\=ps
+b\xf0\x91\x88b\=ph
+b\xf0\x91\x88b\=ps
+b\xf0\x91\x88\xb4a
+\= Expect no match
+b\x80\=ph
+b\x80\=ps
+b\xf0\x91\x88\=ph
+b\xf0\x91\x88\=ps
+
+/.a$/match_invalid_utf
+ab\=ph
+ab\=ps
+\= Expect no match
+b\xf0\x91\x98\=ph
+b\xf0\x91\x98\=ps
+
 /ab$/match_invalid_utf
 ab\x80cdeab
 \= Expect no match
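
The 8-bit subjects above depend on which byte sequences are valid UTF-8: `\xf0\x91\x88\xb4` is a complete 4-byte sequence (U+11234, as the expected output shows), while `\xf0\x91\x88`, `\xf0\x91\x98` and a lone `\x80` are not. The following is a minimal sketch of a decoder that makes this distinction. `utf8_decode` is a hypothetical helper written for illustration, not PCRE2's internal validity check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 character from s[0..len). Returns the number of bytes
   consumed and stores the code point in *cp, or returns 0 if the sequence
   is invalid or truncated. Hypothetical helper, not PCRE2 code. */
static size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
  /* Smallest code point allowed for each sequence length (rejects overlongs). */
  static const uint32_t min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
  size_t need, i;
  uint32_t c;

  if (len == 0) return 0;
  if (s[0] < 0x80) { *cp = s[0]; return 1; }
  else if ((s[0] & 0xE0) == 0xC0) { need = 2; c = s[0] & 0x1F; }
  else if ((s[0] & 0xF0) == 0xE0) { need = 3; c = s[0] & 0x0F; }
  else if ((s[0] & 0xF8) == 0xF0) { need = 4; c = s[0] & 0x07; }
  else return 0;                     /* stray continuation byte or 0xF8+ */

  if (len < need) return 0;          /* truncated sequence */
  for (i = 1; i < need; i++)
    {
    if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
    c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }

  if (c < min_cp[need]) return 0;              /* overlong encoding */
  if (c > 0x10FFFF) return 0;                  /* beyond Unicode range */
  if (c >= 0xD800 && c <= 0xDFFF) return 0;    /* surrogate code point */
  *cp = c;
  return need;
}
```

With this sketch, the four bytes `F0 91 88 B4` decode to 0x11234, whereas the three bytes `F0 91 88` are rejected whether they end the subject or are followed by `b`.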

testdata/testinput12

+14
@@ -413,6 +413,20 @@
 \= Expect no match
 ab\x{df00}cdef\=ph
 
+/.a/match_invalid_utf
+ab\=ph
+ab\=ps
+\= Expect no match
+b\x{df00}\=ph
+b\x{df00}\=ps
+
+/.a$/match_invalid_utf
+ab\=ph
+ab\=ps
+\= Expect no match
+b\x{df00}\=ph
+b\x{df00}\=ps
+
 /ab$/match_invalid_utf
 ab\x{df00}cdeab
 \= Expect no match
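
The 16-bit and 32-bit subjects use `\x{df00}` as the invalid element: 0xDF00 is a low surrogate, which is invalid on its own in UTF-16 and is never a valid code point in UTF-32. A minimal sketch of both checks follows. `utf16_char_len` and `utf32_is_valid` are hypothetical helpers for illustration, not PCRE2 functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the number of 16-bit code units in the valid character starting
   at s[0], or 0 if the data there is invalid UTF-16. Hypothetical helper. */
static size_t utf16_char_len(const uint16_t *s, size_t len)
{
  if (len == 0) return 0;
  if (s[0] < 0xD800 || s[0] > 0xDFFF) return 1;   /* not a surrogate */
  if (s[0] >= 0xDC00) return 0;                   /* lone low surrogate */
  /* A high surrogate must be followed by a low surrogate. */
  if (len < 2 || s[1] < 0xDC00 || s[1] > 0xDFFF) return 0;
  return 2;
}

/* In UTF-32, surrogate values and anything above 0x10FFFF are invalid. */
static int utf32_is_valid(uint32_t c)
{
  return c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF);
}
```

Under these checks the subject `b\x{df00}` consists of one valid unit followed by an invalid one in either width, which is the situation the new "Expect no match" cases exercise.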

testdata/testoutput10

+32
@@ -1646,6 +1646,38 @@ Partial match: ab
 ab\x80cdef\=ph
 No match
 
+/.a/match_invalid_utf
+ab\=ph
+Partial match: b
+ab\=ps
+Partial match: b
+b\xf0\x91\x88b\=ph
+Partial match: b
+b\xf0\x91\x88b\=ps
+Partial match: b
+b\xf0\x91\x88\xb4a
+0: \x{11234}a
+\= Expect no match
+b\x80\=ph
+No match
+b\x80\=ps
+No match
+b\xf0\x91\x88\=ph
+No match
+b\xf0\x91\x88\=ps
+No match
+
+/.a$/match_invalid_utf
+ab\=ph
+Partial match: b
+ab\=ps
+Partial match: b
+\= Expect no match
+b\xf0\x91\x98\=ph
+No match
+b\xf0\x91\x98\=ps
+No match
+
 /ab$/match_invalid_utf
 ab\x80cdeab
 0: ab

testdata/testoutput12-16

+22
@@ -1522,6 +1522,28 @@ Partial match: ab
 ab\x{df00}cdef\=ph
 No match
 
+/.a/match_invalid_utf
+ab\=ph
+Partial match: b
+ab\=ps
+Partial match: b
+\= Expect no match
+b\x{df00}\=ph
+No match
+b\x{df00}\=ps
+No match
+
+/.a$/match_invalid_utf
+ab\=ph
+Partial match: b
+ab\=ps
+Partial match: b
+\= Expect no match
+b\x{df00}\=ph
+No match
+b\x{df00}\=ps
+No match
+
 /ab$/match_invalid_utf
 ab\x{df00}cdeab
 0: ab

testdata/testoutput12-32

+22
@@ -1520,6 +1520,28 @@ Partial match: ab
 ab\x{df00}cdef\=ph
 No match
 
+/.a/match_invalid_utf
+ab\=ph
+Partial match: b
+ab\=ps
+Partial match: b
+\= Expect no match
+b\x{df00}\=ph
+No match
+b\x{df00}\=ps
+No match
+
+/.a$/match_invalid_utf
+ab\=ph
+Partial match: b
+ab\=ps
+Partial match: b
+\= Expect no match
+b\x{df00}\=ph
+No match
+b\x{df00}\=ps
+No match
+
 /ab$/match_invalid_utf
 ab\x{df00}cdeab
 0: ab
