Commit 6454934

avoid inconsistency between \d and [:digit:] when using /a (PCRE2Project#223)
Since a608946 (Additional PCRE2_EXTRA_ASCII_xxx code, 2023-02-01)
PCRE2_EXTRA_ASCII_BSD could be used to restrict \d to ASCII causing the
following inconsistent behaviour in UCP mode.

  PCRE2 version 10.43-DEV 2023-01-15
    re> /\d/utf,ucp,ascii_bsd
  data> ٣
  No match
  data>
    re> /[[:digit:]]/utf,ucp,ascii_bsd
  data> ٣
   0: \x{663}

It has been suggested[1] that the change to match \p{Nd} when Unicode is
enabled for [:digit:] might have been unintentional and a bug, as [:digit:]
should be able to be POSIX compatible, so add a new flag
PCRE2_EXTRA_ASCII_DIGIT to avoid changing its definition in UCP mode.

[1] https://lore.kernel.org/git/CANgJU+U+xXsh9psd0z5Xjr+Se5QgdKkjQ7LUQ-PdUULSN3n4+g@mail.gmail.com/
1 parent 512be06 commit 6454934

19 files changed, +105 -35 lines changed

ChangeLog

+11 -9
@@ -55,23 +55,25 @@ change needed for 9(a) above; (b) fix bugs in ucptest,
 
 12. Integer overflow testing is now centralized in a new function.
 
-13. Made PCRE2_UCP the default in UTF mode in pcre2grep, and added new options
+13. Made PCRE2_UCP the default in UTF mode in pcre2grep, and added new options
 --case-restrict and --no-ucp.
 
-14. In the debugging printint module (which is normally only linked into
-pcre2test), avoid the use of a variable called "not" because that's deprecated
-in C and forbidden in C++. Also rewrite some code to avoid a goto into a block
+14. In the debugging printint module (which is normally only linked into
+pcre2test), avoid the use of a variable called "not" because that's deprecated
+in C and forbidden in C++. Also rewrite some code to avoid a goto into a block
 that bypassed its initialization (though it didn't actually matter).
 
-15. More minor code adjustments to avoid using reserved C++ words as variable
-names ("new" and "typename") and another jump that bypassed an (irrelevant)
+15. More minor code adjustments to avoid using reserved C++ words as variable
+names ("new" and "typename") and another jump that bypassed an (irrelevant)
 initialization.
 
-16. Merged a pull request that removed pcre2_ucptables.c from the list of files
-to compile in NON-AUTOTOOLS-BUILD because it is #included in pcre2_tables.c.
-Also adjusted the BUILD.bazel and build.zig files, which had the same issue. At
+16. Merged a pull request that removed pcre2_ucptables.c from the list of files
+to compile in NON-AUTOTOOLS-BUILD because it is #included in pcre2_tables.c.
+Also adjusted the BUILD.bazel and build.zig files, which had the same issue. At
 the same time, fixed a typo in the Bazel file.
 
+17. Add PCRE2_EXTRA_ASCII_DIGIT to allow [:digit:] to be kept on sync with \d
+even in UCP mode.
 
 Version 10.42 11-December-2022
 ------------------------------

doc/html/pcre2_set_compile_extra_options.html

+4 -3
@@ -35,10 +35,11 @@ <h1>pcre2_set_compile_extra_options man page</h1>
 PCRE2_EXTRA_ALT_BSUX Extended alternate \u, \U, and \x handling
 PCRE2_EXTRA_ASCII_BSD \d remains ASCII in UCP mode
 PCRE2_EXTRA_ASCII_BSS \s remains ASCII in UCP mode
-PCRE2_EXTRA_ASCII_BSW \w remains ASFII in UCP mode
-PCRE2_EXTRA_ASCII_POSIX POSIX classes remain ASCII in UCP mode
+PCRE2_EXTRA_ASCII_BSW \w remains ASCII in UCP mode
+PCRE2_EXTRA_ASCII_DIGIT [:digit:] POSIX class remains ASCII in UCP mode
+PCRE2_EXTRA_ASCII_POSIX POSIX classes remain ASCII in UCP mode
 PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL Treat all invalid escapes as a literal following character
-PCRE2_EXTRA_CASELESS_RESTRICT Disable mixed ASCII/non-ASCII case folding
+PCRE2_EXTRA_CASELESS_RESTRICT Disable mixed ASCII/non-ASCII case folding
 PCRE2_EXTRA_ESCAPED_CR_IS_LF Interpret \r as \n
 PCRE2_EXTRA_MATCH_LINE Pattern matches whole lines
 PCRE2_EXTRA_MATCH_WORD Pattern matches "words"

doc/html/pcre2api.html

+9 -4
@@ -1540,7 +1540,7 @@ <h1>pcre2api man page</h1>
 one other case, and for all characters whose code points are greater than
 U+007F. Note that there are two ASCII characters, K and S, that, in addition to
 their lower case ASCII equivalents, are case-equivalent with U+212A (Kelvin
-sign) and U+017F (long S) respectively. If you do not want this case
+sign) and U+017F (long S) respectively. If you do not want this case
 equivalence, you can suppress it by setting PCRE2_EXTRA_CASELESS_RESTRICT.
 </P>
 <P>
@@ -1887,7 +1887,7 @@ <h1>pcre2api man page</h1>
 This option has two effects. Firstly, it change the way PCRE2 processes \B,
 \b, \D, \d, \S, \s, \W, \w, and some of the POSIX character classes. By
 default, only ASCII characters are recognized, but if PCRE2_UCP is set, Unicode
-properties are used to classify characters. There are some PCRE2_EXTRA
+properties are used to classify characters. There are some PCRE2_EXTRA
 options (see below) that add finer control to this behaviour. More details are
 given in the section on
 <a href="pcre2pattern.html#genericchartypes">generic character types</a>
@@ -1994,6 +1994,11 @@ <h1>pcre2api man page</h1>
 This option forces \w to match only ASCII word characters, even when PCRE2_UCP
 is set. It can be changed within a pattern by means of the (?aW) option
 setting.
+<pre>
+PCRE2_EXTRA_ASCII_DIGIT
+</pre>
+This option forces the POSIX character class [:digit:] to match only ASCII
+digits, even when PCRE2_UCP is set.
 <pre>
 PCRE2_EXTRA_ASCII_POSIX
 </pre>
@@ -2029,8 +2034,8 @@ <h1>pcre2api man page</h1>
 case-equivalent character sets that contain both ASCII and non-ASCII
 characters. The ASCII letter S is case-equivalent to U+017f (long S) and the
 ASCII letter K is case-equivalent to U+212a (Kelvin sign). This option disables
-recognition of case-equivalences that cross the ASCII/non-ASCII boundary. In a
-caseless match, both characters must either be ASCII or non-ASCII. The option
+recognition of case-equivalences that cross the ASCII/non-ASCII boundary. In a
+caseless match, both characters must either be ASCII or non-ASCII. The option
 can be changed with a pattern by the (?r) option setting.
 <pre>
 PCRE2_EXTRA_ESCAPED_CR_IS_LF

doc/html/pcre2pattern.html

+1 -1
@@ -1526,7 +1526,7 @@ <h1>pcre2pattern man page</h1>
 [:alpha:] becomes \p{L}
 [:blank:] becomes \h
 [:cntrl:] becomes \p{Cc}
-[:digit:] becomes \p{Nd}
+[:digit:] becomes \p{Nd} unless PCRE2_EXTRA_ASCII_DIGIT is set
 [:lower:] becomes \p{Ll}
 [:space:] becomes \p{Xps}
 [:upper:] becomes \p{Lu}

doc/html/pcre2test.html

+1
@@ -631,6 +631,7 @@ <h1>pcre2test man page</h1>
 ascii_bsd set PCRE2_EXTRA_ASCII_BSD
 ascii_bss set PCRE2_EXTRA_ASCII_BSS
 ascii_bsw set PCRE2_EXTRA_ASCII_BSW
+ascii_digit set PCRE2_EXTRA_ASCII_DIGIT
 ascii_posix set PCRE2_EXTRA_ASCII_POSIX
 auto_callout set PCRE2_AUTO_CALLOUT
 bad_escape_is_literal set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL

doc/pcre2.txt

+6 -1
@@ -1953,6 +1953,11 @@ COMPILING A PATTERN
 PCRE2_UCP is set. It can be changed within a pattern by means of the
 (?aW) option setting.
 
+PCRE2_EXTRA_ASCII_DIGIT
+
+This option forces the POSIX character class [:digit:] to match only
+ASCII digits, even when PCRE2_UCP is set.
+
 PCRE2_EXTRA_ASCII_POSIX
 
 This option forces the POSIX character classes to match only ASCII
@@ -7688,7 +7693,7 @@ POSIX CHARACTER CLASSES
 [:alpha:] becomes \p{L}
 [:blank:] becomes \h
 [:cntrl:] becomes \p{Cc}
-[:digit:] becomes \p{Nd}
+[:digit:] becomes \p{Nd} unless PCRE2_EXTRA_ASCII_DIGIT is set
 [:lower:] becomes \p{Ll}
 [:space:] becomes \p{Xps}
 [:upper:] becomes \p{Lu}

doc/pcre2_set_compile_extra_options.3

+6 -3
@@ -27,15 +27,18 @@ options are:
 \ex handling
 PCRE2_EXTRA_ASCII_BSD \ed remains ASCII in UCP mode
 PCRE2_EXTRA_ASCII_BSS \es remains ASCII in UCP mode
-PCRE2_EXTRA_ASCII_BSW \ew remains ASFII in UCP mode
+PCRE2_EXTRA_ASCII_BSW \ew remains ASCII in UCP mode
+.\" JOIN
+PCRE2_EXTRA_ASCII_DIGIT [:digit:] POSIX class remains ASCII
+in UCP mode
 .\" JOIN
 PCRE2_EXTRA_ASCII_POSIX POSIX classes remain ASCII in
-UCP mode
+UCP mode
 .\" JOIN
 PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL Treat all invalid escapes as
 a literal following character
 .\" JOIN
-PCRE2_EXTRA_CASELESS_RESTRICT Disable mixed ASCII/non-ASCII
+PCRE2_EXTRA_CASELESS_RESTRICT Disable mixed ASCII/non-ASCII
 case folding
 PCRE2_EXTRA_ESCAPED_CR_IS_LF Interpret \er as \en
 PCRE2_EXTRA_MATCH_LINE Pattern matches whole lines

doc/pcre2api.3

+9 -4
@@ -1482,7 +1482,7 @@ PCRE2_UCP is set, Unicode properties are used for all characters with more than
 one other case, and for all characters whose code points are greater than
 U+007F. Note that there are two ASCII characters, K and S, that, in addition to
 their lower case ASCII equivalents, are case-equivalent with U+212A (Kelvin
-sign) and U+017F (long S) respectively. If you do not want this case
+sign) and U+017F (long S) respectively. If you do not want this case
 equivalence, you can suppress it by setting PCRE2_EXTRA_CASELESS_RESTRICT.
 .P
 For lower valued characters with only one other case, a lookup table is used
@@ -1838,7 +1838,7 @@ are not representable in UTF-16.
 This option has two effects. Firstly, it change the way PCRE2 processes \eB,
 \eb, \eD, \ed, \eS, \es, \eW, \ew, and some of the POSIX character classes. By
 default, only ASCII characters are recognized, but if PCRE2_UCP is set, Unicode
-properties are used to classify characters. There are some PCRE2_EXTRA
+properties are used to classify characters. There are some PCRE2_EXTRA
 options (see below) that add finer control to this behaviour. More details are
 given in the section on
 .\" HTML <a href="pcre2pattern.html#genericchartypes">
@@ -1953,6 +1953,11 @@ option setting.
 This option forces \ew to match only ASCII word characters, even when PCRE2_UCP
 is set. It can be changed within a pattern by means of the (?aW) option
 setting.
+.sp
+PCRE2_EXTRA_ASCII_DIGIT
+.sp
+This option forces the POSIX character class [:digit:] to match only ASCII
+digits, even when PCRE2_UCP is set.
 .sp
 PCRE2_EXTRA_ASCII_POSIX
 .sp
@@ -1987,8 +1992,8 @@ rules, which allow for more than two cases per character. There are two
 case-equivalent character sets that contain both ASCII and non-ASCII
 characters. The ASCII letter S is case-equivalent to U+017f (long S) and the
 ASCII letter K is case-equivalent to U+212a (Kelvin sign). This option disables
-recognition of case-equivalences that cross the ASCII/non-ASCII boundary. In a
-caseless match, both characters must either be ASCII or non-ASCII. The option
+recognition of case-equivalences that cross the ASCII/non-ASCII boundary. In a
+caseless match, both characters must either be ASCII or non-ASCII. The option
 can be changed with a pattern by the (?r) option setting.
 .sp
 PCRE2_EXTRA_ESCAPED_CR_IS_LF

doc/pcre2pattern.3

+1 -1
@@ -1522,7 +1522,7 @@ classes with other sequences, as follows:
 [:alpha:] becomes \ep{L}
 [:blank:] becomes \eh
 [:cntrl:] becomes \ep{Cc}
-[:digit:] becomes \ep{Nd}
+[:digit:] becomes \ep{Nd} unless PCRE2_EXTRA_ASCII_DIGIT is set
 [:lower:] becomes \ep{Ll}
 [:space:] becomes \ep{Xps}
 [:upper:] becomes \ep{Lu}

doc/pcre2test.1

+1
@@ -586,6 +586,7 @@ for a description of the effects of these options.
 ascii_bsd set PCRE2_EXTRA_ASCII_BSD
 ascii_bss set PCRE2_EXTRA_ASCII_BSS
 ascii_bsw set PCRE2_EXTRA_ASCII_BSW
+ascii_digit set PCRE2_EXTRA_ASCII_DIGIT
 ascii_posix set PCRE2_EXTRA_ASCII_POSIX
 auto_callout set PCRE2_AUTO_CALLOUT
 bad_escape_is_literal set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL

doc/pcre2test.txt

+1
@@ -566,6 +566,7 @@ PATTERN MODIFIERS
 ascii_bsd set PCRE2_EXTRA_ASCII_BSD
 ascii_bss set PCRE2_EXTRA_ASCII_BSS
 ascii_bsw set PCRE2_EXTRA_ASCII_BSW
+ascii_digit set PCRE2_EXTRA_ASCII_DIGIT
 ascii_posix set PCRE2_EXTRA_ASCII_POSIX
 auto_callout set PCRE2_AUTO_CALLOUT
 bad_escape_is_literal set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL

src/pcre2.h.generic

+1
@@ -158,6 +158,7 @@ D is inspected during pcre2_dfa_match() execution
 #define PCRE2_EXTRA_ASCII_BSS 0x00000200u /* C */
 #define PCRE2_EXTRA_ASCII_BSW 0x00000400u /* C */
 #define PCRE2_EXTRA_ASCII_POSIX 0x00000800u /* C */
+#define PCRE2_EXTRA_ASCII_DIGIT 0x00001000u /* C */
 
 /* These are for pcre2_jit_compile(). */
 
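The new option takes the next bit above PCRE2_EXTRA_ASCII_POSIX, so it can be OR-ed with the other extra options and tested on its own. A minimal sketch of that property, assuming local stand-in constants: the `X_` names and both helper functions are hypothetical, and only the numeric values are copied from the hunk above.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the values shown in the hunk; this is not pcre2.h. */
#define X_ASCII_BSS   0x00000200u
#define X_ASCII_BSW   0x00000400u
#define X_ASCII_POSIX 0x00000800u
#define X_ASCII_DIGIT 0x00001000u

/* Returns 1 if exactly one bit is set in v, otherwise 0. */
static int is_single_bit(uint32_t v)
{
  return v != 0 && (v & (v - 1)) == 0;
}

/* Returns 1 if the single-bit flag is set in the extra-options word. */
static int has_flag(uint32_t xoptions, uint32_t flag)
{
  assert(is_single_bit(flag));
  return (xoptions & flag) != 0;
}
```

Because the bits are disjoint, setting the digit flag leaves the \s, \w and general POSIX flags untouched.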

src/pcre2.h.in

+1
@@ -158,6 +158,7 @@ D is inspected during pcre2_dfa_match() execution
 #define PCRE2_EXTRA_ASCII_BSS 0x00000200u /* C */
 #define PCRE2_EXTRA_ASCII_BSW 0x00000400u /* C */
 #define PCRE2_EXTRA_ASCII_POSIX 0x00000800u /* C */
+#define PCRE2_EXTRA_ASCII_DIGIT 0x00001000u /* C */
 
 /* These are for pcre2_jit_compile(). */
 

src/pcre2_compile.c

+4 -2
@@ -786,7 +786,8 @@ are allowed. */
 PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES|PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL| \
 PCRE2_EXTRA_ESCAPED_CR_IS_LF|PCRE2_EXTRA_ALT_BSUX| \
 PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK|PCRE2_EXTRA_ASCII_BSD| \
-PCRE2_EXTRA_ASCII_BSS|PCRE2_EXTRA_ASCII_BSW|PCRE2_EXTRA_ASCII_POSIX)
+PCRE2_EXTRA_ASCII_BSS|PCRE2_EXTRA_ASCII_BSW|PCRE2_EXTRA_ASCII_POSIX| \
+PCRE2_EXTRA_ASCII_DIGIT)
 
 /* Compile time error code numbers. They are given names so that they can more
 easily be tracked. When a new number is added, the tables called eint1 and
@@ -3581,7 +3582,8 @@ while (ptr < ptrend)
 
 #ifdef SUPPORT_UNICODE
 if ((options & PCRE2_UCP) != 0 &&
-(xoptions & PCRE2_EXTRA_ASCII_POSIX) == 0)
+(xoptions & PCRE2_EXTRA_ASCII_POSIX) == 0 &&
+!(posix_class == 7 && (xoptions & PCRE2_EXTRA_ASCII_DIGIT) != 0))
 {
 int ptype = posix_substitutes[2*posix_class];
 int pvalue = posix_substitutes[2*posix_class + 1];
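Read as a predicate, the amended condition substitutes a Unicode property for a POSIX class only when three things hold: UCP mode is on, PCRE2_EXTRA_ASCII_POSIX is clear, and, for class index 7 (the index this hunk treats as [:digit:]), PCRE2_EXTRA_ASCII_DIGIT is also clear. A minimal sketch of just that guard as a standalone function follows. The `X_` constants and the function name are hypothetical stand-ins, and the value of `X_UCP` is arbitrary rather than taken from pcre2.h.

```c
#include <assert.h>
#include <stdint.h>

#define X_UCP             0x00000001u  /* arbitrary stand-in for PCRE2_UCP */
#define X_ASCII_POSIX     0x00000800u  /* value as in the header hunk */
#define X_ASCII_DIGIT     0x00001000u  /* value as in the header hunk */
#define POSIX_CLASS_DIGIT 7            /* index used by the condition above */

/* Returns 1 when the guard would allow the Unicode substitution for this
   class, 0 when the class stays ASCII. Models only the guard itself, not
   the substitution table lookup that follows it in the real code. */
static int uses_unicode_substitute(uint32_t options, uint32_t xoptions,
                                   int posix_class)
{
  assert(posix_class >= 0);
  if ((options & X_UCP) == 0) return 0;
  if ((xoptions & X_ASCII_POSIX) != 0) return 0;
  if (posix_class == POSIX_CLASS_DIGIT && (xoptions & X_ASCII_DIGIT) != 0)
    return 0;
  return 1;
}
```

With only the digit flag set, class 7 stays ASCII while every other class index still takes the Unicode path, which is the narrower behaviour the commit message asks for.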

src/pcre2test.c

+3 -1
@@ -651,6 +651,7 @@ static modstruct modlist[] = {
 { "ascii_bsd", MOD_CTC, MOD_OPT, PCRE2_EXTRA_ASCII_BSD, CO(extra_options) },
 { "ascii_bss", MOD_CTC, MOD_OPT, PCRE2_EXTRA_ASCII_BSS, CO(extra_options) },
 { "ascii_bsw", MOD_CTC, MOD_OPT, PCRE2_EXTRA_ASCII_BSW, CO(extra_options) },
+{ "ascii_digit", MOD_CTC, MOD_OPT, PCRE2_EXTRA_ASCII_DIGIT, CO(extra_options) },
 { "ascii_posix", MOD_CTC, MOD_OPT, PCRE2_EXTRA_ASCII_POSIX, CO(extra_options) },
 { "auto_callout", MOD_PAT, MOD_OPT, PCRE2_AUTO_CALLOUT, PO(options) },
 { "bad_escape_is_literal", MOD_CTC, MOD_OPT, PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL, CO(extra_options) },
@@ -4294,13 +4295,14 @@ show_compile_extra_options(uint32_t options, const char *before,
 const char *after)
 {
 if (options == 0) fprintf(outfile, "%s <none>%s", before, after);
-else fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s",
+else fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s",
 before,
 ((options & PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES) != 0)? " allow_surrogate_escapes" : "",
 ((options & PCRE2_EXTRA_ALT_BSUX) != 0)? " alt_bsux" : "",
 ((options & PCRE2_EXTRA_ASCII_BSD) != 0)? " ascii_bsd" : "",
 ((options & PCRE2_EXTRA_ASCII_BSS) != 0)? " ascii_bss" : "",
 ((options & PCRE2_EXTRA_ASCII_BSW) != 0)? " ascii_bsw" : "",
+((options & PCRE2_EXTRA_ASCII_DIGIT) != 0)? " ascii_digit" : "",
 ((options & PCRE2_EXTRA_ASCII_POSIX) != 0)? " ascii_posix" : "",
 ((options & PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL) != 0)? " bad_escape_is_literal" : "",
 ((options & PCRE2_EXTRA_CASELESS_RESTRICT) != 0)? " caseless_restrict" : "",

testdata/testinput5

+9 -1
@@ -1215,6 +1215,8 @@
 
 /[[:digit:]]/B,ucp
 
+/[[:digit:]]/B,ucp,ascii_digit
+
 /[[:graph:]]/B,ucp
 
 /[[:print:]]/B,ucp
@@ -1227,7 +1229,7 @@
 
 /[[:xdigit:]]/B,ucp
 
-# Unicode properties for \b abd \B
+# Unicode properties for \b and \B
 
 /\b...\B/utf,ucp
 abc_
@@ -2431,6 +2433,12 @@
 /[[:digit:]]+/utf,ucp
 123\x{660}456
 
+/[[:digit:]]+/utf,ucp,ascii_digit
+123\x{660}456
+
+/[[:digit:]]+/g,utf,ucp,ascii_digit
+123\x{660}456
+
 /[[:digit:]]+/utf,ucp,ascii_posix
 123\x{660}456
 

testdata/testinput7

+8 -2
@@ -1657,7 +1657,7 @@
 /^[\p{Xwd}]+/utf
 ABCD1234\x{6ca}\x{a6c}\x{10a7}_
 
-# Unicode properties for \b abd \B
+# Unicode properties for \b and \B
 
 /\b...\B/utf,ucp
 abc_
@@ -2435,9 +2435,15 @@
 /[[:digit:]]+/utf,ucp
 123\x{660}456
 
+/[[:digit:]]+/utf,ucp,ascii_digit
+123\x{660}456
+
+/[[:digit:]]+/g,utf,ucp,ascii_digit
+123\x{660}456
+
 /[[:digit:]]+/utf,ucp,ascii_posix
 123\x{660}456
-
+
 />[[:space:]]+</utf,ucp
 >\x{a0} \x{a0}<
 >\x{a0}\x{a0}\x{a0}<

testdata/testoutput5

+18 -1
@@ -2520,6 +2520,14 @@ No match
 End
 ------------------------------------------------------------------
 
+/[[:digit:]]/B,ucp,ascii_digit
+------------------------------------------------------------------
+Bra
+[0-9]
+Ket
+End
+------------------------------------------------------------------
+
 /[[:graph:]]/B,ucp
 ------------------------------------------------------------------
 Bra
@@ -2568,7 +2576,7 @@ No match
 End
 ------------------------------------------------------------------
 
-# Unicode properties for \b abd \B
+# Unicode properties for \b and \B
 
 /\b...\B/utf,ucp
 abc_
@@ -5359,6 +5367,15 @@ No match
 123\x{660}456
 0: 123\x{660}456
 
+/[[:digit:]]+/utf,ucp,ascii_digit
+123\x{660}456
+0: 123
+
+/[[:digit:]]+/g,utf,ucp,ascii_digit
+123\x{660}456
+0: 123
+0: 456
+
 /[[:digit:]]+/utf,ucp,ascii_posix
 123\x{660}456
 0: 123
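The two matches expected for the /g test follow from U+0660 no longer counting as a digit once ascii_digit is set, so it splits the subject into two ASCII runs. A rough standalone model of that behaviour is sketched below, assuming the subject is UTF-8 encoded (U+0660 is then the bytes 0xD9 0xA0). `ascii_digit_runs` is a hypothetical helper and not part of PCRE2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scans the NUL-terminated string s and records up to max maximal runs of
   ASCII digits: starts[i] is the byte offset of run i and lens[i] its
   length. Returns the number of runs recorded. Bytes of multi-byte UTF-8
   sequences are never in '0'..'9', so they always end a run. */
static size_t ascii_digit_runs(const char *s, size_t *starts, size_t *lens,
                               size_t max)
{
  size_t n, i = 0, count = 0;
  assert(s != NULL);
  n = strlen(s);
  while (i < n && count < max)
  {
    if (s[i] >= '0' && s[i] <= '9')
    {
      size_t begin = i;
      while (i < n && s[i] >= '0' && s[i] <= '9') i++;
      starts[count] = begin;
      lens[count] = i - begin;
      count++;
    }
    else i++;
  }
  return count;
}
```

Called on the bytes of 123, U+0660, 456, this model reports a run at offset 0 and another at offset 5, each three bytes long, mirroring the `0: 123` and `0: 456` lines in the expected output.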
