Low-hanging fruit from the R repository #106

Merged (15 commits) on Aug 16, 2024
Fix dynamic narrow mode.
* Rename `cur_max` to `mb_cur_max` to avoid confusion with `curr_max`.
* Adjust some comments, drop reference to R bugzilla ticket.
* Fix one case where we were still checking the static `TRE_MB_CUR_MAX`
  instead of the dynamic `mb_cur_max`.
* Drop an invalid assertion.

Inspired by the R repository.

Fixes: 8e84229
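
For readers unfamiliar with the distinction, here is a minimal sketch of why a check against the static locale width and a check against the per-compilation width can disagree. Everything prefixed `sketch_` or `SKETCH_` is a hypothetical stand-in, not TRE's actual definitions, and the locale width is passed in as a plain parameter instead of being read from `MB_CUR_MAX` so that the behavior is deterministic.

```c
#include <assert.h>

/* Hypothetical flag value standing in for REG_USEBYTES. */
#define SKETCH_USEBYTES 0x1

/* Hypothetical, reduced parse context: only the fields this sketch needs. */
typedef struct {
    int cflags;
    int mb_cur_max; /* the width limit in use for this compilation */
} sketch_ctx_t;

/* Mirrors the assignment in the diff: byte mode forces a width of 1,
   otherwise the locale's width is used. */
static int sketch_effective_mb_cur_max(int cflags, int locale_mb_cur_max)
{
    return (cflags & SKETCH_USEBYTES) ? 1 : locale_mb_cur_max;
}

/* The dynamic check: consults the context, not the locale. */
static int sketch_narrow_dynamic(const sketch_ctx_t *ctx)
{
    return ctx->mb_cur_max == 1;
}

/* The static check: consults only the locale, so it ignores byte mode. */
static int sketch_narrow_static(int locale_mb_cur_max)
{
    return locale_mb_cur_max == 1;
}
```

With a multibyte locale (say, a width of 4) and the byte-mode flag set, the dynamic check reports narrow mode while the static check does not. That is the situation in which an assertion on the static value would fail even though the 8-bit path was chosen legitimately.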
dag-erling committed Jul 30, 2024
commit f54c02b4081f0f43a4f5274d2acf9ff59880a7c4
6 changes: 3 additions & 3 deletions lib/tre-compile.c
@@ -1880,8 +1880,8 @@ tre_compile(regex_t *preg, const tre_char_t *regex, size_t n, int cflags)
 parse_ctx.len = n;
 parse_ctx.cflags = cflags;
 parse_ctx.max_backref = -1;
-/* workaround for PR#14408: use 8-bit optimizations in 8-bit mode */
-parse_ctx.cur_max = (cflags & REG_USEBYTES) ? 1 : TRE_MB_CUR_MAX;
+/* Use 8-bit optimizations in 8-bit mode */
+parse_ctx.mb_cur_max = (cflags & REG_USEBYTES) ? 1 : TRE_MB_CUR_MAX;
 DPRINT(("tre_compile: parsing '%.*" STRF "'\n", (int)n, regex));
 errcode = tre_parse(&parse_ctx);
 if (errcode != REG_OK)
@@ -2021,7 +2021,7 @@ tre_compile(regex_t *preg, const tre_char_t *regex, size_t n, int cflags)
 /* If in eight bit mode, compute a table of characters that can be the
 first character of a match. */
 tnfa->first_char = -1;
-if (TRE_MB_CUR_MAX == 1 && !tmp_ast_l->nullable)
+if (parse_ctx.mb_cur_max == 1 && !tmp_ast_l->nullable)
 {
 int count = 0;
 tre_cint_t k;
3 changes: 1 addition & 2 deletions lib/tre-parse.c
@@ -128,7 +128,6 @@ tre_expand_ctype(tre_mem_t mem, tre_ctype_t class, tre_ast_node_t ***items,
 reg_errcode_t status = REG_OK;
 tre_cint_t c;
 int j, min = -1, max = 0;
-assert(TRE_MB_CUR_MAX == 1);

 DPRINT((" expanding class to character ranges\n"));
 for (j = 0; (j < 256) && (status == REG_OK); j++)
@@ -332,7 +331,7 @@ tre_parse_bracket_items(tre_parse_ctx_t *ctx, int negate,
 if (!class)
 status = REG_ECTYPE;
 /* Optimize character classes for 8 bit character sets. */
-if (status == REG_OK && ctx->cur_max == 1)
+if (status == REG_OK && ctx->mb_cur_max == 1)
 {
 status = tre_expand_ctype(ctx->mem, class, items,
 &i, &max_i, ctx->cflags);
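
As background on the 8-bit optimization touched by this hunk, the idea of "expanding class to character ranges" can be sketched roughly as follows. This is an illustrative reconstruction under assumptions, not the body of `tre_expand_ctype`: the names `sketch_range_t` and `sketch_expand_class` are hypothetical, the class is modelled as a `<ctype.h>`-style predicate, and ranges are written to a caller-supplied array instead of AST nodes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical inclusive range of byte values. */
typedef struct {
    int min;
    int max;
} sketch_range_t;

/* Scan byte values 0..255 and collect each maximal run accepted by `pred`.
   Returns the number of runs found; at most `cap` of them are stored. */
static size_t sketch_expand_class(int (*pred)(int), sketch_range_t *out,
                                  size_t cap)
{
    size_t n = 0;
    int min = -1; /* start of the current run, or -1 if none is open */
    int j;

    for (j = 0; j < 256; j++) {
        if (pred(j)) {
            if (min < 0)
                min = j; /* open a new run */
        } else if (min >= 0) {
            if (n < cap) {
                out[n].min = min;
                out[n].max = j - 1;
            }
            n++;
            min = -1; /* close the run */
        }
    }
    if (min >= 0) { /* a run that extends to 255 */
        if (n < cap) {
            out[n].min = min;
            out[n].max = 255;
        }
        n++;
    }
    return n;
}
```

Such an expansion only makes sense when every character fits in a single byte, which is why the caller guards it with the `mb_cur_max == 1` test.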
4 changes: 2 additions & 2 deletions lib/tre-parse.h
@@ -36,8 +36,8 @@ typedef struct {
 int nofirstsub;
 /* The currently set approximate matching parameters. */
 int params[TRE_PARAM_LAST];
-/* the CUR_MAX in use */
-int cur_max;
+/* the MB_CUR_MAX in use */
+int mb_cur_max;
 } tre_parse_ctx_t;

 /* Parses a wide character regexp pattern into a syntax tree. This parser