expose more knobs #166

Closed
@BurntSushi

Regex currently has two constructors, Regex::new and Regex::with_size_limit. The latter is the same as the former, except that it lets the caller bound the size of the compiled program. The idea here is to control how much memory is used when a regex is compiled from an untrusted source.

There are other knobs that seem useful for exposing to the user:

  • The recursion limit used in regex-syntax for recursively simplifying a regex. (It'd be nice to move this to a stack on the heap, but it seems tricky.)
  • The cache size used in the lazy DFA. For regexes that create lots of distinct states, it's possible to realize big gains if you're willing to spend the memory to do it. Currently, it is set to a constant of ~2MB.
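
To make the cache-size trade-off concrete, here is a toy model, not the crate's actual lazy DFA. It assumes a cache that is wiped whenever adding a state would exceed its byte budget; `StateCache`, `get_or_build`, `simulate`, and the 100-byte state cost are all hypothetical names and numbers chosen for illustration.

```rust
use std::collections::HashMap;

/// Toy stand-in for a lazily built state cache (hypothetical, for illustration).
pub struct StateCache {
    budget_bytes: usize,
    used_bytes: usize,
    states: HashMap<u64, Vec<u8>>,
    /// Number of times a state had to be (re)built.
    pub builds: usize,
    /// Number of times the whole cache was wiped.
    pub flushes: usize,
}

impl StateCache {
    pub fn new(budget_bytes: usize) -> StateCache {
        StateCache {
            budget_bytes,
            used_bytes: 0,
            states: HashMap::new(),
            builds: 0,
            flushes: 0,
        }
    }

    /// Return the state for `key`, building it (`cost` bytes) on a miss.
    /// Assumption: when the budget would be exceeded, everything is dropped.
    pub fn get_or_build(&mut self, key: u64, cost: usize) -> &[u8] {
        if !self.states.contains_key(&key) {
            if self.used_bytes + cost > self.budget_bytes {
                self.states.clear();
                self.used_bytes = 0;
                self.flushes += 1;
            }
            self.states.insert(key, vec![0; cost]);
            self.used_bytes += cost;
            self.builds += 1;
        }
        &self.states[&key]
    }
}

/// Visit `distinct` states in order, `rounds` times, and report how many
/// state builds were needed under the given budget.
pub fn simulate(budget_bytes: usize, distinct: u64, rounds: usize) -> usize {
    let mut cache = StateCache::new(budget_bytes);
    for _ in 0..rounds {
        for key in 0..distinct {
            cache.get_or_build(key, 100);
        }
    }
    cache.builds
}
```

In this model, a budget that fits the whole working set builds each state once, while a budget that is too small rebuilds states on every pass. That is the kind of gain a caller could buy with more memory if the cache size were configurable.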

There are other knobs that maybe shouldn't be exposed, but could:

  • Control the budget for extracting literal prefixes.
  • Control whether the DFA is used. (There exist regexes and inputs where avoiding the lazy DFA is actually faster, but it's probably hard to know when this is the case without some experience with the internals of this crate.)
  • Control whether common suffixes are factored out of compiled programs. This could reduce compilation times at the expense of bigger programs.

It's not clear what, if any, of these things should be exposed. I feel like the knobs that control memory bounds should be accessible to callers, because when you need them, you really need them.
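
One possible shape for exposing the memory-related knobs is a builder, sketched below. Everything here is hypothetical: `Knobs`, `KnobsBuilder`, `Config`, and the method names are not part of the crate, and the defaults other than the ~2MB DFA cache mentioned above are placeholders.

```rust
/// Hypothetical set of memory-related knobs (illustrative names only).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Knobs {
    /// Upper bound, in bytes, on the size of the compiled program.
    pub size_limit: usize,
    /// Upper bound on recursion depth while simplifying the syntax tree.
    pub nest_limit: usize,
    /// Upper bound, in bytes, on the lazy DFA's state cache.
    pub dfa_cache_size: usize,
}

impl Default for Knobs {
    fn default() -> Knobs {
        Knobs {
            size_limit: 10 * (1 << 20),    // placeholder default
            nest_limit: 200,               // placeholder default
            dfa_cache_size: 2 * (1 << 20), // the ~2MB constant from the issue
        }
    }
}

/// A pattern together with the knobs it should be compiled under.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Config {
    pub pattern: String,
    pub knobs: Knobs,
}

/// Hypothetical builder: callers override only the knobs they care about.
pub struct KnobsBuilder {
    config: Config,
}

impl KnobsBuilder {
    pub fn new(pattern: &str) -> KnobsBuilder {
        KnobsBuilder {
            config: Config {
                pattern: pattern.to_string(),
                knobs: Knobs::default(),
            },
        }
    }

    pub fn size_limit(mut self, bytes: usize) -> KnobsBuilder {
        self.config.knobs.size_limit = bytes;
        self
    }

    pub fn nest_limit(mut self, depth: usize) -> KnobsBuilder {
        self.config.knobs.nest_limit = depth;
        self
    }

    pub fn dfa_cache_size(mut self, bytes: usize) -> KnobsBuilder {
        self.config.knobs.dfa_cache_size = bytes;
        self
    }

    /// Hand back the finished configuration; actual compilation is omitted.
    pub fn finish(self) -> Config {
        self.config
    }
}
```

A caller handling untrusted patterns might then write `KnobsBuilder::new(pat).size_limit(1 << 20).finish()`, leaving the other knobs at their defaults. A builder also leaves room to add the less obvious knobs later without growing the number of constructors.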
