Closed
Description
`Regex` currently has two constructors, `Regex::new` and `Regex::with_size_limit`. The latter is the same as the former, except it allows one to bound the size of the compiled program. The idea here is to control how much memory is used if a regex is compiled from an untrusted source.
There are other knobs that seem useful for exposing to the user:
- The recursion limit used in `regex-syntax` for recursively simplifying a regex. (It'd be nice to move this to a stack on the heap, but it seems tricky.)
- The cache size used in the lazy DFA. For regexes that create lots of distinct states, it's possible to realize big gains if you're willing to spend the memory to do it. Currently, it is set to a constant of ~2MB.
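As a rough illustration of why the cache size matters, here is a sketch of a state cache with a memory budget. The `StateCache` type, its cost accounting, and the "wipe everything when full" policy are assumptions made for this example; they are not taken from the crate's lazy DFA.

```rust
use std::collections::HashMap;

/// Toy cache of DFA states, keyed by the set of NFA states each one
/// represents. Hypothetical type, for illustration only.
struct StateCache {
    states: HashMap<Vec<usize>, usize>,
    bytes_used: usize,
    budget: usize,
    /// How many times the cache had to be wiped.
    flushes: usize,
}

impl StateCache {
    fn new(budget: usize) -> Self {
        StateCache {
            states: HashMap::new(),
            bytes_used: 0,
            budget,
            flushes: 0,
        }
    }

    /// Return the id for `nfa_states`, adding a new state if absent.
    /// If adding it would exceed the budget, wipe the cache first.
    fn get_or_insert(&mut self, nfa_states: Vec<usize>) -> usize {
        if let Some(&id) = self.states.get(&nfa_states) {
            return id;
        }
        // Approximate cost: just the key's payload.
        let cost = nfa_states.len() * std::mem::size_of::<usize>();
        if self.bytes_used + cost > self.budget {
            self.states.clear();
            self.bytes_used = 0;
            self.flushes += 1;
        }
        let id = self.states.len();
        self.bytes_used += cost;
        self.states.insert(nfa_states, id);
        id
    }
}
```

Every flush throws away states that may have to be rebuilt later, so a regex that generates many distinct states thrashes under a small `budget` and runs smoothly under a large one. That is the trade-off a configurable cache size would hand to the caller.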
There are other knobs that maybe shouldn't be exposed, but could be:
- Control the budget for extracting literal prefixes.
- Control whether the DFA is used. (There exist regexes and inputs where avoiding the lazy DFA is actually faster, but it's probably hard to know when this is without some experience with the internals of this crate.)
- Control whether common suffixes are factored out of compiled programs. This could reduce compilation times at the expense of bigger programs.
It's not clear what, if any, of these things should be exposed. I feel like the knobs that control memory bounds should be accessible to callers, because when you need them, you really need them.
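One possible shape for exposing the memory-bound knobs is a builder, sketched below. The names (`RegexOptions`, `RegexOptionsBuilder`, `size_limit`, `nest_limit`, `dfa_cache_size`) are hypothetical, and apart from the ~2MB DFA cache mentioned above the default values are placeholders, not the crate's real defaults. The sketch only collects options; it does not compile anything.

```rust
/// Hypothetical bundle of tunable limits.
#[derive(Debug, Clone, PartialEq)]
struct RegexOptions {
    pattern: String,
    /// Upper bound on the compiled program size, in bytes.
    size_limit: usize,
    /// Upper bound on recursion depth while simplifying the syntax tree.
    nest_limit: usize,
    /// Memory budget for the lazy DFA's state cache, in bytes.
    dfa_cache_size: usize,
}

/// Hypothetical builder; each setter overrides one default.
struct RegexOptionsBuilder(RegexOptions);

impl RegexOptionsBuilder {
    fn new(pattern: &str) -> Self {
        RegexOptionsBuilder(RegexOptions {
            pattern: pattern.to_string(),
            size_limit: 10 * (1 << 20),    // placeholder default
            nest_limit: 200,               // placeholder default
            dfa_cache_size: 2 * (1 << 20), // the ~2MB constant noted above
        })
    }

    fn size_limit(mut self, bytes: usize) -> Self {
        self.0.size_limit = bytes;
        self
    }

    fn nest_limit(mut self, depth: usize) -> Self {
        self.0.nest_limit = depth;
        self
    }

    fn dfa_cache_size(mut self, bytes: usize) -> Self {
        self.0.dfa_cache_size = bytes;
        self
    }

    /// A real implementation would compile the regex here and return
    /// an error if any limit were exceeded; this sketch just hands
    /// back the collected options.
    fn build(self) -> RegexOptions {
        self.0
    }
}
```

A caller who needs a bigger DFA cache would write `RegexOptionsBuilder::new(r"\w+").dfa_cache_size(64 * (1 << 20)).build()` and leave everything else at its default. The appeal of a builder over more constructors is that new knobs can be added later without breaking existing callers.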