A template-string-based alternative to, or addition to, the RegExp.escape proposal. Inspired by the discussion here and on this issue.
This proposal is a stage 0 (strawman) proposal and is awaiting specification, implementation and input.
See this issue. We often want to build a regular expression out of a string without treating special characters in that string as regular expression tokens. For example, if we want to replace all occurrences of the string `Hello.` which we got from the user, we might be tempted to write `ourLongText.replace(new RegExp(text, "g"))`, but this would match the `.` against any character rather than only a literal dot.
This is a fairly common use in regular expressions and standardizing it would be useful.
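The pitfall can be reproduced in a few lines. This is only an illustrative sketch: the sample text, the replacement string `"Bye."`, and the hand-written escaping character class are assumptions made for the example and are not part of the proposal.

```javascript
// Hypothetical inputs, chosen only to illustrate the problem.
const text = "Hello.";                      // pretend this came from the user
const ourLongText = "Hello. Hello! Hello?";

// Unescaped: "." acts as a wildcard, so all three greetings are replaced.
const naive = ourLongText.replace(new RegExp(text, "g"), "Bye.");
// naive === "Bye. Bye. Bye."

// Escaped by hand: every regular expression syntax character gets a backslash prefix.
const escaped = text.replace(/[\\^$.*+?()[\]{}|]/g, "\\$&");
const careful = ourLongText.replace(new RegExp(escaped, "g"), "Bye.");
// careful === "Bye. Hello! Hello?"
```

Hand-rolled character classes like the one above are easy to get subtly wrong, which is part of the motivation for standardizing the behavior.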
In other languages there is a way to escape a literal through a tool similar to the RegExp.escape proposal:
- Perl: quotemeta(str) - see the docs
- PHP: preg_quote(str) - see the docs
- Python: re.escape(str) - see the docs
- Ruby: Regexp.escape(str) - see the docs
- Java: Pattern.quote(str) - see the docs
- C#, VB.NET: Regex.Escape(str) - see the docs
Note that these languages differ in exactly what they escape (Perl does something different from C#), but they all share the same goal. Most of these languages have some form of template strings, but most lack tagged template strings like those in ES, which enable the syntax and form this solution uses.
We propose the addition of a `RegExp.tag` function, such that strings can be escaped in order to be used inside regular expressions:
var str = prompt("Please enter a string");
let re = RegExp.tag()`${str}.*${str}A`;
alert(re.test(str)); // tests str against: str (literal), then anything, then str (literal) followed by "A"
There is an alternative proposal here for a `RegExp.escape` function. Like that proposal, this one relies on the spec's `SyntaxCharacter` list, so the set of escaped characters stays in sync with the specification instead of being specified manually. This is unlike earlier proposals.
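To make the intended behavior concrete, here is a minimal userland sketch of such a tag. It is not the proposed specification text: the name `regExpTag`, the assumption that the outer call takes a flags string, and the hard-coded character class (copied from the current `SyntaxCharacter` production) are all choices made for this illustration. The template's literal parts are used raw, and only interpolated values are escaped.

```javascript
// Hypothetical stand-in for RegExp.tag; the flags parameter is an assumption.
// The class below lists the current SyntaxCharacter set: ^ $ \ . * + ? ( ) [ ] { } |
const SYNTAX_CHARACTERS = /[\^$\\.*+?()[\]{}|]/g;

function regExpTag(flags = "") {
  return function (strings, ...values) {
    // strings.raw keeps backslashes in the literal parts (e.g. \d) intact.
    let source = strings.raw[0];
    values.forEach((value, i) => {
      // Escape only the interpolated value, then append the next literal part.
      source += String(value).replace(SYNTAX_CHARACTERS, "\\$&") + strings.raw[i + 1];
    });
    return new RegExp(source, flags);
  };
}

const re = regExpTag()`${"a.b"}.*${"(c)"}A`;
// re.source === "a\\.b.*\\(c\\)A"
```

With this sketch, `re.test("a.b then (c)A")` is true while `re.test("axb then (c)A")` is false, because the interpolated dot only matches a literal dot. A built-in version could consult the engine's own grammar rather than a hard-coded list.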
## Cross-Cutting Concerns
The list of escaped characters should be kept in sync with what the regular expression grammar considers to be syntax characters that need escaping. For this reason, instead of hard-coding the list, we escape the characters that the engine recognizes as `SyntaxCharacter`s. For example, if regex comments are ever added to the specification (presumably under a flag), this ensures they are properly escaped.