ES Proposal, specs, tests, reference implementation, and polyfill/shim for String.prototype.matchAll

littledan/String.prototype.matchAll

 
 

String.prototype.matchAll

Proposal, specs, tests, reference implementation, and polyfill/shim for String.prototype.matchAll

Spec

You can view the spec in markdown format or rendered as HTML.

Rationale

If I have a string, and either a sticky or a global regular expression which has multiple capturing groups, I often want to iterate through all of the matches. Currently, my options are the following:

	var regex = /t(e)(st(\d?))/g;
	var string = 'test1test2';

	string.match(regex); // gives ['test1', 'test2'] - how do i get the capturing groups?

	var matches = [];
	var lastIndexes = {};
	var match;
	lastIndexes[regex.lastIndex] = true;
	while (match = regex.exec(string)) {
		lastIndexes[regex.lastIndex] = true;
		matches.push(match);
		// example: ['test1', 'e', 'st1', '1'] with properties `index` and `input`
	}
	matches; /* gives exactly what i want, but uses a loop,
			* and mutates the regex's `lastIndex` property */
	lastIndexes; /* ideally should give { 0: true } but instead
			* will have a value for each mutation of lastIndex */

	var matches = [];
	string.replace(regex, function () {
		var match = Array.prototype.slice.call(arguments, 0, -2);
		match.input = arguments[arguments.length - 1];
		match.index = arguments[arguments.length - 2];
		matches.push(match);
		// example: ['test1', 'e', 'st1', '1'] with properties `index` and `input`
	});
	matches; /* gives exactly what i want, but abuses `replace`,
		  * mutates the regex's `lastIndex` property,
		  * and requires manual construction of `match` */

The first example does not provide the capturing groups, so it isn’t an option. The latter two examples both visibly mutate lastIndex. This is not a huge issue (beyond ideological) with built-in RegExps; however, with subclassable RegExps in ES6/ES2015, it is a somewhat messy way to obtain the desired information on all matches.

Thus, String#matchAll would solve this use case by both providing access to all of the capturing groups, and not visibly mutating the regular expression object in question.
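For comparison with the loops above, here is a minimal sketch of the same task written with matchAll. It assumes an engine or polyfill that implements the proposal; the variable name found and the shape of the collected objects are illustrative only.

```javascript
// Sketch only: assumes String.prototype.matchAll is available
// (natively or through the polyfill in this repository).
const regex = /t(e)(st(\d?))/g;
const string = 'test1test2';

const found = [];
for (const match of string.matchAll(regex)) {
	// Each match has the same shape as a result from regex.exec:
	// the full match, then the capturing groups, plus `index` and `input`.
	found.push({ groups: Array.from(match), index: match.index });
}

// The regex passed in is not visibly mutated by the iteration.
const lastIndexAfter = regex.lastIndex;
```

Here found holds one entry per match, each with all of its capturing groups, and lastIndexAfter stays at 0 because the iteration works on an internal copy of the regular expression.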

Iterator versus Array

Many use cases may want an array of matches; however, clearly not all will. With a particularly large number of capturing groups, or a large string, always gathering every match into an array might have performance implications. Because matchAll returns an iterator, the caller can trivially collect it into an array with the spread operator or Array.from, but need not do so.
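The trade-off can be sketched as follows, again assuming an engine or polyfill that implements the proposal. The names text, pattern, all, digits, and firstWithTwo are made up for this example.

```javascript
// Sketch only: assumes String.prototype.matchAll is available.
const text = 'test1test2test3';
const pattern = /t(e)(st(\d?))/g;

// Eager: collect every match into an array when that is what the caller wants.
const all = [...text.matchAll(pattern)];
const digits = Array.from(text.matchAll(pattern), (m) => m[3]);

// Lazy: walk the iterator and stop early, so later matches are never computed.
let firstWithTwo = null;
for (const m of text.matchAll(pattern)) {
	if (m[3] === '2') {
		firstWithTwo = m;
		break;
	}
}
```

The eager forms pay for every match up front, while the loop stops at the second match and never looks at the rest of the string.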

Previous discussions

Naming

The name matchAll was selected to correspond with match, and to connote that all matches would be returned, not just a single match. This includes the connotation that the provided regex will be used with a global flag, to locate all matches in the string. An alternate name, matches, has been suggested. It follows the precedent set by keys/values/entries, where a plural noun indicates that the method returns an iterator. However, includes returns a boolean. When a word is not unambiguously a noun or a verb, "plural noun" doesn't seem as obvious a convention to follow.

Update from committee feedback: Ruby uses the word scan for this, but the committee is not comfortable introducing a new word to JavaScript. matchEach was suggested, but some were not comfortable with its naming similarity to forEach, given that the API is quite different. matchAll seems to be the name everyone is most comfortable with.
