We propose to add methods to regexp that allow iterating over matches instead of having to accumulate all the matches into a slice.
This is one of a collection of proposals updating the standard library for the new 'range over function' feature (#61405). It would only be accepted if that proposal is accepted. See #61897 for a list of related proposals.
Regexp has a lot of methods that return slices of all matches (the “FindAll*” methods). Each should have an iterator equivalent that doesn’t build the slice. They can be named by removing the “Find” prefix. The docs would change as follows. (Plain text is unchanged; strikethrough is removed, bold is added):
There are ~~16~~ **24** methods of Regexp that match a regular expression and identify the matched text. Their names are matched by this regular expression: (Find|All|FindAll)?(String)?(Submatch)?(Index)?
If 'All' is present, the routine matches successive non-overlapping matches of the entire expression.
**The ‘Find’ form returns the first match. The ‘All’ form returns an iterator over all matches.**
Empty matches abutting a preceding match are ignored.
~~The~~ **For ‘FindAll’,** the return value is a slice containing the successive return values of the corresponding ~~non-’All’~~ **non-‘Find’** routine. ~~These~~ **The ‘FindAll’** routines take an extra integer argument, ...
Instead of enumerating all eight methods here, let’s just show one example.
FindAllString currently reads:
// FindAllString is the 'All' version of FindString; it returns a slice of all
// successive matches of the expression, as defined by the 'All' description in
// the package comment. A return value of nil indicates no match.
func (re *Regexp) FindAllString(s string, n int) []string
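For comparison, here is a small sketch of how the existing slice-returning method behaves today. The pattern and input strings are made up for illustration; only `regexp.MustCompile` and `FindAllString` from the current standard library are used.

```go
package main

import (
	"fmt"
	"regexp"
)

// digits is an example pattern chosen for illustration only.
var digits = regexp.MustCompile(`[0-9]+`)

func main() {
	s := "10 apples, 20 pears, 30 plums"

	// n < 0 asks for every match; the whole slice is built before returning.
	fmt.Println(digits.FindAllString(s, -1)) // [10 20 30]

	// n >= 0 caps the number of matches returned.
	fmt.Println(digits.FindAllString(s, 2)) // [10 20]

	// No match yields a nil slice.
	fmt.Println(digits.FindAllString("no numbers", -1) == nil) // true
}
```

A caller that only needs the first few matches must either guess a suitable n up front or pay for the full slice.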
This would change to become a pair of methods:
// FindAllString is the ~~'All'~~ **'FindAll'** version of FindString; it returns a slice of all
// successive matches of the expression, as defined by the ~~'All'~~ **'FindAll'** description in
// the package comment. A return value of nil indicates no match.
func (re *Regexp) FindAllString(s string, n int) []string
// AllString is the ‘All’ version of ‘FindString’; it returns an iterator over all
// successive matches of the expression, as defined by the ‘All’ description in
// the package comment.
func (re *Regexp) AllString(s string) iter.Seq[string]
The full list is:
// All is the ‘All’ version of ‘Find’: it returns an iterator over all ...
func (re *Regexp) All(b []byte) iter.Seq[[]byte]

// AllIndex is the ‘All’ version of ‘FindIndex’: it returns an iterator over all ...
func (re *Regexp) AllIndex(b []byte) iter.Seq[[]int]

// AllString is the ‘All’ version of ‘FindString’: it returns an iterator over all ...
func (re *Regexp) AllString(s string) iter.Seq[string]

// AllStringIndex is the ‘All’ version of ‘FindStringIndex’: it returns an iterator over all ...
func (re *Regexp) AllStringIndex(s string) iter.Seq[[]int]

// AllStringSubmatch is the ‘All’ version of ‘FindStringSubmatch’: it returns an iterator ...
func (re *Regexp) AllStringSubmatch(s string) iter.Seq[[]string]

// AllStringSubmatchIndex is the ‘All’ version of ‘FindStringSubmatchIndex’: it returns ...
func (re *Regexp) AllStringSubmatchIndex(s string) iter.Seq[[]int]

// AllSubmatch is the ‘All’ version of ‘FindSubmatch’: it returns an iterator over all ...
func (re *Regexp) AllSubmatch(b []byte) iter.Seq[[][]byte]

// AllSubmatchIndex is the ‘All’ version of ‘FindSubmatchIndex’: it returns an iterator ...
func (re *Regexp) AllSubmatchIndex(b []byte) iter.Seq[[]int]
There would also be a new SplitSeq method alongside regexp.Regexp.Split, completing the analogy with strings.Split and strings.SplitSeq.
// SplitSeq returns an iterator over substrings of s separated by the expression.
func (re *Regexp) SplitSeq(s string) iter.Seq[string]