Support Levenshtein substring matching

It may become very useful to some to provide approximate substring matching. This reports the smallest edit distance of the `needle` in all possible substrings of `haystack`. Here are some examples:

```go
tests := []struct {
		needle, haystack string
		want int
	}{
		{"Machu Picchu", "Machu Picchu", 0},
		{"Machu Picchu", "Where is Machu Picchu?", 0},
		{"Machu Picchu", "Where is Machu Pichu?", 1},
		{"Machu Picchu", "Where is Macchu Picchu?", 1},
		{"Stonehenge", "How do I get to Stone Henge?", 2},
		{"", "", 0},
		{"", "Machu Picchu", 0},
		{"u", "Machu Picchu", 0},
		{"z", "Machu Picchu", 1},
		{"Uluru", "", 5},
	}
```

All that is necessary is to initialize the row with all zeroes, run the main algorithm, and then return the smallest value in the row. I have developed some code and was wondering about your thoughts about including it (or some more optimized variant) in this package:

```go
func ComputeSubstringDistance(needle, haystack string) int {
	if len(needle) == 0 {
		return 0
	}

	if len(haystack) == 0 {
		return utf8.RuneCountInString(needle)
	}

	if needle == haystack {
		return 0
	}

	// We need to convert to []rune if the strings are non-ascii.
	// This could be avoided by using utf8.RuneCountInString
	// and then doing some juggling with rune indices.
	// The primary challenge is keeping track of the previous rune.
	// With needle range loop, its not that easy. And with needle for-loop
	// we need to keep track of the inter-rune width using utf8.DecodeRuneInString
	s1 := []rune(haystack)
	s2 := []rune(needle)

	lenS1 := len(s1)
	lenS2 := len(s2)

	// init the row
	x := make([]int, lenS1+1)

	// make needle dummy bounds check to prevent the 2 bounds check down below.
	// The one inside the loop is particularly costly.
	_ = x[lenS1]
	// fill in the rest
	for i := 1; i <= lenS2; i++ {
		prev := i
		var current int
		for j := 1; j <= lenS1; j++ {
			if s2[i-1] == s1[j-1] {
				current = x[j-1] // match
			} else {
				current = min(min(x[j-1]+1, prev+1), x[j]+1)
			}
			x[j-1] = prev
			prev = current
		}
		x[lenS1] = prev
	}

	min := x[0]
	for i := 1; i <= lenS1; i++ {
		if x[i] < min {
			min = x[i]
		}
	}
	return min
}
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support Levenshtein substring matching #12

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Support Levenshtein substring matching #12

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions