**[Longest Common Subsequence Distance & Similarity](https://en.wikipedia.org/wiki/Longest_common_subsequence_problem)**: edit with insertion and deletion
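With only insertion and deletion allowed, the edit distance follows directly from the length of the longest common subsequence: every unit outside the LCS must be deleted from one input or inserted from the other. The sketch below is a plain-Python illustration of that relationship, not pytextdist's implementation; the function names and the similarity normalization (dividing by the combined length) are assumptions made for the example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)  # DP row for the previous unit of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_edit_distance(a, b):
    """Insertions plus deletions needed to turn `a` into `b`."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


def lcs_edit_similarity(a, b):
    """Assumed normalization: 1 - distance / (len(a) + len(b))."""
    total = len(a) + len(b)
    return 1.0 if total == 0 else 1.0 - lcs_edit_distance(a, b) / total


print(lcs_edit_distance("kitten", "sitting"))  # 5 (LCS is "ittn", length 4)
```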
```python
-import pytextdist
+from pytextdist.edit_distance import lcs_distance, lcs_similarity
> **[Damerau-Levenshtein Distance & Similarity](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance)**: edit with insertion, deletion, substitution, and transposition of two adjacent units
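To make the role of adjacent transposition concrete, here is a minimal sketch of the optimal string alignment (restricted) variant of Damerau-Levenshtein distance. Which variant pytextdist implements is not stated here, so treat `osa_distance` as a hypothetical stand-in rather than the library's code.

```python
def osa_distance(a, b):
    """Optimal string alignment distance: insertion, deletion,
    substitution, and transposition of two adjacent units."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition: "xy" in a lines up with "yx" in b.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("kitten", "sitting"))  # 3 (no transposition applies)
print(osa_distance("ca", "ac"))           # 1 (one adjacent transposition)
```

Plain Levenshtein distance would score the second pair as 2, since it has to model the swap as two substitutions.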
+**[Damerau-Levenshtein Distance & Similarity](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance)**: edit with insertion, deletion, substitution, and transposition of two adjacent units
+
+```python
+from pytextdist.edit_distance import damerau_levenshtein_distance, damerau_levenshtein_similarity
+
+str_a = 'kitten'
+str_b = 'sitting'
+dist = damerau_levenshtein_distance(str_a,str_b)
+simi = damerau_levenshtein_similarity(str_a,str_b)
> **[Hamming Distance & Similarity](https://en.wikipedia.org/wiki/Hamming_distance)**: edit with substitution
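Because substitution is the only allowed edit, Hamming distance is defined only for inputs of equal length: it counts the positions at which the two inputs differ. The sketch below illustrates this in plain Python. Raising `ValueError` on unequal lengths is an assumption for the example; how pytextdist itself reacts to such input is not shown here.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        # Assumed behavior for this sketch only.
        raise ValueError("Hamming distance requires inputs of equal length")
    return sum(x != y for x, y in zip(a, b))


print(hamming("karolin", "kathrin"))  # 3
```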
+**[Hamming Distance & Similarity](https://en.wikipedia.org/wiki/Hamming_distance)**: edit with substitution; note that the Hamming metric only works for phrases of the same length
+
+```python
+from pytextdist.edit_distance import hamming_distance, hamming_similarity
By default, functions in this module use unigrams (single words) to build vectors and calculate similarity. Set `n` to another value to compare n-grams (n contiguous words) instead.
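As a rough illustration of what building word n-gram vectors means, the sketch below counts n-grams with `collections.Counter` and compares the two count vectors by cosine similarity. The tokenization (lowercasing and splitting on whitespace) and the helper names are assumptions for the example; pytextdist's own preprocessing may differ.

```python
import math
from collections import Counter


def word_ngrams(text, n=1):
    """Count the n-grams (tuples of n contiguous words) in `text`."""
    words = text.lower().split()  # assumed tokenization
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def ngram_cosine(text_a, text_b, n=1):
    """Cosine similarity between the n-gram count vectors of two texts."""
    vec_a, vec_b = word_ngrams(text_a, n), word_ngrams(text_b, n)
    dot = sum(vec_a[g] * vec_b[g] for g in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


print(word_ngrams("see your tax return", n=2))
print(ngram_cosine("a b c", "a b d", n=2))  # shares ("a", "b") only
```

Larger values of `n` make the comparison more sensitive to word order, since two texts only share an n-gram when the same `n` words appear contiguously in both.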
from pytextdist.vector_similarity import cosine_similarity
+
+phrase_a = 'For Paperwork Reduction Act Notice, see your tax return instructions. For Paperwork Reduction Act Notice, see your tax return instructions.'
+phrase_b = 'For Disclosure, Privacy Act, and Paperwork Reduction Act Notice, see separate instructions. Form 1040'
from pytextdist.vector_similarity import jaccard_similarity
+
+phrase_a = 'For Paperwork Reduction Act Notice, see your tax return instructions. For Paperwork Reduction Act Notice, see your tax return instructions.'
+phrase_b = 'For Disclosure, Privacy Act, and Paperwork Reduction Act Notice, see separate instructions. Form 1040'
from pytextdist.vector_similarity import sorensen_dice_similarity
+
+phrase_a = 'For Paperwork Reduction Act Notice, see your tax return instructions. For Paperwork Reduction Act Notice, see your tax return instructions.'
+phrase_b = 'For Disclosure, Privacy Act, and Paperwork Reduction Act Notice, see separate instructions. Form 1040'
from pytextdist.vector_similarity import qgram_similarity
+
+phrase_a = 'For Paperwork Reduction Act Notice, see your tax return instructions. For Paperwork Reduction Act Notice, see your tax return instructions.'
+phrase_b = 'For Disclosure, Privacy Act, and Paperwork Reduction Act Notice, see separate instructions. Form 1040'