Commit e143b79

updated parser Intents to accept stem typos with 1 character missing

- updated parser Intents _scramble to add variations to typo_stem where 1 character (other than the first and last) is missing, for stems longer than 5 characters
- added test cases for this and for typo character scrambling
- added an example for this to README.md
- increased version number to 1.2.2

1 parent 1b15a81 commit e143b79

4 files changed: +12 -2 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ Will `match` the intent `"to_do"` in the following sentences:
 - Teljesen **kicsinálva** érzem magamat ettől a melegtől.
 - **Csinálhatott** volna mást is.
 - **Visszacsinalnad** az ekezeteket a billentyuzetemen, kerlek?
+- Vigyázz, hogy el ne gépeld a **csniálni** igét!

 By defining the `wordclass` and `stem` of a keyword, **Lara** will generate possible patterns for text matching, without having to rely on large dictionaries!

lara/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 # Lara - Lingusitic Aim Recognizer API

 __all__ = 'nlp','parser','stemmer','entities'
-__version__ = '1.2.1'
+__version__ = '1.2.2'
 __version_info__ = tuple(int(num) for num in __version__.split('.'))

 import lara.nlp
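
The `__version_info__` line in the hunk above turns the dotted version string into a tuple of integers, which compares numerically rather than character by character. A minimal sketch of why that matters, wrapping the same expression in a hypothetical helper named `version_info` (the helper name is an assumption for illustration and is not part of Lara):

```python
def version_info(version):
    """Split a dotted version string into a tuple of ints.

    Hypothetical helper; the body is the expression used in lara/__init__.py.
    """
    return tuple(int(num) for num in version.split('.'))


old = version_info('1.2.1')
new = version_info('1.2.2')

# Tuples compare element by element, so the bumped version sorts higher.
print(new > old)

# Plain string comparison gets multi-digit components wrong; tuples do not.
print(version_info('1.10.0') > version_info('1.9.0'), '1.10.0' > '1.9.0')
```

The expression assumes every component is a plain integer; a suffix such as `rc1` would raise `ValueError`.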

lara/parser.py

Lines changed: 5 additions & 0 deletions
@@ -238,8 +238,13 @@ def _generate(self, item):
     def _scramble(self,text,is_adjective=False):
         if len(text)>3:
             typo = [text[1:-1]]
+            remove_one = bool(len(text)>5)
             for i in range(len(text)-3):
                 typo.append(re.escape(text[1:i+1]+text[i+2]+text[i+1]+text[i+3:-1]))
+                if remove_one:
+                    typo.append(re.escape(text[1:i+1]+text[i+2]+text[i+3:-1]))
+            if remove_one:
+                typo.append(re.escape(text[1:-2]))
             is_consonant= lara.nlp.is_consonant(text[-1])
             text = re.escape(text[0])+'(?:'+('|'.join(typo))+')(?:'+re.escape(text[-1])
             text = '[\s\-]?'.join(text.split('\ '))
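
To see what the added lines contribute, the loop can be pulled out into a standalone function. The following is a simplified sketch, not Lara's implementation: `typo_variants` and `typo_pattern` are hypothetical names, the pattern omits the suffix handling that `_scramble` appends after the last character, and the stem `csinál` is an assumption inferred from the example sentences.

```python
import re


def typo_variants(stem):
    """List the accepted inner parts (everything between first and last character).

    Hypothetical helper mirroring the loop in the diff; expects len(stem) >= 2.
    """
    variants = [stem[1:-1]]
    if len(stem) <= 3:
        return variants
    remove_one = len(stem) > 5
    for i in range(len(stem) - 3):
        # Swap the adjacent characters at positions i+1 and i+2.
        variants.append(stem[1:i+1] + stem[i+2] + stem[i+1] + stem[i+3:-1])
        if remove_one:
            # Drop the character at position i+1.
            variants.append(stem[1:i+1] + stem[i+2] + stem[i+3:-1])
    if remove_one:
        # Drop the second-to-last character, which the loop never reaches.
        variants.append(stem[1:-2])
    return variants


def typo_pattern(stem):
    """Compile a simplified pattern: first character, any variant, last character."""
    inner = '|'.join(re.escape(variant) for variant in typo_variants(stem))
    return re.compile(re.escape(stem[0]) + '(?:' + inner + ')' + re.escape(stem[-1]))


pattern = typo_pattern("csinál")
for word in ("csinálni", "csniálni", "megcsiáltad"):
    print(word, bool(pattern.search(word)))
```

The first and last characters stay fixed in every variant, which is why the commit message excludes them from removal. The `len(stem) > 5` guard presumably keeps short stems from collapsing into patterns that match unrelated words.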

tests/test_parser.py

Lines changed: 5 additions & 1 deletion
@@ -68,7 +68,9 @@ def test_parser_intents_add(intents,text,match):
     "Megcsináltatták a berendezést.",
     "Teljesen kicsinálva érzem magamat ettől a melegtől.",
     "Csinálhatott volna mást is.",
-    "Visszacsinalnad az ekezeteket a billentyuzetemen, kerlek?"
+    "Visszacsinalnad az ekezeteket a billentyuzetemen, kerlek?",
+    "Szépen megcsiáltad a feladatot csak kihagytál egy karaktert!",
+    "Vigyázz, hogy el ne gépeld a csniálni igét!"
 ],
 [
     {'to_do': 2},
@@ -77,6 +79,8 @@ def test_parser_intents_add(intents,text,match):
     {'to_do': 2},
     {'to_do': 2},
     {'to_do': 2},
+    {'to_do': 1},
+    {'to_do': 1},
     {'to_do': 1}
 ]
 ),
