An Elixir library for approximate (fuzzy) string matching.
If available in Hex, the package can be installed
by adding peach to your list of dependencies in mix.exs:
    def deps do
      [
        {:peach, "~> 0.2.0"}
      ]
    end

Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/peach.
To run the tests, run mix test.
To test with CSV data, create a folder called function_test_data inside the test/ folder and put the following CSVs in it:

- normalise_whitespace.csv
- remove_punc.csv
- pre_process.csv
- remove_emojis.csv
- normalise_text.csv
- replace_punc.csv
- get_brief.csv
- remove_numbers.csv

Then run mix test, or:

    mix test.watch test/peach_test.exs --max-failures=1 --seed=0
Below are some examples of how Peach might be used for the kind of fuzzy matching needed in the first tier of a menu-centred chatbot.
    input = "2.)"
    keyword_set = MapSet.new(["1", "2", "menu"])

    matches =
      input
      |> Peach.pre_process()
      |> Peach.find_exact_match(keyword_set)

    assert matches == "2"

    input = "_menu_"
    keyword_set = MapSet.new(["1", "2", "menu"])

    matches =
      input
      |> Peach.pre_process()
      |> Peach.find_exact_match(keyword_set)

    assert matches == "menu"

    input = "menuu"
    keyword_set = MapSet.new(["menu", "optin", "optout"])
    threshold = 1

    matches =
      input
      |> Peach.pre_process()
      |> Peach.find_fuzzy_matches(keyword_set, threshold)

    assert matches == [{"menu", 1}]

    input = "optint"
    keyword_threshold_set = MapSet.new([{"menu", 1}, {"optin", 2}, {"optout", 2}])

    matches =
      input
      |> Peach.pre_process()
      |> Peach.find_fuzzy_matches(keyword_threshold_set)

    assert matches == [{"optin", 1}, {"optout", 2}]
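
The numbers paired with each keyword in the fuzzy examples are consistent with an edit distance (for instance, "menuu" is one deletion away from "menu"). To make that idea concrete, here is a minimal sketch of threshold-based matching using Levenshtein distance. It is only an illustration under that assumption, not Peach's actual implementation: the `FuzzySketch` module and its `distance/2` and `fuzzy_matches/3` functions are hypothetical names invented for this example.

```elixir
defmodule FuzzySketch do
  @moduledoc "Hypothetical illustration only; not part of Peach."

  # Levenshtein edit distance between two strings, computed one row at a time.
  def distance(a, b) do
    b_chars = String.graphemes(b)
    # Row 0: cost of turning the empty string into each prefix of `b`.
    first_row = Enum.to_list(0..length(b_chars))

    a
    |> String.graphemes()
    |> Enum.with_index(1)
    |> Enum.reduce(first_row, fn {a_char, i}, prev_row ->
      next_row(a_char, i, b_chars, prev_row)
    end)
    |> List.last()
  end

  # Builds row `i` of the distance table from the previous row.
  defp next_row(a_char, i, b_chars, [diag0 | prev_rest]) do
    {row_rev, _diag, _left} =
      b_chars
      |> Enum.zip(prev_rest)
      |> Enum.reduce({[i], diag0, i}, fn {b_char, above}, {acc, diag, left} ->
        cost = if a_char == b_char, do: 0, else: 1
        # Cheapest of deletion, insertion, and substitution (or match).
        cell = Enum.min([above + 1, left + 1, diag + cost])
        {[cell | acc], above, cell}
      end)

    Enum.reverse(row_rev)
  end

  # Keywords within `threshold` edits of `input`, closest first.
  def fuzzy_matches(input, keywords, threshold) do
    keywords
    |> Enum.map(fn keyword -> {keyword, distance(input, keyword)} end)
    |> Enum.filter(fn {_keyword, dist} -> dist <= threshold end)
    |> Enum.sort_by(fn {keyword, dist} -> {dist, keyword} end)
  end
end

keywords = MapSet.new(["menu", "optin", "optout"])
IO.inspect(FuzzySketch.fuzzy_matches("menuu", keywords, 1))
```

With a single global threshold of 2, the input "optint" would match both "optin" (distance 1) and "optout" (distance 2), which mirrors the shape of the last example above.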