[performance] Improve the performance of damerau_levenshtein #32

Merged (4 commits) on Apr 8, 2019

lovasoa commented Mar 30, 2019

My tests show the new version is approximately 2.5 times faster than
the old one. Most of the gains come from using hashbrown for the
hashmap.
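
To show where the hash map sits in the algorithm, here is a minimal sketch of an unrestricted Damerau-Levenshtein distance. It is not the code merged in this PR: the function name `damerau_levenshtein_sketch` and the flat-matrix layout are assumptions made for illustration. It uses `std::collections::HashMap` so that it builds without dependencies. `hashbrown::HashMap` offers the same `new`, `get` and `insert` calls, so swapping the import should be enough to try it with hashbrown.

```rust
use std::collections::HashMap;

/// Unrestricted Damerau-Levenshtein distance between two strings,
/// counted in `char`s. Illustrative sketch only (hypothetical name).
fn damerau_levenshtein_sketch(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (n, m) = (a.len(), b.len());
    if n == 0 {
        return m;
    }
    if m == 0 {
        return n;
    }

    // The matrix has one extra sentinel row and column, so the logical
    // cell d[i][j] (i, j >= -1) lives at flat index (i + 1) * width + (j + 1).
    let max_dist = n + m;
    let width = m + 2;
    let mut d = vec![0usize; (n + 2) * width];
    d[0] = max_dist;
    for i in 0..=n {
        d[(i + 1) * width] = max_dist; // sentinel column
        d[(i + 1) * width + 1] = i; // distance from a[..i] to ""
    }
    for j in 0..=m {
        d[j + 1] = max_dist; // sentinel row
        d[width + j + 1] = j; // distance from "" to b[..j]
    }

    // Maps each character of `a` to the last (1-based) row where it was seen.
    // This lookup runs once per matrix cell, which is why the choice of
    // hash map matters for performance.
    let mut last_row: HashMap<char, usize> = HashMap::new();

    for i in 1..=n {
        // Last (1-based) column in this row where a[i - 1] matched b.
        let mut last_match_col = 0;
        for j in 1..=m {
            let k = *last_row.get(&b[j - 1]).unwrap_or(&0);
            let l = last_match_col;
            let cost = if a[i - 1] == b[j - 1] {
                last_match_col = j;
                0
            } else {
                1
            };
            let substitution = d[i * width + j] + cost;
            let insertion = d[(i + 1) * width + j] + 1;
            let deletion = d[i * width + j + 1] + 1;
            // Transpose a[k - 1] and b[l - 1], paying for the characters in between.
            let transposition = d[k * width + l] + (i - k - 1) + 1 + (j - l - 1);
            d[(i + 1) * width + j + 1] = substitution
                .min(insertion)
                .min(deletion)
                .min(transposition);
        }
        last_row.insert(a[i - 1], i);
    }

    d[(n + 1) * width + m + 1]
}
```

For example, `damerau_levenshtein_sketch("ca", "abc")` returns 2, the classic case where the unrestricted distance is smaller than the optimal string alignment distance.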

@lovasoa
Contributor Author

lovasoa commented Mar 30, 2019

@dguo

Cargo bench before:

test benches::bench_damerau_levenshtein            ... bench:      19,828 ns/iter (+/- 1,693)
test benches::bench_hamming                        ... bench:          61 ns/iter (+/- 19)
test benches::bench_jaro                           ... bench:         958 ns/iter (+/- 147)
test benches::bench_jaro_winkler                   ... bench:         985 ns/iter (+/- 47)
test benches::bench_levenshtein                    ... bench:       1,491 ns/iter (+/- 139)
test benches::bench_levenshtein_on_u8              ... bench:       1,633 ns/iter (+/- 96)
test benches::bench_normalized_damerau_levenshtein ... bench:      20,257 ns/iter (+/- 787)
test benches::bench_normalized_levenshtein         ... bench:       1,516 ns/iter (+/- 162)
test benches::bench_osa_distance                   ... bench:       2,983 ns/iter (+/- 277)

and after:

test benches::bench_damerau_levenshtein            ... bench:       8,208 ns/iter (+/- 774)
test benches::bench_hamming                        ... bench:          72 ns/iter (+/- 8)
test benches::bench_jaro                           ... bench:       1,001 ns/iter (+/- 225)
test benches::bench_jaro_winkler                   ... bench:         994 ns/iter (+/- 85)
test benches::bench_levenshtein                    ... bench:       1,523 ns/iter (+/- 95)
test benches::bench_levenshtein_on_u8              ... bench:       1,680 ns/iter (+/- 233)
test benches::bench_normalized_damerau_levenshtein ... bench:       8,874 ns/iter (+/- 641)
test benches::bench_normalized_levenshtein         ... bench:       1,602 ns/iter (+/- 182)
test benches::bench_osa_distance                   ... bench:       3,095 ns/iter (+/- 144)
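
As a quick check of the numbers, the speedup per benchmark can be computed from the two runs above. This is a small sketch: the helper names `speedup` and `report` are made up for illustration, and the figures are copied from the `cargo bench` output in this comment.

```rust
/// How many times faster `after_ns` is than `before_ns`.
fn speedup(before_ns: f64, after_ns: f64) -> f64 {
    before_ns / after_ns
}

/// (benchmark, ns/iter before, ns/iter after), copied from the runs above.
const RESULTS: [(&str, f64, f64); 9] = [
    ("bench_damerau_levenshtein", 19_828.0, 8_208.0),
    ("bench_hamming", 61.0, 72.0),
    ("bench_jaro", 958.0, 1_001.0),
    ("bench_jaro_winkler", 985.0, 994.0),
    ("bench_levenshtein", 1_491.0, 1_523.0),
    ("bench_levenshtein_on_u8", 1_633.0, 1_680.0),
    ("bench_normalized_damerau_levenshtein", 20_257.0, 8_874.0),
    ("bench_normalized_levenshtein", 1_516.0, 1_602.0),
    ("bench_osa_distance", 2_983.0, 3_095.0),
];

/// One formatted line per benchmark, e.g. "bench_x ... 2.00x".
fn report() -> Vec<String> {
    RESULTS
        .iter()
        .map(|(name, before, after)| {
            format!("{:<36} {:.2}x", name, speedup(*before, *after))
        })
        .collect()
}
```

With these figures, the two Damerau-Levenshtein benchmarks come out at roughly 2.4x and 2.3x, while the other benchmarks, whose code this PR does not target, stay close to 1x.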

@dguo
Member

dguo commented Apr 6, 2019

Hey, thanks. This looks awesome. My only concern is bumping the required Rust version to 1.31 since it was only released four months ago. Is that a requirement for hashbrown?

@lovasoa
Contributor Author

lovasoa commented Apr 7, 2019

Yes, it's a requirement of hashbrown: https://github.com/Amanieu/hashbrown/blob/master/CHANGELOG.md

@lovasoa
Contributor Author

lovasoa commented Apr 7, 2019

Many crates depend on Rust 1.31, since it marked the beginning of the 2018 edition.

dguo merged commit d6717db into rapidfuzz:master on Apr 8, 2019
@dguo
Member

dguo commented Apr 8, 2019

Sounds good. A project that can't upgrade yet for some reason can always use an older version.

I published the change as v0.9.1.
