Updated Readme #324

dheerajck · 2023-04-27T21:04:19Z

Updated readme

Added examples of WRatio, QRatio and updated score values
Added examples of string preprocessing

README.md

Co-authored-by: Max Bachmann <kontakt@maxbachmann.de>

dheerajck · 2023-04-28T11:20:31Z

Don't you think the parameter name processor can be confusing, and that something like string_preprocessor would be a better name?

maxbachmann · 2023-04-28T13:04:05Z

> Don't you think the parameter name processor can be confusing, and that something like string_preprocessor would be a better name?

I agree it is not a perfect name. The naming stems from fuzzywuzzy using the named argument processor in their process.* APIs. I added the argument to every scorer, which in hindsight wasn't a great idea. It saves the user very little typing:

Levenshtein.distance(s1, s2, processor=utils.default_process)

vs

Levenshtein.distance(utils.default_process(s1), utils.default_process(s2))

In addition, the performance difference is pretty small. For short sequences (<16 characters) the second form appears a couple of percent faster, and for longer ones calling the processor internally appears to be around 10% faster. So it only makes a difference when working with very fast scorers like Prefix/Postfix/Hamming and long sequences. Even then, when comparing multiple sequences you're better off using the scorer with the process.* APIs.

For the process.* APIs that is a different story, since:

- it saves more typing
- I am able to call the preprocessing function in a more performant way

For these reasons I was actually playing with the thought of deprecating the processor argument in scorers.

dheerajck added 2 commits April 28, 2023 02:09

Updated readme, Added examples of WRatio, QRatio and updated score va…

da41224

…lues

Updated readme, Added examples of string preprocessing

ae7cf1d

dheerajck mentioned this pull request Apr 27, 2023

Change of behaviour from 2.15.0 to 3.0.0 ?? #322

Closed

maxbachmann reviewed Apr 27, 2023

Updated readme, updated reference link

2ad7111

Co-authored-by: Max Bachmann <kontakt@maxbachmann.de>

maxbachmann merged commit e1bf959 into rapidfuzz:main Apr 28, 2023
