-
-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove unnecessary length checks + improved indexing of charCodeCache #10
Conversation
Can you explain your changes? Those things are there for a reason. |
@@ -65,7 +53,7 @@ module.exports = function (a, b) { | |||
var j = 0; | |||
|
|||
while (i < aLen) { | |||
charCodeCache[start + i] = a.charCodeAt(start + i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not necessary to add start
here to the charCodeCache
, as long as we do the same in the loop.
This make the bench ~4-5% faster on my cpu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense indeed. Can't remember why I did that thusly.
@@ -37,10 +29,6 @@ module.exports = function (a, b) { | |||
bLen--; | |||
} | |||
|
|||
if (aLen === 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is taken care of in the prefix loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, the check at line 56 kicks in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once again this is an early exit and result should be benchmarked.
return bLen; | ||
} | ||
|
||
if (bLen === 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here aLen is always smaller or equal than bLen due to the swap, i.e. always false (if not or equal to aLen)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what you mean is, that if bLen
is 0
, then aLen
must be 0
as well and the check at line 56
will kick in again and returns 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, here I guess that since we swapped the strings, we can remove this condition since if both strings have 0 length, condition in line 7 will handle it.
@@ -20,14 +20,6 @@ module.exports = function (a, b) { | |||
var aLen = a.length; | |||
var bLen = b.length; | |||
|
|||
if (aLen === 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is taken care of in the suffix loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If aLen
is equal to 0
, the check at line 56 kicks in ending up in the same result. So yes, the result is the same. However, this is an early exit and thus you can argue that the faster we can exit and the less code we should execute, the better. But yes, this one can safely be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right indeed. But I guess we should benchmark the change to see if this yields any positive result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agreed it could be faster in some corner-cases
In general, I think removing or keeping the checks will not make a big difference performance-wise. Maybe keep some of them for clarity? |
I guess the line 27 check can be removed. One solution would be to comment the checks clearly as early exits. |
I will come back to this after work today |
Can we speak about this also? You seem to have an edge there :). Haven't had much time to read your code but it seems you can speed up computations by performing several computations in one iteration of the loop? Is that it? |
Yes thats it, I do loop unrolling on the outer loop and calculating 4 rows at a time in the inner loop :) I did some reviewing regarding the checks I came to the following:
In case 1 and 3 the checks makes it slightly faster, but makes all "fall through" cases slightly slower. What do you think? |
Sorry but I might misunderstand what was decided, did you want me to change the PR to keep the checks? |
@Yomguithereal What do you think? |
I don't remember clearly where we were but I guess the idea would be, concerning the remaining and maybe redundant checks, to only keep them if the benchmark differences are negligible and if it enhances readability of the algorithm so one does not have to understand that we don't make such check here because it can be found aggregated with another one somewhere else. |
I agree that we can drop n°2. For the other ones, I guess the "fall through" scenario is the most common one to be expected in real use cases. So @gustf would you advocate to remove them (by how much is that faster in your benchmarks)? |
Afterwards, with your accord @gustf, I guess we could work on integrating and understanding your loop unrolling into this library? On a side note, the Levensthein distance is often used to see if two strings are similar, which can be said if their distance is less than some threshold. In those cases, you will gain tremendous performance by using a capped Levenshtein distance function such as this one. |
@Yomguithereal Sorry for late reply. I haven't done any benchmarking with or without the checks but I think it will not really make a big difference. But since there will be less comparisons in the general case I think removing them makes sense. And sure, we can do some work integrating the loop unrolling. I will have some more time in a few weeks. Very busy at work now before summer holidays. |
Well I guess this can be merged then, @sindresorhus? |
## The dependency [leven](https://github.com/sindresorhus/leven) was updated from `2.1.0` to `3.0.0`. This version is **not covered** by your **current version range**. If you don’t accept this pull request, your project will work just like it did before. However, you might be missing out on a bunch of new features, fixes and/or performance improvements from the dependency update. --- <details> <summary>Release Notes for v3.0.0</summary> <p>Breaking:</p> <ul> <li>Require Node.js 6 (<a class="issue-link js-issue-link" data-error-text="Failed to load issue title" data-id="419602561" data-permission-text="Issue title is private" data-url="sindresorhus/leven#12" data-hovercard-type="pull_request" data-hovercard-url="/sindresorhus/leven/pull/12/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/pull/12">#12</a>) <a class="commit-link" data-hovercard-type="commit" data-hovercard-url="https://github.com/sindresorhus/leven/commit/3eb8c04c42c763be76f713965c2847443f365aa8/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/commit/3eb8c04c42c763be76f713965c2847443f365aa8"><tt>3eb8c04</tt></a></li> </ul> <p>Enhancements:</p> <ul> <li>Improve performance (<a class="issue-link js-issue-link" data-error-text="Failed to load issue title" data-id="228838401" data-permission-text="Issue title is private" data-url="sindresorhus/leven#10" data-hovercard-type="pull_request" data-hovercard-url="/sindresorhus/leven/pull/10/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/pull/10">#10</a>) <a class="commit-link" data-hovercard-type="commit" data-hovercard-url="https://github.com/sindresorhus/leven/commit/37573fe34466202f0f2cdbf04cefe5d1968eed88/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/commit/37573fe34466202f0f2cdbf04cefe5d1968eed88"><tt>37573fe</tt></a></li> <li>Add TypeScript definition (<a class="issue-link js-issue-link" data-error-text="Failed to load issue title" data-id="419602561" data-permission-text="Issue title is private" data-url="sindresorhus/leven#12" data-hovercard-type="pull_request" data-hovercard-url="/sindresorhus/leven/pull/12/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/pull/12">#12</a>) <a class="commit-link" data-hovercard-type="commit" data-hovercard-url="https://github.com/sindresorhus/leven/commit/3eb8c04c42c763be76f713965c2847443f365aa8/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/commit/3eb8c04c42c763be76f713965c2847443f365aa8"><tt>3eb8c04</tt></a></li> </ul> <p><a class="commit-link" href="https://urls.greenkeeper.io/sindresorhus/leven/compare/v2.1.0...v3.0.0"><tt>v2.1.0...v3.0.0</tt></a></p> </details> <details> <summary>Commits</summary> <p>The new version differs by 4 commits.</p> <ul> <li><a href="https://urls.greenkeeper.io/sindresorhus/leven/commit/254cdab92b2a3154dfb219ec0fcaa9645096e0f1"><code>254cdab</code></a> <code>3.0.0</code></li> <li><a href="https://urls.greenkeeper.io/sindresorhus/leven/commit/06fda13f82dd3b86581f6815fa6e6a0b8cbed733"><code>06fda13</code></a> <code>Add a more realistic benchmark fixture</code></li> <li><a href="https://urls.greenkeeper.io/sindresorhus/leven/commit/3eb8c04c42c763be76f713965c2847443f365aa8"><code>3eb8c04</code></a> <code>Require Node.js 6, add TypeScript definition (#12)</code></li> <li><a href="https://urls.greenkeeper.io/sindresorhus/leven/commit/37573fe34466202f0f2cdbf04cefe5d1968eed88"><code>37573fe</code></a> <code>Remove unnecessary length checks + improved indexing of charCodeCache (#10)</code></li> </ul> <p>See the <a href="https://urls.greenkeeper.io/sindresorhus/leven/compare/0630566a69b5a73aae2e52bb47ea863892a4b5f0...254cdab92b2a3154dfb219ec0fcaa9645096e0f1">full diff</a></p> </details> <details> <summary>FAQ and help</summary> There is a collection of [frequently asked questions](https://greenkeeper.io/faq.html). If those don’t help, you can always [ask the humans behind Greenkeeper](https://github.com/greenkeeperio/greenkeeper/issues/new). </details> --- Your [Greenkeeper](https://greenkeeper.io) bot 🌴
removed unnecessary length checks + improved indexing of charCodeCache