Enhance search performance #64
base: main
Conversation
mourner left a comment:
Overall this is a really awesome improvement! Thanks for a great contribution. 👍
index.js (outdated)
```js
const nodeMinX = this._boxes[pos + 0];
const nodeMinY = this._boxes[pos + 1];
const nodeMaxX = this._boxes[pos + 2];
const nodeMaxY = this._boxes[pos + 3];
```
Just curious: would early rejects with if / continue after each of the four reads here make any difference, or is it not worth the added verbosity?
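For illustration, here is a hedged sketch of the early-reject variant being asked about, written against a flat array of boxes rather than flatbush's actual tree internals (the function and variable names are invented for this example):

```javascript
// Hypothetical sketch, not flatbush's real code: the same four reads,
// but with an early `continue` after each comparison instead of one
// combined intersection check at the end.
function searchEarlyReject(boxes, minX, minY, maxX, maxY) {
    const results = [];
    for (let pos = 0; pos < boxes.length; pos += 4) {
        if (maxX < boxes[pos + 0]) continue; // query ends before box starts (x)
        if (maxY < boxes[pos + 1]) continue; // query ends before box starts (y)
        if (minX > boxes[pos + 2]) continue; // query starts after box ends (x)
        if (minY > boxes[pos + 3]) continue; // query starts after box ends (y)
        results.push(pos >> 2); // box survived all four rejects
    }
    return results;
}

// Tiny usage example; each box is [minX, minY, maxX, maxY].
const boxes = Float64Array.of(
    0, 0, 1, 1,  // box 0: intersects the query below
    5, 5, 6, 6,  // box 1: fully outside
    2, 0, 3, 1   // box 2: intersects the query below
);
console.log(searchEarlyReject(boxes, 0.5, 0.5, 2.5, 2.5)); // → [ 0, 2 ]
```

Whether this beats a single combined condition depends on how often each read rejects and on what the JIT emits for the branches, which is exactly what the benchmarking below tries to settle.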
Does not seem to make a difference when trying that.
Also wondering: is this increase for smaller queries a fluke, or is there real overhead?

I would say both are possible here. The current benchmark is not sophisticated enough to track this down; overall, the time for small queries seems to scatter significantly from run to run.

After some more benchmarking, there does appear to be overhead when the number of rectangles in the search area is low. Edit: After some more debugging, this also seems to be present when the relevant code is unreachable, which indicates that it is related to the JavaScript engine, or to JavaScript timers not being precise in the first place.
@muendlein it might be worth increasing the number of searches between timings for a possibly more reliable measurement. Does the last commit help? I'll likely land this eventually anyway, since it's an obvious net positive; I just wanted to explore whether we could minimize the hit on smaller queries.
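The measurement idea above could be sketched roughly like this (a hypothetical Node.js harness, not the repo's actual bench script; `runSearch` stands in for whatever query is being measured):

```javascript
// Hypothetical sketch: amortize timer noise by timing a whole batch of
// searches per sample, repeating several samples, and reporting the median.
function bench(runSearch, searchesPerSample = 1000, samples = 5) {
    const times = [];
    for (let s = 0; s < samples; s++) {
        const start = process.hrtime.bigint();
        for (let i = 0; i < searchesPerSample; i++) runSearch();
        const end = process.hrtime.bigint();
        times.push(Number(end - start) / 1e6); // batch time in milliseconds
    }
    times.sort((a, b) => a - b);
    return times[Math.floor(times.length / 2)]; // median batch time in ms
}

// Usage: measure some stand-in workload.
const ms = bench(() => Math.sqrt(Math.random()), 1000, 5);
console.log(`median batch time: ${ms.toFixed(3)} ms`);
```

Batching like this mainly helps when a single search is near or below timer resolution, which matches the tiny-query cases discussed here.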
@mourner The last commit brought only a minor improvement, which doesn't close the gap.
One idea is to set an empirical threshold on the query area compared to the data bounds: over it, we apply the "all in query bounds" logic; under it, we leave the existing logic. The overhead of calculating that area for a single query should be small.
Edit: At least the issue of unreachable functions can be solved with a separate function.

@mourner I already tried something similar, and it does not have an impact. All my testing points towards JIT compiler issues. To make it clearer, here are two unreachable examples.

Example 1: exactly the old logic, apart from the additional … Before: … This PR: … Unreachable example 1: …

Example 2: same as example 1, but with the majority of the logic removed inside the unreachable if statement. Now you can see that the performance of this example is the same as before. Right now I don't have a good explanation for this, except for certain compiler optimizations. Before: … This PR: … Unreachable example 2: …
@muendlein I've seen similar behavior before, and my guess is that it's because of V8 inlining: over a certain threshold of complexity or size, V8 stops inlining a function, which makes it slower for small payloads. Might be worth experimenting with cutting some of the logic out into a separate function, so that most of the hot path code stays small enough to inline.

@mourner Just tested this (see my edit above); at least it will fix the unreachable code paths. But for now I only see small improvements in the "real" scenario.
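The function-splitting idea — keep the hot intersection test tiny and outline the rarely taken containment branch into its own function so V8 can keep inlining the hot one — might look roughly like this. This is an illustrative sketch over flat leaf boxes, not the PR's actual code; all names are invented:

```javascript
// Cold path, outlined on purpose: everything in [start, end) matches,
// so copy the item indices without re-testing each box.
function collectRange(indices, results, start, end) {
    for (let i = start; i < end; i++) results.push(indices[i]);
}

// Hot path: the loop body stays deliberately small.
function queryLeaves(boxes, indices, minX, minY, maxX, maxY) {
    const results = [];
    for (let pos = 0; pos < boxes.length; pos += 4) {
        // Plain intersection reject.
        if (maxX < boxes[pos] || maxY < boxes[pos + 1] ||
            minX > boxes[pos + 2] || minY > boxes[pos + 3]) continue;
        // Containment check; the heavier handling lives in collectRange.
        if (minX <= boxes[pos] && minY <= boxes[pos + 1] &&
            maxX >= boxes[pos + 2] && maxY >= boxes[pos + 3]) {
            collectRange(indices, results, pos >> 2, (pos >> 2) + 1);
        } else {
            results.push(indices[pos >> 2]);
        }
    }
    return results;
}
```

In a real tree the outlined function would collect a whole subtree's leaves, which is where keeping it out of the hot function pays off.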
With the latest commit the gap has been reduced but is still there. |
@muendlein this one looks much better!
@mourner Inlining … What exactly is your idea for the heuristic approach of checking the relative query size? Especially for unbalanced distributions, I'm not sure this is even feasible.
@muendlein so, if I do fb78a2e and then put |
I assume you mean … This brings me to the next point: how to create an empirical, yet reliable, yet fast …
It's high compared to the case when we guess right; however, it's very small compared to not landing this PR. So we need to decide what's better: accept a guaranteed, notable performance drop for small queries (arguably the more prevalent case in real-world apps than big queries), or accept that we'll sometimes guess wrong and performance will be as before. I'd probably try the simplest metric possible, e.g. …
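A "simplest metric possible" along the lines being discussed could look something like this hypothetical sketch; the bounds object mirrors the min/max extents an index already knows, and the 1% cutoff is an arbitrary placeholder, not a tuned value:

```javascript
// Hypothetical heuristic: only enable the containment fast path when the
// query covers at least LARGE_QUERY_RATIO of the index's total bounds.
const LARGE_QUERY_RATIO = 0.01; // illustrative guess, would need tuning

function isLargeQuery(bounds, minX, minY, maxX, maxY) {
    const dataArea = (bounds.maxX - bounds.minX) * (bounds.maxY - bounds.minY);
    if (dataArea === 0) return false; // degenerate bounds: keep the old path
    // Clip the query to the data bounds before comparing areas.
    const queryArea =
        (Math.min(maxX, bounds.maxX) - Math.max(minX, bounds.minX)) *
        (Math.min(maxY, bounds.maxY) - Math.max(minY, bounds.minY));
    return queryArea / dataArea >= LARGE_QUERY_RATIO;
}

const bounds = {minX: 0, minY: 0, maxX: 100, maxY: 100};
console.log(isLargeQuery(bounds, 0, 0, 50, 50)); // true  (25% of the bounds)
console.log(isLargeQuery(bounds, 0, 0, 5, 5));   // false (0.25%)
```

As noted in the thread, an area ratio says nothing about how many items actually fall inside the query, so a heavily clustered dataset can still defeat it; that is the guess-wrong case the discussion weighs.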
@mourner I think you are asking the right question about what the better choice is here. Given the variety of datasets and query use cases, I don't think a one-size-fits-all solution exists. For example, your proposed simple metric can work great for well-distributed datasets but may yield undesired performance on strongly unbalanced ones.

Threshold -> 0: always run the optimized path (best performance for large queries) …

This ensures that users themselves can optimize search performance for their respective dataset.
👍 to exposing an option if the regression is unavoidable. "Mouse hover on a single data point in a scatterplot of 300k points" is something I would prefer not to regress.

@leeoniya Even with the regression, I'm pretty sure you won't observe any difference, as it is relative, not absolute. When it comes to hovering, there are much slower processes involved anyway.
I'm hesitant about introducing such an option, because I'd like to keep the library simple, minimal, and working perfectly out of the box, and this parameter is pretty confusing and difficult to explain. I think it's fine if there are some weird edge cases with heavily imbalanced datasets where the optimization doesn't kick in, as long as the library as a whole performs great most of the time. So I'd still try to explore the heuristic approach, if there are no other ideas on how to address the small-query regression.
@mourner I just tested the simplest heuristic approach, and it seems we are back to fighting the compiler. Basically, as soon as …
@muendlein All right, let this sit for a few more days; I'll try to play with it a bit... As a last resort, we could just add a duplicate method, e.g. …
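The duplicate-method fallback would keep the existing `search` untouched and expose an opt-in variant for big queries. The name `searchLarge` below is purely hypothetical (the suggested name in the comment above was elided), and the class is a toy stand-in for the real index:

```javascript
// Hypothetical API sketch: `search` keeps the current behavior, while an
// opt-in `searchLarge` would enable the containment optimization internally.
class TinyIndex {
    constructor(items) {
        this.items = items; // items: array of [minX, minY, maxX, maxY]
    }
    search(minX, minY, maxX, maxY) {
        // Existing per-item intersection test, left untouched.
        const out = [];
        for (let i = 0; i < this.items.length; i++) {
            const [x0, y0, x1, y1] = this.items[i];
            if (maxX >= x0 && maxY >= y0 && minX <= x1 && minY <= y1) out.push(i);
        }
        return out;
    }
    searchLarge(minX, minY, maxX, maxY) {
        // Would internally use the "all in query bounds" fast path; the
        // results are identical either way, only performance differs.
        return this.search(minX, minY, maxX, maxY);
    }
}
```

The trade-off is exactly the one mourner names earlier in the thread: two methods avoid any heuristic guessing, at the cost of a less minimal API surface.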
@mourner As some time has passed, I'm wondering whether you have already had time to play around?
Fixes issue: #60
before
1000 searches 100%: 34.013s
1000 searches 75%: 25.787s
1000 searches 50%: 16.775s
1000 searches 25%: 9.530s
1000 searches 10%: 3.225s
1000 searches 1%: 280.145ms
1000 searches 0.01%: 11.675ms
after
1000 searches 100%: 18.888s
1000 searches 75%: 13.866s
1000 searches 50%: 9.127s
1000 searches 25%: 5.771s
1000 searches 10%: 1.963s
1000 searches 1%: 208.407ms
1000 searches 0.01%: 15.793ms