-
Notifications
You must be signed in to change notification settings - Fork 6.8k
bugfix: modified table predicate filter #21362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance-wise, I'm concerned that it's more expensive to call toLowerCase
and includes
for each column instead of once for each row after concatenating the values together. I'm sure the difference is small for most tables, but the solution may not scale to tables with many columns
I will run a performance benchmark in an online benchmark comparison and post a link |
After testing using JSBench i have found the original algorithm to be 68% slower on average in the tests I made, on my machine. This is due to the overhead string concatenation as well as running toLowerCase and trim on a larger string being higher than the overhead of calling the function on each column(function call overhead in v8 is quite low when called in succession). You can confirm with the following jsbench snippet: https://jsbench.me/4bkjp85g44/1 |
Bumping because of inactivity. |
@daniel-brenot Sorry for the inactivity - the change looks okay to me and we can run it by internal tests to confirm that it works well. Can you rebase? |
I pulled the upstream into the the branch and it should be good now. Let me know if theres anything else i need to do. |
I'm still seeing a note that "This branch cannot be rebased due to conflicts". You may want to perform a rebase with the original commit |
Will try to land this in #30172 |
This issue has been automatically locked due to inactivity. Read more about our automatic conversation locking policy. This action has been performed automatically by a bot. |
Replaces the string concatenation and index search with the .some check and a contains check. This should not be less efficient since the data is only getting checked once per row instead of twice, and the row will stop being looped over if a match is found earlier. Also removes unlikely edge case of a user inputing an obscure unicode character.