-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix performance of mappedColumnNames.contains #3023
Fix performance of mappedColumnNames.contains #3023
Conversation
Hello @kelunik , Please answer the following questions:
I understand that |
Hi @harawata, Yes, this is an actually experienced performance problem. I'm debugging select performance for a mapper locally in isolation and the Before 537ms for After 54ms for The table I test with has 99 columns. |
Thank you, the data is helpful. |
It is definitely better to change the type of If you are keen to update this PR, please do. |
Yes, that's definitely better, I just wasn't sure whether this is an accepted breaking change / whether this is part of the public API of the package. I'll update the PR. |
a739b90
to
a2d6ab0
Compare
Updated. 🚀 |
Thanks! |
If the number of columns is very large, this is quite a performance hit, due to O(n) vs. O(1) runtime on List vs. Set.
a2d6ab0
to
9f93220
Compare
Oops, order indeed matters there, updated. |
Merged. It should be included in the latest 3.5.15-SNAPSHOT. |
Thanks, @harawata! |
@harawata Are there any guidelines when a new release will be tagged? |
There is no clear guideline, but we try to release new version every 2-4 months. |
If the number of columns is very large, this is quite a performance hit, due to O(n) runtime.