-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Potential performance regression for TPCH q18 #13188
Comments
This may be fixed by fixing #11628 |
take |
I will take a look on this issue too, how is your progress on this @devanbenz ? |
Looking at the following code path: datafusion/datafusion/physical-plan/src/coalesce/mod.rs Lines 246 to 263 in 9005585
I noticed when writing up a small example using this code path locally. When taking a look at the I've identified where the performance fault is--but I'm unsure what next steps to take to try and alleviate it. I was going to maybe modify the coalescer to try and "change the RecordBatch to stringview if the datatype is datafusion/datafusion/physical-plan/src/coalesce_batches.rs Lines 297 to 307 in 9005585
I don't have a meaningful plan yet just have been doing some exploratory work as of right now while benchmarking locally. |
#11628 might have some ideas. I think @XiangpengHao has also been thinking of some way to do this too Basically one thing you might be able to try is to switch from using But as you are hinting at this is a non trivial thing to do |
I finally filed a ticket upstream in arrow trying to explain what I have been thinking about: |
Other than apache/arrow-rs#6692. I think the downside is again we need to implement builder per type like |
Would this be a specialization of a filter of the aggregates like for |
Probably, but I think it has potential to reduce copied too #11628. It aggregates at the first place similar to coalescer's role and no gc required since we aggregate the value without necessary buffer copied. |
If someone else would like to take this please let me know. I may not be able to look at this again until the upcoming weekend. |
Feel free to work on it since this kind of task is usually not that trivial and everyone may come out a different solution (less chance of duplicated work) |
Describe the bug
While enabling
StringView
reading from Parquet in #13101 @Dandandan noticed a slight regression for TPCH 18 #13101 (comment)here is the query
To Reproduce
To reproduce
Make data
Run query:
When StringView is enabled it seems like it is slightly slower
Expected behavior
StringView should always be faster
Additional context
I took a brief look at the flamegraphs -- it seems like one difference could be
BatchCoalescer::push_batch
There is a special case for StringView here:
https://github.com/apache/datafusion/blob/6034be42808b43e3f48f6e58ec38cc35fa253abb/datafusion/physical-plan/src/coalesce/mod.rs#L117-L116
Here are the explain plans for the query before and after the change
Here are the flamegraphs for the query before/after the change
The text was updated successfully, but these errors were encountered: