[opt](parquet) change parquet init footer read size to 48KB #46904
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
run buildall
TPC-H: Total hot run time: 32614 ms
TPC-DS: Total hot run time: 194500 ms
ClickBench: Total hot run time: 32.11 s
TeamCity be ut coverage result:
+1
What problem does this PR solve?
Change the initial footer read size from 128KB to 48KB to slightly reduce the amount of data read per file.
This matches Presto/Trino: a 1GB Parquet file typically has a footer of 30~40KB.
A user case with 30 thousand Parquet files shows the footer parse time dropping from:
```
ParseFooterTime: avg 2s28ms, max 3s707ms, min 905.866ms
```
to
```
ParseFooterTime: avg 886.364ms, max 1s734ms, min 391.846ms
```
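For readers unfamiliar with why the initial read size matters, here is a minimal Python sketch of a speculative footer read. It is not the Doris implementation; `read_footer`, `make_fake_file`, and `INITIAL_FOOTER_READ_SIZE` are hypothetical names used only for illustration. It relies on the Parquet file layout, where the file ends with a 4-byte little-endian metadata length followed by the `PAR1` magic. The reader fetches the tail of the file in one request and only issues a second request when the footer turns out to be larger than the guess.

```python
import io
import struct

PARQUET_MAGIC = b"PAR1"
FOOTER_TRAILER_SIZE = 8  # 4-byte little-endian metadata length + 4-byte magic
INITIAL_FOOTER_READ_SIZE = 48 * 1024  # hypothetical constant; value taken from this PR


def read_footer(f, file_size, initial_read_size=INITIAL_FOOTER_READ_SIZE):
    """Return (serialized_metadata, number_of_reads) for a Parquet-like file."""
    if file_size < len(PARQUET_MAGIC) + FOOTER_TRAILER_SIZE:
        raise ValueError("file too small to be Parquet")

    # One speculative read of the file tail.
    tail_size = min(file_size, initial_read_size)
    f.seek(file_size - tail_size)
    tail = f.read(tail_size)
    reads = 1

    if tail[-4:] != PARQUET_MAGIC:
        raise ValueError("missing PAR1 magic at end of file")
    (metadata_len,) = struct.unpack("<I", tail[-8:-4])
    needed = metadata_len + FOOTER_TRAILER_SIZE
    if needed > file_size:
        raise ValueError("corrupt footer length")

    if needed > tail_size:
        # The footer is larger than the guess, so a second read is required.
        f.seek(file_size - needed)
        tail = f.read(needed)
        reads += 1

    # The metadata sits directly before the 8-byte trailer.
    metadata = tail[len(tail) - needed : len(tail) - FOOTER_TRAILER_SIZE]
    return metadata, reads


def make_fake_file(metadata_len, data_len=200 * 1024):
    """Build an in-memory file with Parquet framing (the metadata is dummy bytes)."""
    metadata = b"\x07" * metadata_len
    body = (
        PARQUET_MAGIC
        + b"\x00" * data_len
        + metadata
        + struct.pack("<I", metadata_len)
        + PARQUET_MAGIC
    )
    return io.BytesIO(body), len(body), metadata


# A 35KB footer fits in the 48KB guess, so one read is enough.
f, size, expected = make_fake_file(35 * 1024)
metadata, reads = read_footer(f, size)
```

Under these assumptions, a footer in the 30~40KB range is served by the single 48KB read, while a larger footer costs one extra round trip. The trade-off is between bytes fetched on every file and the occasional second read.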
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)