Increase domain compaction limits for hive and delta #18024

Merged
raunaqmorarka merged 3 commits into trinodb:master from the increase-domain branch on Jun 28, 2023

raunaqmorarka

Description

Increase the default domain compaction limits for the Hive and Delta Lake connectors.
This makes the default limit consistent with the Iceberg connector.

Additional context and related issues
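
For readers unfamiliar with the term, the following is a minimal sketch of what a domain compaction limit does. It is not code from this PR or from Trino; the names `ValueSet`, `Range`, and `compact` are hypothetical. It assumes the usual behavior: a predicate over discrete values is kept as-is while its size stays within the threshold, and is otherwise replaced by a single covering range, which is cheaper to handle but less selective.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Union


@dataclass(frozen=True)
class ValueSet:
    """Exact set of allowed values (hypothetical stand-in for a discrete domain)."""
    values: FrozenSet[int]


@dataclass(frozen=True)
class Range:
    """Inclusive range covering all allowed values (the compacted form)."""
    low: int
    high: int


def compact(domain_values: Iterable[int], threshold: int) -> Union[ValueSet, Range]:
    """Keep the exact values if there are at most `threshold` of them;
    otherwise fall back to one range spanning min..max."""
    values = frozenset(domain_values)
    if len(values) <= threshold:
        return ValueSet(values)
    return Range(min(values), max(values))


# With a limit of 1000, a 1000-value IN list stays exact,
# while a 2000-value list collapses to a single range.
exact = compact(range(1000), threshold=1000)
compacted = compact(range(2000), threshold=1000)
```

Under this sketch, a higher limit means more predicates keep their exact values, at the cost of carrying larger value sets through planning.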

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Jun 23, 2023
@github-actions github-actions bot added the delta-lake (Delta Lake connector), hive (Hive connector), and tests:hive labels Jun 23, 2023
@raunaqmorarka raunaqmorarka requested review from findepi and sopel39 June 23, 2023 04:54

sopel39 commented Jun 26, 2023

Please run benchmarks to check whether a limit of 1000 causes regressions (I think it does).


@sopel39 sopel39 left a comment

Please run benchmarks.

@raunaqmorarka

Attachment: Increased domain threshold sf1k unpartitioned parquet.pdf

I added a benchmark. It shows CPU increases in some places and decreases in others, and the differences are within the usual run-to-run variability. I checked the JFR profile for the query with the largest difference and found nothing related to tuple domains in it.
We already use a threshold of 1000 in iceberg, and the tuple domain/domain methods have been optimized enough that larger thresholds should not be an issue.

@raunaqmorarka raunaqmorarka merged commit b9c751e into trinodb:master Jun 28, 2023
@raunaqmorarka raunaqmorarka deleted the increase-domain branch June 28, 2023 13:34
@github-actions github-actions bot added this to the 421 milestone Jun 28, 2023
Labels: cla-signed, delta-lake (Delta Lake connector), docs, hive (Hive connector)
3 participants