compression: add an adaptive compressor #4937

Merged
merged 4 commits into from
Jun 24, 2025

RaduBerinde
Member

@RaduBerinde RaduBerinde commented Jun 23, 2025

Informs #4925

compression: return Setting from Compress

Flip-flopping on my earlier change :)

ewma: add a per-byte EWMA estimator

This will be used to estimate compression ratio of blocks based on the
compression ratios of recently sampled blocks.

compression: add an adaptive compressor

Add AdaptiveCompressor, which estimates the size reduction of using a
slower algorithm vs a faster one and chooses automatically (on a
per-block basis).

We will separately update the compression analyzer to run experiments
with the adaptive compressors.

compressionanalyzer: support adaptive compressors

Switch experiments from using compression.Setting to
block.CompressionProfile and add two adaptive profiles.

@RaduBerinde RaduBerinde requested a review from jbowens June 23, 2025 21:31
@RaduBerinde RaduBerinde requested a review from a team as a code owner June 23, 2025 21:31
@cockroach-teamcity
Member

This change is Reviewable

@RaduBerinde RaduBerinde force-pushed the adaptive-compressor branch from 02a5fe9 to e434ab4 Compare June 23, 2025 21:33
@RaduBerinde RaduBerinde force-pushed the adaptive-compressor branch from 1f75a45 to bedb5de Compare June 24, 2025 15:28
Contributor

@annrpom annrpom left a comment

:lgtm_strong: cool!

Reviewed 10 of 10 files at r1, 2 of 2 files at r2, 4 of 4 files at r3, 8 of 8 files at r4, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @jbowens)

@RaduBerinde
Member Author

TFTR! Some initial tests on TPCC data show it is a pretty decent trade-off between MinLZ and Zstd1, especially in decompression speed. This is for ~32KB data blocks, broken down by the "compressibility" of the block (as determined by MinLZ). The cutoff is set to 30%:

 Test CR           MinLZ1            Zstd1             Auto1

 <1.1      CR      1.02 ± 3%         1.41 ± 3%         1.28 ± 15%
           Comp    3633MBps ± 86%    364MBps ± 19%     533MBps ± 59%
           Decomp  19072MBps ± 84%   1055MBps ± 27%    2002MBps ± 71%

 1.1-1.5   CR      1.31 ± 8%         1.56 ± 9%         1.31 ± 8%
           Comp    1194MBps ± 24%    370MBps ± 15%     903MBps ± 77%
           Decomp  5765MBps ± 31%    980MBps ± 16%     5710MBps ± 32%

 1.5-2.5   CR      1.75 ± 12%        2.17 ± 15%        1.75 ± 13%
           Comp    1254MBps ± 42%    427MBps ± 27%     977MBps ± 80%
           Decomp  3664MBps ± 42%    1010MBps ± 28%    3672MBps ± 44%

 >2.5      CR      5.71 ± 82%        9.42 ± 66%        8.30 ± 70%
           Comp    1719MBps ± 46%    578MBps ± 26%     799MBps ± 57%
           Decomp  3509MBps ± 40%    1287MBps ± 34%    1979MBps ± 59%

@RaduBerinde RaduBerinde force-pushed the adaptive-compressor branch from bedb5de to bbb055b Compare June 24, 2025 16:20
@RaduBerinde RaduBerinde merged commit f4e9b0c into cockroachdb:master Jun 24, 2025
6 checks passed
@RaduBerinde RaduBerinde deleted the adaptive-compressor branch June 24, 2025 16:47