Add support for merging compressed chunks #7669

erimatnor · 2025-02-07T15:19:26Z

Compressed chunks can be merged by applying the same copy+heap swap technique to the internal compressed chunks as applied to non-compressed chunks. However, it is necessary to consider cases when some chunks are compressed and some are not.

The way this is handled is to pick the first compressed chunk found among the input chunks, and then use that as the "result" chunk. In the next step all the chunks' non-compressed heaps are merged followed by all the "internal" compressed heaps. In the last step, the result chunk has its non-compressed and compressed heaps swapped with the merged ones, respectively.

In all other regards, the merging works the same as before when merging non-compressed chunks.
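
The three steps above can be modelled with a small Python sketch. This is only a toy illustration, not the code in tsl/src/chunk.c: the `Chunk` class, the `merge_chunks_sketch` function, and the use of plain lists as "heaps" are all hypothetical, and falling back to the first input chunk when nothing is compressed is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Chunk:
    """Toy chunk: a non-compressed heap plus an optional "internal" compressed heap."""
    name: str
    heap: List[str] = field(default_factory=list)
    compressed_heap: Optional[List[str]] = None  # None means "not compressed"


def merge_chunks_sketch(chunks: List[Chunk]) -> Chunk:
    """Model of the three steps in the description; not TimescaleDB code."""
    if not chunks:
        raise ValueError("need at least one chunk")
    compressed = [c for c in chunks if c.compressed_heap is not None]

    # Step 1: the first compressed chunk becomes the "result" chunk.
    # (Assumption: with no compressed input, use the first chunk.)
    result = compressed[0] if compressed else chunks[0]

    # Step 2: merge all non-compressed heaps, then all compressed heaps.
    merged_heap = [row for c in chunks for row in c.heap]
    merged_compressed = (
        [row for c in compressed for row in c.compressed_heap] if compressed else None
    )

    # Step 3: swap the merged heaps into the result chunk.
    result.heap, result.compressed_heap = merged_heap, merged_compressed
    return result


chunks = [
    Chunk("c1", heap=["r1", "r2"]),                    # not compressed
    Chunk("c2", heap=["r3"], compressed_heap=["b1"]),  # has both heaps
    Chunk("c3", compressed_heap=["b2", "b3"]),         # only compressed data
]
merged = merge_chunks_sketch(chunks)
print(merged.name, merged.heap, merged.compressed_heap)
```

In this model the mixed case needs no special handling beyond the choice of result chunk: a chunk without a compressed heap simply contributes nothing to the merged compressed heap.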

codecov · 2025-02-07T15:34:18Z

Codecov Report

Attention: Patch coverage is 91.34615% with 9 lines in your changes missing coverage. Please review.

Project coverage is 81.92%. Comparing base (59f50f2) to head (7830006).
Report is 810 commits behind head on main.

Files with missing lines	Patch %	Lines
tsl/src/chunk.c	91.34%	3 Missing and 6 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7669      +/-   ##
==========================================
+ Coverage   80.06%   81.92%   +1.85%     
==========================================
  Files         190      247      +57     
  Lines       37181    45522    +8341     
  Branches     9450    11384    +1934     
==========================================
+ Hits        29770    37294    +7524     
- Misses       2997     3754     +757     
- Partials     4414     4474      +60

mkindahl

A few minor comments, and I am missing a few tests (or I am not finding them):

* Merging hypercore chunks into heap chunks (both compressed)
* Testing all combinations of merging uncompressed, partially compressed, and fully compressed.
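
One way to enumerate the combinations asked for in the second item is sketched below. The three state names come from the comment itself; the `merge_test_cases` helper is hypothetical, and the actual regression tests behind tsl/test/expected/merge_chunks.out are not written this way. Ordered combinations are used on the assumption that input order matters, since the first compressed chunk is picked as the result chunk.

```python
from itertools import product
from typing import List, Tuple

# Chunk states named in the review comment.
STATES = ("uncompressed", "partially compressed", "fully compressed")


def merge_test_cases(n_chunks: int = 2) -> List[Tuple[str, ...]]:
    """Return every ordered combination of chunk states for n_chunks inputs."""
    return list(product(STATES, repeat=n_chunks))


for case in merge_test_cases():
    print(" + ".join(case))
```

For two input chunks this yields nine ordered cases; if order turns out not to matter, `itertools.combinations_with_replacement` would reduce that to six.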

tsl/src/chunk.c

tsl/test/expected/merge_chunks.out

Compressed chunks can be merged by applying the same copy+heap swap technique to the internal compressed heaps as done for the non-compressed heaps. However, special consideration is necessary for cases where some chunks are compressed and some are not. The way this is handled is to pick the first compressed chunk found among the input chunks, and then use that as the "result" chunk. In the next step all the chunks' non-compressed heaps are merged followed by all the "internal" compressed heaps. In the last step, the result chunk has its non-compressed and compressed heaps swapped with the merged ones, respectively. In all other regards, the merging works the same as before when merging non-compressed chunks.

@bjornuppeke

## 2.19.0 (2025-03-12)

This release contains performance improvements and bug fixes since the 2.18.2 release. We recommend that you upgrade at the next available opportunity.

**Features**
* [#7586](#7586) Vectorized aggregation with grouping by a single text column.
* [#7632](#7632) Optimize recompression for chunks without segmentby
* [#7655](#7655) Support vectorized aggregation on Hypercore TAM
* [#7669](#7669) Add support for merging compressed chunks
* [#7701](#7701) Implement a custom compression algorithm for bool columns. It is experimental and can undergo backwards-incompatible changes. For testing, enable it using timescaledb.enable_bool_compression = on.
* [#7707](#7707) Support ALTER COLUMN SET NOT NULL on compressed chunks
* [#7765](#7765) Allow tsdb as alias for timescaledb in WITH and SET clauses
* [#7786](#7786) Show warning for inefficient compress_chunk_time_interval configuration
* [#7788](#7788) Add callback to mem_guard for background workers
* [#7789](#7789) Do not recompress segmentwise when default order by is empty
* [#7790](#7790) Add configurable Incremental CAgg Refresh Policy

**Bugfixes**
* [#7665](#7665) Block merging of frozen chunks
* [#7673](#7673) Don't abort additional INSERTs when hitting first conflict
* [#7714](#7714) Fixes a wrong result when compressed NULL values were confused with default values. This happened in very special circumstances with alter table added a new column with a default value, an update and compression in a very particular order.
* [#7747](#7747) Block TAM rewrites with incompatible GUC setting
* [#7748](#7748) Crash in the segmentwise recompression
* [#7764](#7764) Fix compression settings handling in Hypercore TAM
* [#7768](#7768) Remove costing index scan of hypertable parent
* [#7799](#7799) Handle DEFAULT table access name in ALTER TABLE

**Thanks**
* @bjornuppeke for reporting a problem with INSERT INTO ... ON CONFLICT DO NOTHING on compressed chunks
* @kav23alex for reporting a segmentation fault on ALTER TABLE with DEFAULT

Signed-off-by: Philip Krauss <35487337+philkra@users.noreply.github.com>

@bjornuppeke

## 2.19.0 (2025-03-18)

This release contains performance improvements and bug fixes since the 2.18.2 release. We recommend that you upgrade at the next available opportunity.

* Improved concurrency of INSERT, UPDATE and DELETE operations on the columnstore by no longer blocking DML statements during the recompression of a chunk.
* Improved system performance during Continuous Aggregates refreshes by breaking them into smaller batches which reduces systems pressure and minimizes the risk of spilling to disk.
* Faster and more up-to-date results for queries against Continuous Aggregates by materializing the most recent data first (vs old data first in prior versions).
* Faster analytical queries with SIMD vectorization of aggregations over text columns and group by over multiple column
* Enable optimizing chunk size for faster query performance on the columnstore by adding support for merging columnstore chunks to the merge_chunk API.

**Deprecation warning**

This is the last minor release supporting PostgreSQL 14. Starting with the minor version of TimescaleDB only Postgres 15, 16 and 17 will be supported.

**Downgrading of 2.19.0**

This release introduces custom bool compression, if you enable this feature via the `enable_bool_compression` and must downgrade to a previous, please use the [following script](https://github.com/timescale/timescaledb-extras/blob/master/utils/2.19.0-downgrade_new_compression_algorithms.sql) to convert the columns back to their previous state. TimescaleDB versions prior to 2.19.0 do not know how to handle this new type.

**Features**
* [#7586](#7586) Vectorized aggregation with grouping by a single text column.
* [#7632](#7632) Optimize recompression for chunks without segmentby
* [#7655](#7655) Support vectorized aggregation on Hypercore TAM
* [#7669](#7669) Add support for merging compressed chunks
* [#7701](#7701) Implement a custom compression algorithm for bool columns. It is experimental and can undergo backwards-incompatible changes. For testing, enable it using timescaledb.enable_bool_compression = on.
* [#7707](#7707) Support ALTER COLUMN SET NOT NULL on compressed chunks
* [#7765](#7765) Allow tsdb as alias for timescaledb in WITH and SET clauses
* [#7786](#7786) Show warning for inefficient compress_chunk_time_interval configuration
* [#7788](#7788) Add callback to mem_guard for background workers
* [#7789](#7789) Do not recompress segmentwise when default order by is empty
* [#7790](#7790) Add configurable Incremental CAgg Refresh Policy

**Bugfixes**
* [#7665](#7665) Block merging of frozen chunks
* [#7673](#7673) Don't abort additional INSERTs when hitting first conflict
* [#7714](#7714) Fixes a wrong result when compressed NULL values were confused with default values. This happened in very special circumstances with alter table added a new column with a default value, an update and compression in a very particular order.
* [#7747](#7747) Block TAM rewrites with incompatible GUC setting
* [#7748](#7748) Crash in the segmentwise recompression
* [#7764](#7764) Fix compression settings handling in Hypercore TAM
* [#7768](#7768) Remove costing index scan of hypertable parent
* [#7799](#7799) Handle DEFAULT table access name in ALTER TABLE

**GUCs**
* `enable_bool_compression`: enable the BOOL compression algorithm, default: `OFF`
* `enable_exclusive_locking_recompression`: enable exclusive locking during recompression (legacy mode), default: `OFF`

**Thanks**
* @bjornuppeke for reporting a problem with INSERT INTO ... ON CONFLICT DO NOTHING on compressed chunks
* @kav23alex for reporting a segmentation fault on ALTER TABLE with DEFAULT

---------

Signed-off-by: Philip Krauss <35487337+philkra@users.noreply.github.com>
Signed-off-by: Ramon Guiu <ramon@timescale.com>
Co-authored-by: Ramon Guiu <ramon@timescale.com>

erimatnor added the split/merge label Feb 7, 2025

github-actions bot assigned erimatnor Feb 7, 2025

erimatnor force-pushed the merge-chunks-compressed branch from 87d3db9 to 1aa10b2 Compare February 7, 2025 15:23

erimatnor force-pushed the merge-chunks-compressed branch 2 times, most recently from 3b2bca9 to 2cfb52f Compare February 7, 2025 15:54

erimatnor force-pushed the merge-chunks-compressed branch from 2cfb52f to bc91a59 Compare February 19, 2025 09:34

erimatnor marked this pull request as ready for review February 19, 2025 13:05

erimatnor requested review from fabriziomello and mkindahl February 19, 2025 13:05

erimatnor force-pushed the merge-chunks-compressed branch from bc91a59 to 702d543 Compare February 21, 2025 08:08

erimatnor added this to the v2.19.0 milestone Mar 4, 2025

mkindahl reviewed Mar 6, 2025

tsl/src/chunk.c (outdated, resolved)

tsl/src/chunk.c (resolved)

tsl/test/expected/merge_chunks.out (resolved)

erimatnor force-pushed the merge-chunks-compressed branch 2 times, most recently from 3adb26a to 791638c Compare March 10, 2025 12:53

erimatnor requested a review from mkindahl March 10, 2025 12:53

erimatnor force-pushed the merge-chunks-compressed branch from 791638c to 084292e Compare March 10, 2025 13:13

erimatnor force-pushed the merge-chunks-compressed branch from 084292e to 7830006 Compare March 10, 2025 13:35

mkindahl approved these changes Mar 10, 2025

fabriziomello approved these changes Mar 10, 2025

erimatnor enabled auto-merge (rebase) March 10, 2025 15:10

erimatnor merged commit b4107b2 into timescale:main Mar 10, 2025
51 of 53 checks passed

This was referenced Mar 12, 2025

CHANGELOG for 2.19.0 #7824

Closed

CHANGELOG for 2.19.0 #7829

Merged

bayandin mentioned this pull request Mar 21, 2025

timescaledb 2.19.0 bayandin/homebrew-tap#255

Closed
