Description
Opened on Mar 11, 2022
Is your feature request related to a problem? Please describe.
Too many small chunks in S3. This cannot be solved by further increasing the idle timeout, because that setting results in a huge memory increase.
With some queries needing to fetch 90,000 chunks (50-100 big chunks, 89,900+ smaller chunks), these smaller chunks can be the bottleneck for many queries. Quite often these smaller chunks exist because their source has infrequent bursts of activity. It would be far more ideal if <1,000 well-sized chunks (still enough to parallelize over multiple cores, and closer to the number of streams) were queried instead.
Describe the solution you'd like
A utility similar to the compactor (or built into it?) that is able to create new chunks by merging small chunks (i.e. <10 KB, which is 95%+ of our dataset) that had been flushed due to an idle period but later received matching data.
Fetching these chunks is particularly expensive, and most of the query time is spent downloading chunks. Merging might also improve compression ratios (if blocks are rebuilt).
Placing this in the compactor might be a good idea, since the index is already being updated at that time.
This compactor should get a setting like `sync_period` to bound the merge search. For most people this should be the same value as the indexer's `sync_period`. Chunk max size would still need to be honoured, of course: merging may produce several larger chunks, not just one chunk.
Something like:

```
if chunk size < min_threshold:
    for each chunk in index that also matches labels:
        merge into new chunk
        if new chunk size > max_merge_amount or in new sync_period:
            replace with new chunk
```
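The merge loop above could be sketched in Go roughly as follows. Note this is a toy illustration: `Chunk`, `mergeSmallChunks`, and the threshold parameters are hypothetical names, not Loki's actual types or config options.

```go
package main

import "fmt"

// Chunk is a minimal stand-in for a stored chunk (hypothetical, not Loki's type).
type Chunk struct {
	Labels string // canonical label set, used as the merge key
	Size   int    // bytes
}

// mergeSmallChunks merges chunks below minThreshold that share a label set,
// flushing a merged chunk whenever adding more would exceed maxMergeSize.
func mergeSmallChunks(chunks []Chunk, minThreshold, maxMergeSize int) []Chunk {
	var out []Chunk
	pending := map[string]Chunk{} // in-progress merge per label set

	for _, c := range chunks {
		if c.Size >= minThreshold {
			out = append(out, c) // big enough already; keep as-is
			continue
		}
		p := pending[c.Labels]
		if p.Size > 0 && p.Size+c.Size > maxMergeSize {
			out = append(out, p) // flush before exceeding the cap
			p = Chunk{}
		}
		p.Labels = c.Labels
		p.Size += c.Size
		pending[c.Labels] = p
	}
	for _, p := range pending { // flush whatever is still accumulating
		if p.Size > 0 {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	chunks := []Chunk{{"a", 5}, {"a", 5}, {"a", 5}, {"b", 200}}
	for _, c := range mergeSmallChunks(chunks, 10, 12) {
		fmt.Println(c.Labels, c.Size)
	}
}
```

A real implementation would of course merge chunk contents and rebuild blocks, not just sum sizes, and would restrict the search to the bounding `sync_period`.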
New chunks should be entirely new (new ID), and old chunks removed `index_cache_validity` after the index containing only the new chunks is updated (to prevent cached indexes from accessing the now non-existent chunks).
If the chunk compactor exits uncleanly (or hits a similar issue), unreferenced chunks may end up in the chunk store. AFAIK this is already possible regardless and is probably a separate matter.
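The replacement ordering described above (upload new, repoint index, wait out cached indexes, then delete old) can be sketched with a toy in-memory store; the `Store` shape and function names are assumptions for illustration, not Loki's API:

```go
package main

import (
	"fmt"
	"time"
)

// Store is a toy stand-in for the object store plus index.
type Store struct {
	Chunks map[string]bool // objects in S3
	Index  map[string]bool // chunk IDs the index references
}

// replaceChunks uploads the merged chunks, rewrites the index to reference
// only them, then waits out the index-cache validity window before deleting
// the old objects, so a cached index never dereferences a missing chunk.
func replaceChunks(s *Store, oldIDs, newIDs []string, cacheValidity time.Duration) {
	for _, id := range newIDs { // 1. upload merged chunks first
		s.Chunks[id] = true
	}
	for _, id := range oldIDs { // 2. drop old IDs from the index...
		delete(s.Index, id)
	}
	for _, id := range newIDs { //    ...and reference only the new ones
		s.Index[id] = true
	}
	time.Sleep(cacheValidity) // 3. let cached indexes expire
	for _, id := range oldIDs { // 4. only now remove the old objects
		delete(s.Chunks, id)
	}
}

func main() {
	s := &Store{
		Chunks: map[string]bool{"old1": true, "old2": true},
		Index:  map[string]bool{"old1": true, "old2": true},
	}
	replaceChunks(s, []string{"old1", "old2"}, []string{"new1"}, time.Millisecond)
	fmt.Println(len(s.Chunks), len(s.Index)) // only new1 remains in both
}
```

If the process dies between steps 1 and 4, the worst case is an unreferenced chunk in the store (the failure mode noted above), never an index entry pointing at a deleted chunk.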
Describe alternatives you've considered
Increasing `chunk_idle_period` (currently 6m) further. 10m was tested, but it resulted in too much memory being consumed.