Description
Is your feature request related to a problem? Please describe.
When recent blocks are uploaded directly to S3, queriers cannot query them immediately because of -querier.query-store-after, which is usually tied to -querier.query-ingesters-within for performance reasons.
This leads to incorrect query results that persist for the duration of -querier.query-store-after.
Describe the solution you'd like
A tenant that has just uploaded blocks to S3 should get a tenant-specific query-store-after whose cutoff is either the time of the upload or the default query-store-after cutoff, whichever is more recent. Maybe a new type of tenant override could do this, but I am not sure.
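To make the "whichever is more recent" rule concrete, here is a minimal sketch of how the effective value could be computed. It assumes my reading of the rule: the store cutoff is the later of `now - default` and the upload time, which is the same as taking the smaller of the default duration and the time elapsed since the upload. The function name and the `last_upload` parameter are hypothetical and do not exist in the codebase.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def effective_query_store_after(
    now: datetime,
    default: timedelta,
    last_upload: Optional[datetime] = None,
) -> timedelta:
    """Return the query-store-after duration to apply for one tenant.

    Hypothetical helper: `last_upload` is the time the tenant last
    uploaded blocks directly to object storage, or None if never.
    """
    if last_upload is None:
        # No direct upload recorded: keep the configured default.
        return default
    # Clamp to zero so a slightly future timestamp (clock skew)
    # does not yield a negative duration.
    since_upload = max(now - last_upload, timedelta(0))
    # Right after an upload this is close to 0, so the store is queried
    # for recent data. It grows back to the default over time.
    return min(default, since_upload)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    upload = now - timedelta(minutes=45)
    print(effective_query_store_after(now, timedelta(hours=12), upload))
```

With this rule the override expires on its own: once the upload is older than the default, the tenant is back on the default value without any cleanup step.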
In addition, we should probably also flush the cache for that specific tenant in memcached (or Redis). This could simply be an internal API, something like /cache/flush (like /ingester/flush, accepting tenants).
The cache can only be flushed after two things have happened:
- Compactors have updated the bucket index in S3, per -compactor.cleanup-interval (15m)
- Store-gateways have refreshed the bucket index, per -blocks-storage.bucket-store.sync-interval (15m)
So that is 30 minutes in total. Alternatively, we could ignore the cache for that tenant for the next 30 minutes. In practice it can take longer than 30 minutes, because the compactor needs some time to download all of the tenant's metas from S3. Maybe what is needed is an API path in the compactor to trigger a bucket index update for the tenant, plus a similar API path in store-gateways, queriers, and rulers to trigger a bucket index refresh.
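The timing above can be sketched as simple arithmetic. This is only an illustration of the worst case, assuming the two intervals run back to back and that the time the compactor spends downloading metas is known or estimated. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def earliest_safe_flush(
    upload_time: datetime,
    cleanup_interval: timedelta = timedelta(minutes=15),
    sync_interval: timedelta = timedelta(minutes=15),
    metas_download: timedelta = timedelta(0),
) -> datetime:
    """Worst-case time after which a tenant cache flush is safe.

    cleanup_interval mirrors -compactor.cleanup-interval and
    sync_interval mirrors -blocks-storage.bucket-store.sync-interval.
    metas_download is an assumed estimate of how long the compactor
    needs to fetch the tenant's metas before writing the bucket index.
    """
    # Step 1: the compactor updates the bucket index.
    index_updated = upload_time + cleanup_interval + metas_download
    # Step 2: store-gateways pick up the updated bucket index.
    return index_updated + sync_interval


if __name__ == "__main__":
    uploaded = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(earliest_safe_flush(uploaded))
```

An API path that triggers the bucket index update and refresh on demand would shrink both intervals toward zero, leaving only the metas download time.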
Describe alternatives you've considered
- The current workaround is to restart all caches and set -querier.query-store-after=0 for a few hours for all tenants. That has performance implications for all tenants.
Additional context
None