Compact: Offline deduplication #1014
Hi 👋 thanks for raising this. I renamed the title to make it clearer; let me know if that makes sense. There has always been an idea like this: allow the compactor to deduplicate blocks offline, reducing storage cost and improving query performance. I would be happy to do so; however, we need to ask ourselves whether the current deduplication algorithm is correct for everybody. We have seen some reports claiming that our "penalty-based" deduplication algorithm does not handle all edge cases, e.g. #981. That's fair, as our algorithm is very basic. The problem with combining this feature with an imperfect algorithm is that we would deduplicate data irreversibly (unless we back up those blocks somehow). Anyway, I think we would be happy to add such a feature to the compactor at some point, given we can close the gaps in the current algorithm (and add more tests/test cases).
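To make the discussion concrete, here is a heavily simplified sketch of the "penalty" idea mentioned above: stay on the current replica while its next sample arrives within roughly twice the last observed scrape interval, and fall over to the other replica when it does not. This is an illustrative toy, NOT the actual Thanos querier implementation; the `Sample` type, the initial 5s interval, and the switch conditions are all assumptions made for the example.

```go
package main

import "fmt"

// Sample is one timestamped value scraped by a replica.
type Sample struct {
	T int64 // timestamp in ms
	V float64
}

// dedup merges two replica streams with a simplified "penalty" rule:
// keep reading from the current replica while its next sample is within
// roughly 2x the last observed interval; otherwise switch replicas.
// Illustrative sketch only, not the real Thanos dedup iterator.
func dedup(a, b []Sample) []Sample {
	var out []Sample
	lastT := int64(-1)
	interval := int64(5000) // assumed initial scrape interval (ms)
	i, j := 0, 0
	useA := true
	for {
		// Skip samples already covered by what we emitted.
		for i < len(a) && a[i].T <= lastT {
			i++
		}
		for j < len(b) && b[j].T <= lastT {
			j++
		}
		if i >= len(a) && j >= len(b) {
			return out
		}
		var s Sample
		switch {
		// Current replica still delivers within the penalty window.
		case useA && i < len(a) && a[i].T <= lastT+2*interval:
			s, i = a[i], i+1
		case !useA && j < len(b) && b[j].T <= lastT+2*interval:
			s, j = b[j], j+1
		// Otherwise switch to whichever replica has the earliest sample.
		case i < len(a) && (j >= len(b) || a[i].T <= b[j].T):
			s, i, useA = a[i], i+1, true
		default:
			s, j, useA = b[j], j+1, false
		}
		if lastT >= 0 {
			interval = s.T - lastT
		}
		out = append(out, s)
		lastT = s.T
	}
}

func main() {
	// Replica A misses scrapes between t=5000 and t=20000; replica B fills the gap.
	a := []Sample{{0, 1}, {5000, 1}, {20000, 1}}
	b := []Sample{{2500, 1}, {7500, 1}, {12500, 1}, {17500, 1}}
	for _, s := range dedup(a, b) {
		fmt.Print(s.T, " ")
	}
	fmt.Println()
}
```

Even this toy shows the edge-case risk discussed above: the chosen interval heuristic decides when to switch replicas, and a wrong switch is harmless at query time (dedup can be turned off) but permanent once baked into a compacted block.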
Thanks for the reply @bwplotka. I'm glad to hear that you also want to support this feature in the future. For the offline dedup function, we have already started building it inside the Thanos compactor component. If needed, we are more than happy to contribute it back to the Thanos community. Please feel free to let me know your thoughts as well.
Of course! PRs would be welcome, especially if you have something proven in your prod (:
Are you on our Slack, BTW? (:
Which Slack channel are you pointing to here? I am only able to see the
I can see you in both the #thanos and #thanos-dev channels (:
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Remove "stale", that's an interesting feature
This issue/PR has been automatically marked as stale because it has not had recent activity. Please comment on status otherwise the issue will be closed in a week. Thank you for your contributions.
Stale bot, go away for now ;p
We are quite close!
@metalmatze is working on it and I am refactoring TSDB to allow us to do it
via TSDB code.
Can someone add the label for stalebot to ignore this issue?
👋 We are adding new improvements to runtime deduplication so we can use it safely: #2890. This time we are improving the penalty algorithm. There might be one tricky part around counter resets. Prometheus blocks carry no metric type information yet, so we are discussing how we can add that thanks to recent metadata changes to the block format (cc @pracucci @brian-brazil). If not, we will need to guess it from the data, which needs more testing. In the meantime we could potentially allow deduplication with backup, so you still back up blocks in some remote location (another cold-storage bucket) and we can revert things if needed. Without backup I would not be confident allowing offline dedup for our Thanos users right now, before we handle those missing bits (:
There have been some initial discussions about type metadata for remote write via the WAL; getting it into blocks is a completely different kettle of fish.
Also, the chunk iterator work, which will be needed for dedup during compaction, is almost done. Yeah, agreed: super unlikely for now.
Hello 👋 Looks like there was no activity on this issue for the last 30 days.
Still important.
Hi @bwplotka, I am interested in this issue; is there any way I can contribute to this?
Also very interested. Please update the status.
Hello 👋 Looks like there was no activity on this issue for the last two months. |
Still interested.
👋🏽 Recently I have been getting many questions about offline dedup via DM; please ask them here instead (: Some updates: we added more docs, hopefully clarifying the WHY and HOW: https://thanos.io/tip/components/compact.md/#vertical-compactions Answering one DM:
It's NOT OK, unfortunately. Vertical compaction is implemented for
If you use this against Prometheus replicas, it will most likely completely mess up your querying experience, as it just concatenates samples together: the scrape interval in the best case becomes 2x the original, and in the worst case is totally unstable. The missing part is bringing the deduplication algorithm we use online in the Query path to the compaction stage, so we can leverage it there. That algorithm also works in 99% of cases, which is fine for the query part, where you can simply switch deduplication off. If that 1% happens during offline dedup, you cannot revert it; that's the problem. We are exploring different deduplication algorithms that will make this much more reliable. We also need something, ideally, for query pushdown: https://docs.google.com/document/d/1ajMPwVJYnedvQ1uJNW2GDBY6Q91rKAw3kgA5fykIRrg/edit# so ... help wanted (: We can try to enable 99%
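For context, the vertical-compaction docs linked above describe enabling this behavior on the compactor via hidden/experimental flags. A hedged sketch of an invocation follows; the flag names are taken from those docs but should be verified against your Thanos version, and the paths and the `replica` label name are placeholder assumptions:

```shell
# Sketch, based on the vertical-compaction docs linked above; these are
# hidden/experimental flags, so verify them against your Thanos version.
thanos compact \
  --data-dir=/var/thanos/compact \
  --objstore.config-file=bucket.yml \
  --compact.enable-vertical-compaction \
  --deduplication.replica-label=replica
```

As the comment above warns, running this against Prometheus HA replicas performs a naive merge, not penalty-based dedup, and the result is irreversible once the original blocks are deleted.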
Thank you @bwplotka for answering that question and confirming my suspicions. I'm going to look at the doc and related work and see if this is something I can help out with :) We would greatly benefit from a feature like this to dedupe the duplicated data coming from each Prometheus in the HA pair.
This issue is still needed.
Please keep this open.
I took a look at the querier's deduplication and the compaction process and have some questions. :) For example, say we have "main" and "replica" instances of Prometheus. They could start asynchronously, and as a result it's possible to have the following blocks in S3: And of course, there could be many more such overlapping blocks. So the first question is: how should the planner and compactor work here? The only option I see is that the planner makes 2 groups like:
To take from block3 only the parts which complement the "main" blocks. But such behavior sounds a bit sophisticated; maybe there are better options/thoughts about this? :)
@2nick I am looking at the same thing. It is a little bit sophisticated, and I think the grouping approach in #1276 is similar to what you mentioned. IMO the grouping part of offline deduplication can reuse the existing TSDB planner. We don't need to split into 2 groups in this case; we can do it in 2 iterations:
WDYT? But I agree this is still not a great approach: it would cost a lot if the overlap is small, since we still need to iterate over the whole block to compact.
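The grouping step discussed above can be sketched as a simple interval-merging pass: sort blocks by start time and collect transitively overlapping time ranges into one group, which is then a candidate for vertical compaction. This is an illustrative toy under assumed types (`Block` with half-open `[MinT, MaxT)` ranges); the real Thanos planner also considers resolution, external labels, and size limits.

```go
package main

import (
	"fmt"
	"sort"
)

// Block is a minimal stand-in for a TSDB block's time range.
type Block struct {
	ID         string
	MinT, MaxT int64 // half-open range [MinT, MaxT), in ms
}

// groupOverlapping sorts blocks by MinT and collects transitively
// overlapping blocks into groups, roughly what a vertical-compaction
// planner needs before it can merge/deduplicate a group.
func groupOverlapping(blocks []Block) [][]Block {
	sort.Slice(blocks, func(i, j int) bool { return blocks[i].MinT < blocks[j].MinT })
	var groups [][]Block
	var end int64
	for _, b := range blocks {
		if len(groups) == 0 || b.MinT >= end {
			// No overlap with the running group: start a new one.
			groups = append(groups, []Block{b})
			end = b.MaxT
		} else {
			// Overlaps the running group: extend it.
			groups[len(groups)-1] = append(groups[len(groups)-1], b)
			if b.MaxT > end {
				end = b.MaxT
			}
		}
	}
	return groups
}

func main() {
	groups := groupOverlapping([]Block{
		{"main-1", 0, 100}, {"replica-1", 50, 150}, {"main-2", 200, 300},
	})
	for _, g := range groups {
		var ids []string
		for _, b := range g {
			ids = append(ids, b.ID)
		}
		fmt.Println(ids) // [main-1 replica-1] then [main-2]
	}
}
```

This also illustrates the cost concern above: a replica block that overlaps a "main" block by only a few minutes still drags the entire group through compaction.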
After some thinking, I've decided that it's a "good enough" approach, as it allows us to move forward. :) Really great that you are on it! Looking forward to offline dedup! :)
Currently Thanos provides a dedup function in the Query component; it can merge data on the fly from Prometheus HA pairs or remote object storage.
However, the Query dedup function is not query-efficient, because the replicas' blocks in TSDB/object storage are not actually merged, so the Query API has to load duplicated data on every request.
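As an illustration of the query-time approach described here, the querier is told which external label distinguishes HA replicas so it can merge series that differ only by that label. A sketch follows; the store addresses and the `replica` label name are placeholder assumptions, so check the flags against your Thanos version:

```shell
# Query-time deduplication sketch: the querier merges series that differ
# only by the given replica label. Store addresses are placeholders.
thanos query \
  --http-address=0.0.0.0:10902 \
  --store=sidecar-a:10901 \
  --store=sidecar-b:10901 \
  --query.replica-label=replica
```

This deduplicates only at read time; both replicas' blocks still exist in object storage, which is exactly the inefficiency this issue proposes to fix offline.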
With the current Prometheus TSDB design, it seems difficult to implement a dedup function that merges blocks in each TSDB node on the Thanos side.
However, it should be easy to implement for object storage, as it is one centralized store, and we already have a Compactor component running on it.
Offering a dedup function on the object storage side will definitely help reduce metrics query latency and object storage cost.
Do we have any plan to support such requirements in Thanos? If so, I would like to know your ideas about this feature.
Thanks.