-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compact: Replace group with resolution in compact metrics. #6049
Conversation
65f9dd2
to
dbbd535
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes make sense to me. Thanks. We just need to fix the test
Yea, I worked on the tests a bit, but ran out of time this week. Will see if I can figure out the rest soon. |
140e6ab
to
acf062d
Compare
acf062d
to
0fe838a
Compare
Ok, I updated this to fix the tests and also update the compact dashboards. |
0fe838a
to
e2a901e
Compare
Is the CircleCI failure related to this pr? It would be great if you could take a look |
Yes, I realized I need to refactor all the tests to use the new resolution label, not group. Will work on it soon. |
a24ce81
to
66a9003
Compare
Ok, much closer on the updates to the tests. |
@yeya24 Any idea if the |
66a9003
to
b3b4b0e
Compare
@yeya24 I think the e2e test is flaky, this seems unrelated:
|
b3b4b0e
to
5c8bb5e
Compare
Ping @yeya24, tests are passing now. |
b08a8bf
to
522fe2e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I will let other maintainers take another look before merging as it is a breaking change.
Yes, this would be great to get in before the next release. I have one cluster with 90,000 metrics on the compact that is nearly useless to query. Plus others with 10k+. |
Maybe @metalmatze or @GiedriusS ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, so generally the groups are useful, as they allow checking each group progress. However indeed for large numbers of streams (e.g. 1k+) this gets troubling. I think it's fair to fix the cardinality explosion problem ASAP and merge this, just I see users with smaller number of groups who liked previous approach. I think we can follow up with opt-in grouping later on if needed 👍🏽
The opt-in would require a bit more work on instrumentation side, but could be safer and easier for users. |
Compaction metrics have too high a cardinality, causing metric bloat on large installations. The group information is better suited to logs. * Replace with a `resolution` label to the compaction counters. Fixes: thanos-io#5841 Signed-off-by: SuperQ <superq@gmail.com>
522fe2e
to
7eeb1bc
Compare
IMO, metrics are the wrong place to expose group level progress information. A progress API / dashboard UI would be better. |
Can this be merged? |
Changes
Compaction metrics have too high a cardinality, causing metric bloat on
large installations. The group information is better suited to logs.
resolution
label to the compaction counters.Fixes: #5841
Verification