skip blocks with out-of-order chunk during compaction #4469
Love it! Super high-quality code. Thanks for the contribution @huyan0
if i.OutOfOrderSeries > 0 {
errMsg = append(errMsg, fmt.Sprintf(
func (i HealthStats) OutOfOrderChunksErr() error {
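For readers following the diff context above, here is a minimal sketch of how a health-stats type could expose a dedicated out-of-order-chunks error next to a broader critical error. Only `HealthStats`, `OutOfOrderSeries`, `errMsg`, `OutOfOrderChunksErr` and `CriticalErr` are names taken from this thread; the other fields, the message wording and the method bodies are assumptions for illustration, not the code from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// HealthStats is a simplified, hypothetical stand-in for the type in the diff.
// Only OutOfOrderSeries appears in the snippet above; the other fields are assumed.
type HealthStats struct {
	TotalSeries      int64
	OutOfOrderSeries int64
	OutOfOrderChunks int64
	OutsideChunks    int64 // hypothetical second kind of critical problem
}

// OutOfOrderChunksErr returns a non-nil error only when out-of-order chunks were counted.
func (i HealthStats) OutOfOrderChunksErr() error {
	if i.OutOfOrderChunks > 0 {
		return fmt.Errorf("%d/%d series have out-of-order chunks (%d chunks)",
			i.OutOfOrderSeries, i.TotalSeries, i.OutOfOrderChunks)
	}
	return nil
}

// CriticalErr aggregates every critical finding into a single error, or returns nil.
func (i HealthStats) CriticalErr() error {
	var errMsg []string
	if err := i.OutOfOrderChunksErr(); err != nil {
		errMsg = append(errMsg, err.Error())
	}
	if i.OutsideChunks > 0 {
		errMsg = append(errMsg, fmt.Sprintf("%d chunks fall outside the block time range", i.OutsideChunks))
	}
	if len(errMsg) > 0 {
		return errors.New(strings.Join(errMsg, "; "))
	}
	return nil
}
```

Keeping the out-of-order check in its own method lets a caller treat that one condition differently from the rest of the critical findings.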
I am wondering: do you think we could skip compaction for every CriticalErr, instead of just OutOfOrderChunksErr?
I think it's okay on the compact side, given that skipping cannot generate more blocks than not compacting at all. I don't think there's much implication on the retrieval side either, but I would like to hear what @yeya24 thinks :)
@yeya24 do you have some context on why the project decided to halt compaction in the first place?
@bwplotka do you have some context on the discussion? :)
887afae to 0532597
I have updated the metric counter and will add it to the changelog if there are no further comments for now @yeya24
This PR LGTM, but it changes the default behavior and might be critical, so it would be good to hear from other maintainers.
cc @thanos-io/thanos-maintainers
The code looks solid and you referenced an issue. If I understand correctly, we now mark it as corrupted and skip it. That still leaves a block corrupted. Are there any follow-up steps that a user could do? Basically I would really love to have some documentation on:
This would IMO be the cherry on top :)
TODO: Can we check if those blocks are readable by Thanos Store/Cortex Store?
TODO: Review & check for consistency with other issues and how we handle them (:
@bwplotka regarding your two comments:
PR still pending review :) Any updates needed?
Btw, let's update the changelog, and then I think we are ready to go.
@huyan0 Sorry for the late review. I will merge the PR after you resolve the conflict.
Hello, I have rebased :)
LGTM
Looks like the E2E test is still failing, and the failure is caused by this PR.
There might be other panics caused by that deletion marker metric. Please check and fix them. Thanks for the patience.
Can you rebase on the latest main rather than including changes from another PR?
b4405a2 to 18efebe
2abf49b to b0fbe2d
LGTM
Looking... Sorry for the back and forth. I will try to address it.
Signed-off-by: Yang Hu <yhuz@amazon.com>
Head branch was pushed to by a user without write access
LGTM! Thanks for the contribution.
Signed-off-by: Yang Hu <yhuz@amazon.com>
Changes
Verification
compact_e2e_test.go
Addresses: #3442
cc: @yeya24 @alvinlin123 @roystchiang