fix: Hide this error message when ctx timeout occurs in s3.getObject #6360
Conversation
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

```diff
+ ingester             0%
+ distributor          0%
+ querier              0%
+ querier/queryrange   0%
+ iter                 0%
- storage           -0.3%
+ chunkenc             0%
+ logql                0%
+ loki               0.7%
```
Good catch! I see what the fix does, but could you provide some information in the PR description on how the bug manifests in operations?
Probably also worth a changelog entry, right?
pkg/storage/batch.go (outdated)

```go
if ctx.Err() != nil {
	errChan <- nil
	return
}
```
Is `ctx.Err()` only `!= nil` if `err != nil`? Otherwise it would make sense to move it out of the `if err != nil {}` block.
Following your suggestion, the PR code is clearer now.
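For readers following the thread, here is a minimal sketch of the pattern being discussed, not the actual `pkg/storage/batch.go` code; the `fetchAndReport` helper, the `fetch` callback, and the `errChan` wiring are hypothetical. The point is that the `ctx.Err()` check sits outside any `err != nil` branch, so a cancelled or timed-out query context suppresses the fetch error regardless of what the fetch returned.

```go
package sketch

import "context"

// fetchAndReport is a hypothetical helper illustrating the pattern discussed
// above: the ctx.Err() check is independent of whether the fetch itself
// returned an error, so a cancelled/timed-out query context always wins and
// the noisy "failed to get s3 object: context canceled" error is swallowed.
func fetchAndReport(ctx context.Context, errChan chan<- error, fetch func(context.Context) error) {
	err := fetch(ctx)
	if ctx.Err() != nil {
		// The query context was cancelled or timed out; report "no error" so
		// the caller does not log the intermediate cancellation error.
		errChan <- nil
		return
	}
	// Any other failure is still surfaced to the caller as before.
	errChan <- err
}
```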
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

```diff
+ ingester             0%
+ distributor          0%
+ querier              0%
+ querier/queryrange   0%
+ iter                 0%
- storage           -0.2%
+ chunkenc             0%
+ logql                0%
+ loki                 0%
```
Seconding @chaudum: a brief description of the problem this solves would help me as well. My current understanding is that since we retry, we don't want to show intermittent timeouts, and we only care about the last error. Is that correct?
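To make that reading concrete, here is a hedged sketch of the described behaviour; the `retryFetch` name, attempt count, and `fetch` callback are hypothetical and this is not Loki's actual retry code. Attempts that fail while the overall query context is already cancelled or timed out are not logged, and only the last meaningful error is returned.

```go
package sketch

import (
	"context"
	"log"
)

// retryFetch is a hypothetical illustration of the behaviour described above:
// when the overall query context has been cancelled or has timed out, the
// per-attempt error is neither logged nor retried; otherwise only the last
// meaningful error is surfaced to the caller.
func retryFetch(ctx context.Context, attempts int, fetch func(context.Context) error) error {
	var lastErr error
	for i := 0; i < attempts; i++ {
		err := fetch(ctx)
		if err == nil {
			return nil
		}
		if ctx.Err() != nil {
			// Query already cancelled/timed out: suppress the intermittent
			// "context canceled" error instead of logging it.
			return ctx.Err()
		}
		// Only genuine fetch failures are logged per attempt.
		log.Printf("error fetching chunks (attempt %d): %v", i+1, err)
		lastErr = err
	}
	return lastErr
}
```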
Hi! This issue has been automatically marked as stale because it has not had any recent activity. We use a stalebot among other tools to help manage the state of issues in this project. Stalebots are also emotionless and cruel and can close issues which are still very relevant. If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry. We regularly sort for closed issues which have a thumbs-up.

We are doing our best to respond, organize, and prioritize all issues, but it can be a challenging task.
@liguozhong Can you rebase this fix and add some description to the PR?

Done.
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

```diff
+ ingester             0%
+ distributor          0%
+ querier              0%
+ querier/queryrange   0%
+ iter                 0%
- storage           -0.2%
+ chunkenc             0%
+ logql                0%
+ loki                 0%
```
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

```diff
+ ingester             0%
+ distributor          0%
+ querier              0%
+ querier/queryrange 0.1%
+ iter                 0%
- storage           -0.2%
+ chunkenc             0%
+ logql                0%
+ loki                 0%
```
LGTM
fix: Hide this error message when ctx timeout occurs in s3.getObject (grafana#6360)

**What this PR does / why we need it**:
```
level=error ts=2022-01-24T10:57:05.273404635Z caller=batch.go:699 msg="error fetching chunks" err="failed to get s3 object: context canceled"
level=error ts=2022-01-24T10:57:05.273434964Z caller=batch.go:699 msg="error fetching chunks" err="failed to get s3 object: context canceled"
level=warn ts=2022-01-24T10:57:05.273457211Z caller=grpc_logging.go:55 method=/logproto.Querier/Query duration=16.524312361s err="failed to get s3 object: RequestCanceled: request context canceled\ncaused by: context canceled" msg="gRPC\n"
level=error ts=2022-01-24T10:57:05.273486368Z caller=batch.go:699 msg="error fetching chunks" err="failed to get s3 object: context canceled"
level=error ts=2022-01-24T10:57:05.273786914Z caller=batch.go:699 msg="error fetching chunks" err="failed to get s3 object: RequestCanceled: request context canceled\ncaused by: context canceled"
level=error ts=2022-01-24T10:57:05.274246531Z caller=batch.go:699 msg="error fetching chunks" err="failed to get s3 object: RequestCanceled: request context canceled\ncaused by: context canceled"
level=warn ts=2022-01-24T10:57:05.274385302Z caller=grpc_logging.go:55 method=/logproto.Querier/Query duration=29.542213769s err="failed to get s3 object: RequestCanceled: request context canceled\ncaused by: context canceled" msg="gRPC\n"
```
When a LogQL query times out, do not print these logs. Since we retry, we don't want to show intermittent timeouts, and we only care about the last error.

**Which issue(s) this PR fixes**:
Fixes grafana#5221
What this PR does / why we need it:
When a LogQL query times out, do not print these "error fetching chunks" logs.
Which issue(s) this PR fixes:
Fixes #5221
Special notes for your reviewer:
Checklist
- CHANGELOG.md
- docs/sources/upgrading/_index.md