-
Notifications
You must be signed in to change notification settings - Fork 820
Add querier.ingester-query-max-attempts to retry on partial data. #6714
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add querier.ingester-query-max-attempts to retry on partial data. #6714
Conversation
Signed-off-by: Justin Jung <jungjust@amazon.com>
Signed-off-by: Justin Jung <jungjust@amazon.com>
Signed-off-by: Justin Jung <jungjust@amazon.com>
Signed-off-by: Justin Jung <jungjust@amazon.com>
85948b6
to
761741f
Compare
Signed-off-by: Justin Jung <jungjust@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LGTM! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should document clearly and provide some suggestions on what's the recommended set up for this retry value.
To me this flag is kind of overlapped with partial data and I am not sure how much the retry helps for most of the usecase.
We know it is unlikely to miss series so we return partial data with 4XX. The retry may succeed but given we only wait short period time and Ingester queries are usually within ms, I am unsure if it is worth it to retry more espeically if Ingesters are high load or ongoing deployment
Do you think the following description for the config is sufficient?
What I noticed with partial data is that the status code is 200 (such that the result is actually returned to the customer), but it throws a warning along with the results. So, when there is a transient network issue between a querier and ingesters, the customer could get partial data response instead of the query path retrying to get the full response (current retry logic from query frontend only reties 5xx responses, so partial data response does not get retried at all). |
Signed-off-by: Justin Jung <jungjust@amazon.com>
a6d086a
to
3f91a4f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. We can merge after fixing the changelog conflict
Signed-off-by: Justin Jung <jungjust@amazon.com>
What this PR does:
In #6526, new configs
query_partial_data
andrules_partial_data
were added which allows tenants to receive 2xx with a warning message when the data accuracy is relatively high in zone-aware setting.This PR adds retry logic in querier getting data from ingesters, retrying the requests if the response is partial data. The new configuration,
querier.ingester-query-max-attempts
, allows ingester queries to be retried. Default is set to 1.Which issue(s) this PR fixes:
n/a
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]