-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
consider range and offset in queries while looking for schema config for query sharding #7880
consider range and offset in queries while looking for schema config for query sharding #7880
Conversation
…for query sharding
6f196dc
to
8146a0e
Compare
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0.1%
+ iter 0%
+ storage 0%
+ chunkenc 0%
- logql -0.4%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch :)
LGTM. Feel like we should lock this behavior with some tests?
return 0, 0, err | ||
} | ||
|
||
if _, ok := expr.(syntax.SampleExpr); !ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this also changes the previous behavior of returning non-nil errors for LogSelectorExpr
but now it would return nil
error.
Do you think this would break anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is okay since previous usage was only in metric
query path and range
and offset
are only used in metric
queries.
if r, ok := e.(*syntax.LogRange); ok && r.Interval > max { | ||
max = r.Interval | ||
if r, ok := e.(*syntax.LogRange); ok { | ||
if r.Interval > maxRVDuration { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: whats the point of comparing it with maxRvDuration
and maxOffset
. They have only zero values right? (from the above declarations)
why not just maxRVDuration, maxOffset = r.Internval, r.Offset
? Given, there can be only one LogRange
expression in a given LogQL.? (can LogQL contains multiple Range expression? I doubt it though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can have multiple of them with binary operations like rate({job="foo"} [1m] offset 5m) / rate({job="bar"} [1m] offset 15m)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha! 👍
yeah, I am working on the tests. Didn't get to work on them due to having my hand in multiple things and would like to get this PR in this week's build. Will let you know when I am done with the tests. |
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0.4%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0.4%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
…for query sharding (#7880) **What this PR does / why we need it**: We disable query sharding when a query touches multiple schema configs since they might have different sharding factors and might handle sharding differently. This check was done purely on the start and end time of the query without considering `range` and `offset`, which would change the length of the actual data being queried. Besides being incorrect, it also causes us to fail queries when migrating between tsdb and non-tsdb stores since both handle query sharding differently. For e.g., if the prev schema is `boltdb-shipper` and starting today, `tsdb` index is being used, so if we do a query `sum(rate({foo="bar"}[24h])` even with start and end within `tsdb` range, we will fail the query with error `incompatible index shard` if dynamic sharding goes with shard factor other than `32`(default shard factor in ingester for inverted index). The reason being the `range` here is `24h` which causes us to process data from previous schema as well. This PR takes care of the issue by factoring in `range` and `offset` in the queries while looking for schema config for query sharding. **Checklist** - [x] `CHANGELOG.md` updated.
What this PR does / why we need it:
We disable query sharding when a query touches multiple schema configs since they might have different sharding factors and might handle sharding differently. This check was done purely on the start and end time of the query without considering
range
andoffset
, which would change the length of the actual data being queried.Besides being incorrect, it also causes us to fail queries when migrating between tsdb and non-tsdb stores since both handle query sharding differently. For e.g., if the prev schema is
boltdb-shipper
and starting today,tsdb
index is being used, so if we do a querysum(rate({foo="bar"}[24h])
even with start and end withintsdb
range, we will fail the query with errorincompatible index shard
if dynamic sharding goes with shard factor other than32
(default shard factor in ingester for inverted index). The reason being therange
here is24h
which causes us to process data from previous schema as well.This PR takes care of the issue by factoring in
range
andoffset
in the queries while looking for schema config for query sharding.Checklist
CHANGELOG.md
updated