Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query: introduce dynamic lookback interval #3277

Merged
merged 13 commits into from
Nov 6, 2020

Conversation

krya-kryak
Copy link
Contributor

@krya-kryak krya-kryak commented Oct 5, 2020

Closes #2608
This allows queries with large step to make use of downsampled data.

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

  • Added CLI option for query

Verification

This patch was tested with 0.15 branch (minor conflict solved manually), works as expected

Signed-off-by: Vladimir Kononov krya-kryak@users.noreply.github.com

Closes thanos-io#2608
This allows queries with large step to make use of downsampled data.

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! Thanks for this 👍

Overall to me, it makes sense, just some implementation suggestion etc. 🤗

NOTE: Let's reduce amount of engines we create - can we create just 3 engines (number of resolution we have) and just create promql.Engine wrapper that routes to correct one based on maxSourceResolution? 🤔

cmd/thanos/query.go Outdated Show resolved Hide resolved
totalEngines int
newEngine = func(ld time.Duration) *promql.Engine {
var r prometheus.Registerer = reg
if dynamicLookbackDelta {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow 😱 Makes sense, wonder if it's doable to contribute to Promtheus some tweak to allow us to avoid creating 3 engines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a TODO for now

return defaultEvaluationInterval.Milliseconds()
}
totalEngines int
newEngine = func(ld time.Duration) *promql.Engine {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused, do we need two functions? What about engineFunc? I guess you wanted to have large number of args in function, but it might be cleaner to embed engineFunc here, not embedding this function in engineFunc. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right, having two functions is confusing. Got some coffee, merged engineFunc and newEngine into single engineFactory. IMO things got a lot more unerstandable. I hope this isn't just coffee speaking.

for i, d := range deltas[1:] {
if ld < d {
// Convert delta from milliseconds to time.Duration (nanoseconds).
engines[i+1] = newEngine(time.Duration(d * 1_000_000))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks weird and should fail our lints: 1_000_000 - what about using timestamp.Time(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced with standard time.Duration(d) * time.Millisecond. Prometheus'es timestamp seems unhelpful here, as it's Duration we need, not Time

cmd/thanos/query.go Outdated Show resolved Hide resolved
krya-kryak and others added 2 commits October 8, 2020 19:21
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
@krya-kryak
Copy link
Contributor Author

NOTE: Let's reduce amount of engines we create - can we create just 3 engines (number of resolution we have) and just create promql.Engine wrapper that routes to correct one based on maxSourceResolution? 🤔

Well, this seems to be more or less similar to how it is done already - we create at most three engines, but choose correct one with an anonymous function call (note pkg/api/query change) instead of a wrapper. It seemed like a less intrusive way to implement, please correct me if I'm wrong.

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One important question that came to my mind before going full for this solution (:

cmd/thanos/query.go Show resolved Hide resolved
@bwplotka
Copy link
Member

Wait, which one is the current fix / better? (: #3267 or this one?

@krya-kryak
Copy link
Contributor Author

krya-kryak commented Oct 16, 2020

Wait, which one is the current fix / better? (: #3267 or this one?

I really prefer this one over #3267 for several reasons:

  1. It doesn't mess with metrics staleness as much
  2. It's less heavy on resources
  3. It doesn't introduce rounding errors to query results

I'll just close that one.

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Some ideas for simplifying the code still, but otherwise LGTM (:

cmd/thanos/query.go Outdated Show resolved Hide resolved
cmd/thanos/query.go Outdated Show resolved Hide resolved
cmd/thanos/query.go Outdated Show resolved Hide resolved
totalEngines++
return wr
}
rawEngine := newEngine(promql.EngineOpts{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this and init all engines in exactly the same way (no matter if raw or not raw) (:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might simplify the code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. Done.

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks 👍

Sorry for delayed review 🤗

ld = eo.LookbackDelta.Milliseconds()
)
wrapReg := func(engineNum int) prometheus.Registerer {
return extprom.WrapRegistererWith(map[string]string{"engine": strconv.Itoa(engineNum)}, eo.Reg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just inline this (:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker (:

@bwplotka bwplotka merged commit f6e5d59 into thanos-io:master Nov 6, 2020
@krya-kryak krya-kryak deleted the dynamic-lookback-delta branch November 10, 2020 10:55
Oghenebrume50 pushed a commit to Oghenebrume50/thanos that referenced this pull request Dec 7, 2020
* query: introduce dynamic lookback interval

Closes thanos-io#2608
This allows queries with large step to make use of downsampled data.

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

* Fix minor checks

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

* Append changelog

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

* Add missing copyright

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

* Use pre-defined downsampling resolution constatns

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

* Use dynamic lookback delta by default

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

* Rename defaultEngine to rawEngine

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

* Merge engineFunc and newEngine into single engineFactory

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

* Remove query.dynamic-lookback-delta from docs

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

* Rename test

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

* Review fixes

Signed-off-by: Vladimir Kononov <krya-kryak@users.noreply.github.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Oghenebrume50 <raphlbrume@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

query: Staleness problem
2 participants