compact: data corruption during downsapmle, test and fix. #6598

Merged
merged 10 commits into from
Sep 12, 2023

@xBazilio xBazilio commented Aug 9, 2023

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

xBazilio commented Aug 9, 2023

We have a problem with data corruption after downsampling.
The usual error is "invalid size" from AggrChunk.Get.
Sometimes we get "sum and count timestamps not aligned".

All errors come from downsampled blocks. It doesn't matter whether they are 300s or 3600s blocks, or whether they were produced before or after we turned on vertical deduplication.
The problem persists across multiple Thanos versions.

All I could find is that some samples contain NormalNaN values. If I remove those, the data is valid.

The code in the downsample package accounts for StaleNaN values, but it ignores NormalNaN.
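To make the distinction concrete, here is a minimal sketch of why a stale-only check lets a normal NaN through. The two bit patterns are copied from the Prometheus `model/value` package so that the example builds with the standard library alone; `isStaleNaN`, `fromBits` and `classify` are hypothetical helper names used only for illustration, not functions from the Thanos code base.

```go
package main

import (
	"fmt"
	"math"
)

// Bit patterns mirrored from github.com/prometheus/prometheus/model/value.
// They are redeclared here (an assumption for self-containment) instead of
// importing the package.
const (
	normalNaN uint64 = 0x7ff8000000000001
	staleNaN  uint64 = 0x7ff0000000000002
)

// fromBits is a hypothetical helper that turns a bit pattern into a float64.
func fromBits(b uint64) float64 {
	return math.Float64frombits(b)
}

// isStaleNaN mimics a stale-marker check: an exact comparison of the bits.
func isStaleNaN(v float64) bool {
	return math.Float64bits(v) == staleNaN
}

// classify reports whether v is the stale marker and whether it is any NaN.
func classify(v float64) (stale, anyNaN bool) {
	return isStaleNaN(v), math.IsNaN(v)
}

func main() {
	inputs := []struct {
		name string
		v    float64
	}{
		{"normal NaN", fromBits(normalNaN)},
		{"stale NaN", fromBits(staleNaN)},
		{"regular value", 1.5},
	}
	for _, in := range inputs {
		stale, anyNaN := classify(in.v)
		fmt.Printf("%-13s stale=%t anyNaN=%t\n", in.name, stale, anyNaN)
	}
}
```

A filter that only tests for the stale marker would keep the first input, which is the case described above.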

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>
@GiedriusS GiedriusS left a comment

Thank you for this thorough investigation 🙇 I wonder how these NaNs could appear in the first place. Perhaps it would be worth adding some downsampling tests on blocks that only contain NaNs, for example? Would it be a lot of work? 🤔

@xBazilio

I don't know how the NormalNaNs got there, but they were already in the compacted raw block. We have a lot of metrics from thousands of targets, rules, alerts, etc.

@GiedriusS what would you like to see in the test? Downsampling a raw series with only NormalNaN values? With my fix that would give us an empty series.

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>
@@ -282,6 +962,25 @@ func TestDownsample(t *testing.T) {

expected: realisticChkDataWithCounterResetRes5m,
},
{
name: "three chunks, the first one with NaN values only",
Contributor Author

@GiedriusS I added a test for sample with NaN values only as you requested.

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>
- batch := data[:j]
+ batch := make([]sample, 0, j)
+ for _, s := range data[:j] {
+     if value.IsStaleNaN(s.v) || math.Float64bits(s.v) == value.NormalNaN {
Contributor

I am wondering if we should filter those NaNs beforehand, so that we can calculate batchSize, windowTime, etc. better.

Contributor

https://github.com/thanos-io/thanos/blob/main/pkg/compact/downsample/downsample.go#L648 the ApplyCounterResetsSeriesIterator ignores NaN values, but https://github.com/thanos-io/thanos/blob/main/pkg/compact/downsample/downsample.go#L474 expandChunkIterator only ignores stale NaN.

I am wondering if we should just change this. We have so many places in the code that allow normal NaN and ignore only stale NaN. I am not sure what the correct behavior is here. Should we keep the normal NaN? @bwplotka

yeya24 previously approved these changes Aug 17, 2023

@yeya24 yeya24 left a comment

Overall this change LGTM. We need to fix the changelog. I'd also love to see this change in the new release. Thanks!

yeya24 commented Aug 17, 2023

I am wondering whether we will have a big gap if there are continuous NaNs in the samples. I am not sure whether we are able to handle that correctly.

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>
@xBazilio

Fixed CHANGELOG.

yeya24 commented Aug 18, 2023

I still need another approval to merge this.
I am not 100% sure we really want to skip normal NaN samples in downsampled chunks, but I understand they can cause issues while downsampling to 1h chunks.
Another option would be to keep NaN samples but skip them when calculating aggregated values: for example, increment count but skip sum, counter, etc.
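The alternative floated here could look roughly like the sketch below. It is not what the PR ended up doing; `aggregate`, `aggregateKeepingNaNs` and `exampleValues` are hypothetical names, and the real downsampling code tracks more aggregates (counter, min, max) than the two shown.

```go
package main

import (
	"fmt"
	"math"
)

// aggregate is a hypothetical, reduced stand-in for the per-window
// aggregates produced by downsampling.
type aggregate struct {
	count int
	sum   float64
}

// aggregateKeepingNaNs counts every sample but leaves NaN values out of sum,
// so a NaN cannot poison the sum while the sample is still accounted for.
func aggregateKeepingNaNs(values []float64) aggregate {
	var a aggregate
	for _, v := range values {
		a.count++
		if math.IsNaN(v) {
			continue // counted, but not added to the sum
		}
		a.sum += v
	}
	return a
}

// exampleValues returns a small window with one NaN in the middle.
func exampleValues() []float64 {
	return []float64{1, math.NaN(), 2}
}

func main() {
	a := aggregateKeepingNaNs(exampleValues())
	fmt.Printf("count=%d sum=%g\n", a.count, a.sum)
}
```

One consequence of this design is that an average derived as sum divided by count would be pulled down by the counted NaN samples, which may be part of why it was only raised as an option.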

@GiedriusS

Overall this change LGTM. We need to fix the changelog. I'd also love to see this change in the new release. Thanks!

I would say let's wait until the next release because 0.32.0 is already huge. Compactor is a sensitive component which requires extra attention in reviewing code.

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>
@@ -374,7 +374,7 @@ func downsampleRawLoop(data []sample, resolution int64, numChunks int) []chunks.

  batch := make([]sample, 0, j)
  for _, s := range data[:j] {
-     if value.IsStaleNaN(s.v) || math.Float64bits(s.v) == value.NormalNaN {
+     if value.IsStaleNaN(s.v) || math.Float64bits(s.v) == value.NormalNaN || math.IsNaN(s.v) {
Contributor

Do we still need this? I thought math.IsNaN includes both StaleNaN and NormalNaN

Contributor Author

Fixed, and added math.NaN() to the test.
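As the review comment points out, `math.IsNaN` is true for every NaN bit pattern, so a single check covers both the stale marker and normal NaNs. The sketch below shows the simplified filter under that assumption; the `sample` type, `dropNaNs` and `exampleInput` are hypothetical stand-ins, not the actual code merged in this PR.

```go
package main

import (
	"fmt"
	"math"
)

// sample is a stand-in for the downsample package's internal sample type;
// the field names are assumed for illustration.
type sample struct {
	t int64
	v float64
}

// dropNaNs copies data into a new batch, skipping every NaN value.
// math.IsNaN matches any NaN, including the stale marker and normal NaNs.
func dropNaNs(data []sample) []sample {
	batch := make([]sample, 0, len(data))
	for _, s := range data {
		if math.IsNaN(s.v) {
			continue
		}
		batch = append(batch, s)
	}
	return batch
}

// exampleInput mixes regular values with a stale marker and a plain NaN.
func exampleInput() []sample {
	stale := math.Float64frombits(0x7ff0000000000002) // Prometheus stale marker bits
	return []sample{
		{t: 0, v: 1},
		{t: 15000, v: stale},
		{t: 30000, v: math.NaN()},
		{t: 45000, v: 2},
	}
}

func main() {
	fmt.Println(len(dropNaNs(exampleInput())))

	// A batch made only of NaNs ends up empty, which matches the
	// "empty series" outcome mentioned earlier in the thread.
	onlyNaNs := []sample{{t: 0, v: math.NaN()}}
	fmt.Println(len(dropNaNs(onlyNaNs)))
}
```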

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>
@MichaHoffmann MichaHoffmann left a comment

lgtm

yeya24 commented Sep 12, 2023

Cool, thanks. I will merge this PR. Thanks for the fix @xBazilio

@yeya24 yeya24 merged commit 2ec9b0c into thanos-io:main Sep 12, 2023
13 of 14 checks passed
@verejoel

Just tested this and ran into the invalid size error again:

ts=2023-09-12T06:34:08.700310375Z caller=intrumentation.go:67 level=warn msg="changing probe status" status=not-ready reason="error executing compaction: first pass of downsampling failed: downsampling to 60 min: downsample block 01H9FB4YMTT0BNKVRVQKHNRXDF to window 3600000: downsample aggregate block, series: 1899: invalid size"
ts=2023-09-12T06:34:08.700344075Z caller=http.go:91 level=info service=http/server component=compact msg="internal server is shutting down" err="error executing compaction: first pass of downsampling failed: downsampling to 60 min: downsample block 01H9FB4YMTT0BNKVRVQKHNRXDF to window 3600000: downsample aggregate block, series: 1899: invalid size"
ts=2023-09-12T06:34:08.700436277Z caller=http.go:110 level=info service=http/server component=compact msg="internal server is shutdown gracefully" err="error executing compaction: first pass of downsampling failed: downsampling to 60 min: downsample block 01H9FB4YMTT0BNKVRVQKHNRXDF to window 3600000: downsample aggregate block, series: 1899: invalid size"
ts=2023-09-12T06:34:08.700511379Z caller=intrumentation.go:81 level=info msg="changing probe status" status=not-healthy reason="error executing compaction: first pass of downsampling failed: downsampling to 60 min: downsample block 01H9FB4YMTT0BNKVRVQKHNRXDF to window 3600000: downsample aggregate block, series: 1899: invalid size"
ts=2023-09-12T06:34:08.700655882Z caller=main.go:161 level=error err="downsampling to 60 min: downsample block 01H9FB4YMTT0BNKVRVQKHNRXDF to window 3600000: downsample aggregate block, series: 1899: invalid size\nfirst pass of downsampling failed\nmain.runCompact.func7\n\t/app/cmd/thanos/compact.go:445\nmain.runCompact.func8.1\n\t/app/cmd/thanos/compact.go:481\ngithub.com/thanos-io/thanos/pkg/runutil.Repeat\n\t/app/pkg/runutil/runutil.go:74\nmain.runCompact.func8\n\t/app/cmd/thanos/compact.go:480\ngithub.com/oklog/run.(*Group).Run.func1\n\t/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650\nerror executing compaction\nmain.runCompact.func8.1\n\t/app/cmd/thanos/compact.go:508\ngithub.com/thanos-io/thanos/pkg/runutil.Repeat\n\t/app/pkg/runutil/runutil.go:74\nmain.runCompact.func8\n\t/app/cmd/thanos/compact.go:480\ngithub.com/oklog/run.(*Group).Run.func1\n\t/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650\ncompact command failed\nmain.main\n\t/app/cmd/thanos/main.go:161\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650"

yeya24 commented Sep 12, 2023

@verejoel The error happened when downsampling a 5m block to a 1h block, right?
I guess that is because your 5m block is already corrupted. This PR doesn't fix that. You need to find your raw blocks and downsample them to a correct 5m block, then to a 1h block.

In this case, you can add a no-downsample marker to your 5m blocks to avoid such failures.

@verejoel

@yeya24 Correct, thanks for clarifying this.

coleenquadros pushed a commit to coleenquadros/thanos that referenced this pull request Sep 18, 2023
…6598)

* Samples to reproduce data corruption during downsapmle, tests and fix.

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* Samples to reproduce data corruption during downsapmle, tests and fix.

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* added test for chunk with NaN values only

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* CHANGELOG.md

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* added check for math.NaN

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* optimized NaN checking

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

---------

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>
@BouchaaraAdil

@verejoel The error happened when downsampling a 5m block to a 1h block, right? I guess that is because your 5m block is already corrupted. This PR doesn't fix that. You need to find your raw blocks and downsample them to a correct 5m block, then to a 1h block.

In this case, you can add a no-downsample marker to your 5m blocks to avoid such failures.

@yeya24, @verejoel can you please list the steps to fix that? I'm using Thanos 0.34.1 and am blocked at "sum and count timestamps not aligned".

openshift-merge-bot bot pushed a commit to stolostron/thanos that referenced this pull request May 13, 2024
* Cut release candidate `v0.32.0-rc.1` (#6630)

* store: fix missing flush when handling pushed down queries (#6612)

In the case that we have pushed down queries and internal labels that
are overriden by external labels we are not flushing the sorted response.

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut release candidate v0.32.0-rc.1

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Co-authored-by: Michael Hoffmann <mhoffm@posteo.de>

* queryfrontend: fix explanation with query_range (#6633)

* Cut final release for `v0.32.0` (#6634)

* queryfrontend: fix explanation with query_range (#6633)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Cut final release candidate for v0.32.0

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Co-authored-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Correct version

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Update shepherd doc and fix release link

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Update CHANGELOG.md (#6640)

The marked change requires that users set a security context so that mounted volumes (PVCs in particular) will be writable by the `thanos` user.

Signed-off-by: verejoel <j.verezhak@gmail.com>

* store: fix error handling in decodePostings (#6650)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* store: fix ignored error in postings (#6654)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: fix bufio pool handling (#6655)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Add `--disable-admin-operations` flag in Compactor UI and Bucket UI (#6646)

* adding flags

Signed-off-by: Harsh Pratap Singh <119954739+harsh-ps-2003@users.noreply.github.com>

* adding docs

Signed-off-by: Harsh Pratap Singh <119954739+harsh-ps-2003@users.noreply.github.com>

* fixing tools.md

Signed-off-by: Harsh Pratap Singh <119954739+harsh-ps-2003@users.noreply.github.com>

* fixing tools.md

Signed-off-by: Harsh Pratap Singh <119954739+harsh-ps-2003@users.noreply.github.com>

* adding changelog

Signed-off-by: Harsh Pratap Singh <119954739+harsh-ps-2003@users.noreply.github.com>

* fixing changelog

Signed-off-by: Harsh Pratap Singh <119954739+harsh-ps-2003@users.noreply.github.com>

---------

Signed-off-by: Harsh Pratap Singh <119954739+harsh-ps-2003@users.noreply.github.com>

* Fix mutable stringset memory usage (#6669)

This commit fixes the Insert function for the mutable stringset
to only insert unique labels instead of adding every label to the set.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Cut patch release `v0.32.1` (#6670)

* store: fix error handling in decodePostings (#6650)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* store: fix ignored error in postings (#6654)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: fix bufio pool handling (#6655)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Fix mutable stringset memory usage (#6669)

This commit fixes the Insert function for the mutable stringset
to only insert unique labels instead of adding every label to the set.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Cut patch release v0.32.1

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Co-authored-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Update thanos engine and Prometheus dependencies (#6664)

* Update thanos engine and Prometheus dependencies

This commit bumps thanos/promql-engine to latest main and resolves
breaking changes from the prometheus/prometheus dependency.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add changelog entry

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Avoid closing head more than once

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Remove call to t.TempDir()

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Cut patch release `v0.32.1` (#6670) (#6673)

* store: fix error handling in decodePostings (#6650)



* store: fix ignored error in postings (#6654)



* Store: fix bufio pool handling (#6655)



* Fix mutable stringset memory usage (#6669)

This commit fixes the Insert function for the mutable stringset
to only insert unique labels instead of adding every label to the set.



* Cut patch release v0.32.1



---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Co-authored-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>

* store: fix race when iterating blocks (#6675)

* build(deps): bump github.com/prometheus/alertmanager (#6671)

Bumps [github.com/prometheus/alertmanager](https://github.com/prometheus/alertmanager) from 0.25.0 to 0.25.1.
- [Release notes](https://github.com/prometheus/alertmanager/releases)
- [Changelog](https://github.com/prometheus/alertmanager/blob/v0.25.1/CHANGELOG.md)
- [Commits](https://github.com/prometheus/alertmanager/compare/v0.25.0...v0.25.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/alertmanager
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build with Go 1.21 (#6615)

* Build with Go 1.21

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Update tools

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* store: add acceptance tests for label methods to bucket store (#6668)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* store: Record stats even on ExpandPostings error (#6679)

* Store: fix forgotten field in store stats merge (#6681)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: fix postings reader short reads (#6684)

bufio.Reader can return less bytes than needed. Go documentation
suggests to use io.ReadFull

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut patch release `v0.32.2` (#6685)

* store: fix race when iterating blocks (#6675)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* store: Record stats even on ExpandPostings error (#6679)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Store: fix forgotten field in store stats merge (#6681)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: fix postings reader short reads (#6684)

bufio.Reader can return less bytes than needed. Go documentation
suggests to use io.ReadFull

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut patch release v0.32.2

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Michael Hoffmann <mhoffm@posteo.de>

* remove deprecated log.request.decision flag (#6686)

* remove deprecated log.request.decision flag

Signed-off-by: Coleen Iona Quadros <coleen.quadros27@gmail.com>

* add changelog

Signed-off-by: Coleen Iona Quadros <coleen.quadros27@gmail.com>

---------

Signed-off-by: Coleen Iona Quadros <coleen.quadros27@gmail.com>

* Ruler: Add update label names routine for stateful ruler (#6689)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Store: add some acceptance tests for label matching (#6691)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: fix regex matching with set that matches empty (#6692)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* docs: Update lightstep link (#6694)

* docs: Update lightstep link

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add to mdox config

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Store: add failing test for potential dedup issue (#6693)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Receive: Change write log level from warn to info (#6698)

This commit moves several log lines from `warn` to `info`. These are
non-recoverable/non-actionable situations, which mostly are captured by
metrics such as `prometheus_tsdb_out_of_order_samples_total`.

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>

* Store: fix block dedup (#6697)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Query: Add pop-up when Explain Checkbox is disabled (#6662)

* Added popup when hovering

Signed-off-by: Luis Marques <henriquecondado@gmail.com>

* Small temp fixes

Signed-off-by: Luis Marques <henriquecondado@gmail.com>

* Reverting temp changes

Signed-off-by: Luis Marques <henriquecondado@gmail.com>

* Fixed pop-up

Signed-off-by: Luis Marques <henriquecondado@gmail.com>

* Solved infinite loop caused by useState function

Signed-off-by: Luis Marques <henriquecondado@gmail.com>

* reverted htmlFor

Signed-off-by: Luis Marques <henriquecondado@gmail.com>

* Fixed the tests

Signed-off-by: Luis Marques <henriquecondado@gmail.com>

* Small fixes

Signed-off-by: Luis Marques <henriquecondado@gmail.com>

* Adding explanation to pop-up text

Signed-off-by: Luís Marques <48833236+lmarques03@users.noreply.github.com>

---------

Signed-off-by: Luis Marques <henriquecondado@gmail.com>
Signed-off-by: Luís Marques <48833236+lmarques03@users.noreply.github.com>

* Optimize postings fetching by checking postings and series size (#6465)

* optimize postings fetching by checking postings and series size

Signed-off-by: Ben Ye <benye@amazon.com>

* address some review comments

Signed-off-by: Ben Ye <benye@amazon.com>

* add acceptance test and fixed bug of skipping posting groups with add keys

Signed-off-by: Ben Ye <benye@amazon.com>

* add lazy postings param to block series clinet

Signed-off-by: Ben Ye <benye@amazon.com>

* switch to use block estimated max series size

Signed-off-by: Ben Ye <benye@amazon.com>

* added two more metrics

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>

* compact: data corruption during downsapmle, test and fix. (#6598)

* Samples to reproduce data corruption during downsapmle, tests and fix.

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* Samples to reproduce data corruption during downsapmle, tests and fix.

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* added test for chunk with NaN values only

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* CHANGELOG.md

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* added check for math.NaN

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* optimized NaN checking

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

---------

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* use single instance of typed error and use errors.Is() for comparison (#6719)

Signed-off-by: Jake Keeys <jake@keeys.org>

* Ruler: Add alert source template (#6308)

* Add alert source template in rule

Signed-off-by: Zhuoyuan Liu <zhuoyuan.liu@maersk.com>

* Validate template in start phase

Signed-off-by: Zhuoyuan Liu <zhuoyuan.liu@maersk.com>

* Move the start check to runrule

Signed-off-by: Zhuoyuan Liu <zhuoyuan.liu@maersk.com>

* move the flag to config.go

Signed-off-by: Zhuoyuan Liu <zhuoyuan.liu@maersk.com>

* Updates the docs

Signed-off-by: Zhuoyuan Liu <zhuoyuan.liu@maersk.com>

* Add test for validateTemplate

Signed-off-by: Zhuoyuan Liu <zhuoyuan.liu@maersk.com>

* Add new test case

Signed-off-by: Zhuoyuan Liu <zhuoyuan.liu@maersk.com>

* Remove unnecessary variable

Signed-off-by: Zhuoyuan Liu <zhuoyuan.liu@maersk.com>

* Add changelogs

Signed-off-by: Zhuoyuan Liu <zhuoyuan.liu@maersk.com>

* Update CHANGELOG.md

Signed-off-by: Matej Gera <38492574+matej-g@users.noreply.github.com>

---------

Signed-off-by: Zhuoyuan Liu <zhuoyuan.liu@maersk.com>
Signed-off-by: Matej Gera <38492574+matej-g@users.noreply.github.com>
Co-authored-by: Matej Gera <38492574+matej-g@users.noreply.github.com>

* Add Shipper bytes uploaded metric #6438 (#6544)

* [FEAT] Add uploaded bytes metric

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Add PR number to log

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FIX] Log msg

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Clean code

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FIX] Remove shadow code

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FIX] Go format

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Update objstore

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Update objstore package

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Update storage.md

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Update erroring bucket

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Update erroring bucket

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

---------

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* Update objstore library to latest main (#6722)

This commit updates the obstore library to the latest main version
which optimizes the Iter operation to only request object names.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Store: store responses should always be sorted (#6706)

* Store: always sort, just compare labelset in proxy heap

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: add escape hatch to skip store resorting

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: remove stringset

This is the wrong approach to detect if we need to resort. It cannot
detect if we might end up with an unsorted series set if we add
extLabels.

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Docs: drop paragraph about deduplication on inner labels

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Michael Hoffmann <michael.hoffmann@aiven.io>

* Updates busybox SHA (#6724)

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: fpetkovski <fpetkovski@users.noreply.github.com>

* Add BB as an Adopte (#6725)

Signed-off-by: Fernando Vargas <Fernando.Vargas@ibm.com>
Co-authored-by: C1323453 Fernando Vargas Teotonio De Oliveira <fernando@Fernandos-MacBook-Air-2.local>

* add get_all_duration and merge_duration to SG query hints (#6730)

Signed-off-by: Ben Ye <benye@amazon.com>

* Add absolute total download time metrics for series and chunks (#6726)

* add metrics for absolute latency of loading series and chunks per block

Signed-off-by: Ben Ye <benye@amazon.com>

* fix lint

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>

* fix bug when merging query stats for chunkFetchDurationSum

Signed-off-by: Ben Ye <benye@amazon.com>

* add tests for stats merge

Signed-off-by: Ben Ye <benye@amazon.com>

* Cut patch release `v0.32.3` (#6736)

* Update thanos engine and Prometheus dependencies (#6664)

* Update thanos engine and Prometheus dependencies

This commit bumps thanos/promql-engine to latest main and resolves
breaking changes from the prometheus/prometheus dependency.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add changelog entry

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Avoid closing head more than once

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Remove call to t.TempDir()

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* build(deps): bump github.com/prometheus/alertmanager (#6671)

Bumps [github.com/prometheus/alertmanager](https://github.com/prometheus/alertmanager) from 0.25.0 to 0.25.1.
- [Release notes](https://github.com/prometheus/alertmanager/releases)
- [Changelog](https://github.com/prometheus/alertmanager/blob/v0.25.1/CHANGELOG.md)
- [Commits](https://github.com/prometheus/alertmanager/compare/v0.25.0...v0.25.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/alertmanager
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* store: add acceptance tests for label methods to bucket store (#6668)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Ruler: Add update label names routine for stateful ruler (#6689)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Store: add some acceptance tests for label matching (#6691)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: fix regex matching with set that matches empty (#6692)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: add failing test for potential dedup issue (#6693)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: fix block dedup (#6697)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Add Shipper bytes uploaded metric #6438 (#6544)

* [FEAT] Add uploaded bytes metric

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Add PR number to log

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FIX] Log msg

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Clean code

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FIX] Remove shadow code

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FIX] Go format

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Update objstore

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Update objstore package

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Update storage.md

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Update erroring bucket

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* [FEAT] Update erroring bucket

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

---------

Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>

* Update objstore library to latest main (#6722)

This commit updates the obstore library to the latest main version
which optimizes the Iter operation to only request object names.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Store: store responses should always be sorted (#6706)

* Store: always sort, just compare labelset in proxy heap

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: add escape hatch to skip store resorting

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Store: remove stringset

This is the wrong approach to detect if we need to resort. It cannot
detect if we might end up with an unsorted series set if we add
extLabels.

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Docs: drop paragraph about deduplication on inner labels

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Michael Hoffmann <michael.hoffmann@aiven.io>

* Updates busybox SHA (#6724)

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: fpetkovski <fpetkovski@users.noreply.github.com>

* Cut patch release v0.32.3

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: rita.canavarro <rita.canavarro@farfetch.com>
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Rita Canavarro <98762287+ritaCanavarro@users.noreply.github.com>
Co-authored-by: Michael Hoffmann <michael.hoffmann@aiven.io>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: fpetkovski <fpetkovski@users.noreply.github.com>

* update objstore version to latest (#6746)

Signed-off-by: Ben Ye <benye@amazon.com>

* update go alpine image to 3.18 (#6750)

Signed-off-by: Coleen Iona Quadros <coleen.quadros27@gmail.com>

* StoreGateway: Add a metric to track block load duration (#6729)

* BinaryIndexReader: always lookup name symbol first (#6741)

* always lookup name symbol first

Signed-off-by: Ben Ye <benye@amazon.com>

* add tests to verify

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>

* Add latency histogram for fetching index cache (#6749)

* add latency histogram for fetching index cache

Signed-off-by: Ben Ye <benye@amazon.com>

* update changelog

Signed-off-by: Ben Ye <benye@amazon.com>

* use timer

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>

* Fix for mixin workflow actions rules check step failed cases (#6753)

* Fix for mixin check step - rules.yaml

Signed-off-by: l.preethvika <preethivika1999@gmail.com>

* Fixed the mixin rules with duplicate names
Modified the mixin rules and changelog

Signed-off-by: preethivika <preethivika1999@gmail.com>

* Update the promtool from v0.37.0 to v0.47.0

Signed-off-by: preethivika <preethivika1999@gmail.com>

* Update the promtool changelog

Signed-off-by: preethivika <preethivika1999@gmail.com>

* Updated the promtool changelog

Signed-off-by: preethivika <preethivika1999@gmail.com>

---------

Signed-off-by: l.preethvika <preethivika1999@gmail.com>
Signed-off-by: preethivika <preethivika1999@gmail.com>
Co-authored-by: l-preethvika <l.preethvika@samsung.com>

* Store: Don't hardcode series batch size (#6761)

* not hardcode series batch size

Signed-off-by: Ben Ye <benye@amazon.com>

* fix unit test

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>

* fix index fetch latency metric timer (#6758)

Signed-off-by: Ben Ye <benye@amazon.com>

* added tls config in downstream query (#6760)

* added tls config

Signed-off-by: bazooka3000 <dattatreya.git@gmail.com>

* docs

Signed-off-by: bazooka3000 <dattatreya.git@gmail.com>

* Update CHANGELOG.md

Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Dattatreya <146561544+bazooka3000@users.noreply.github.com>

* lint check

Signed-off-by: bazooka3000 <dattatreya.git@gmail.com>

---------

Signed-off-by: bazooka3000 <dattatreya.git@gmail.com>
Signed-off-by: Dattatreya <146561544+bazooka3000@users.noreply.github.com>
Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add improbable.io to mdox ignore (#6764)

* Add improbable.io to mdox ignore

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Run make docs

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Cut patch release `v0.32.4` (#6763)

* update objstore version to latest (#6746)

Signed-off-by: Ben Ye <benye@amazon.com>

* Cut patch release v0.32.4

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Co-authored-by: Ben Ye <benye@amazon.com>

* Target Ui: Fixed responsiveness of Search Bar (#6642)

* Target Ui: Fixed responsiveness of Search Bar

Signed-off-by: Vanshika <vanshikav928@gmail.com>

* Rebuild

Signed-off-by: Vanshika <vanshikav928@gmail.com>

* Rebuild

Signed-off-by: Vanshika <vanshikav928@gmail.com>

---------

Signed-off-by: Vanshika <vanshikav928@gmail.com>

* Enabled Navbar to automatically close on navigation (#6656)

* Enabled Navbar to automatically close on navigation

Signed-off-by: Vanshika <vanshikav928@gmail.com>

* Rebuild

Signed-off-by: Vanshika <vanshikav928@gmail.com>

---------

Signed-off-by: Vanshika <vanshikav928@gmail.com>

* Force Tracing : checkbox in query frontend to force a trace to be collected (#6770)

* force tracing

Signed-off-by: Vanshika <vanshikav928@gmail.com>

* force tracing

Signed-off-by: Vanshika <vanshikav928@gmail.com>

* Rebuild

Signed-off-by: Vanshika <vanshikav928@gmail.com>

* changes force Tracing

Signed-off-by: Vanshika <vanshikav928@gmail.com>

---------

Signed-off-by: Vanshika <vanshikav928@gmail.com>

* Store: Add tenant label to exported metrics (#6690)

* Store: Add tenant label to exported metrics

With this commit we add a tenant label to relevant metrics exported by
the store gateway.

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>

* Query: Don't hide tenant related cmd args

As we now have some value of these args, with store metrics being
enhanced with tenant information, we no longer hide these tenant flags.

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>

* Query: Make default-tenant flag match receive

Ensure that the commandline flag matches what we currently have on
receive.

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>

* Promclient: Use http.header type for headers

Instead of using `map[string]string` for adding additional headers to
requests in `req2xx`.

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>

* Store: Add warning about tenant label to changelog

Adds a clearer warning to the Changelog that the added tenant label
could potentially cause issues for custom dashboards.

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>

---------

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>

* StoreGateway: Partition index-header download (#6747)

* Partition index-header download

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Use int division instead of float

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Ignore errors in close()

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Fix e2e

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Use disk to buffer parts of index-header

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Fix lint

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Renaming variables

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Increase partition size

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Fix e2e failures

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Refactoring

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Fix e2e

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Fix lint

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Fix e2e

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Cosmetic changes

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Address review comments

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

---------

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Support filtered index cache (#6765)

* support filtered index cache

Signed-off-by: Ben Ye <benye@amazon.com>

* changelog

Signed-off-by: Ben Ye <benye@amazon.com>

* fix doc

Signed-off-by: Ben Ye <benye@amazon.com>

* fix unit test failure

Signed-off-by: Ben Ye <benye@amazon.com>

* add item type validation

Signed-off-by: Ben Ye <benye@amazon.com>

* lint

Signed-off-by: Ben Ye <benye@amazon.com>

* change enabled_items to []string type

Signed-off-by: Ben Ye <benye@amazon.com>

* generate docs

Signed-off-by: Ben Ye <benye@amazon.com>

* separate validation code

Signed-off-by: Ben Ye <benye@amazon.com>

* fix lint

Signed-off-by: Ben Ye <benye@amazon.com>

* update doc

Signed-off-by: Ben Ye <benye@amazon.com>

* fix interface

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>

* use rwmutex for value symbols cache (#6778)

Signed-off-by: Ben Ye <benye@amazon.com>

* *: bump prometheus and promql-engine (#6772)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Ben Ye <benye@amazon.com>

* fix nil pointer bug when closing reader (#6781)

Signed-off-by: Ben Ye <benye@amazon.com>

* Store Gateway: Allow skipping resorting (#6779)

* allow skipping resorting in thanos eager respSet

Signed-off-by: Ben Ye <benye@amazon.com>

* address comments

Signed-off-by: Ben Ye <benye@amazon.com>

* fix unit test

Signed-off-by: Ben Ye <benye@amazon.com>

* address review feedback

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>

* make index cache ttl configurable (#6773)

Signed-off-by: Ben Ye <benye@amazon.com>

* bump prometheus to latest main (#6783)

Signed-off-by: Ben Ye <benye@amazon.com>

* check context cancel in inmemory cache (#6788)

Signed-off-by: Ben Ye <benye@amazon.com>

* Query Analysis (#6515)

* Return Query Analysis in API

A param  is added to QueryAPI, if true then query analysis is
returned by the  method of the query having structure
 is returned in response.

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Added analyze checkbox in Thanos UI

An analyze checkbox is added to the thanos query api that requests operator telemetry, which includes CPU Time

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Return Query Analysis in API

A param  is added to QueryAPI, if true then query analysis is
returned by the  method of the query having structure
 is returned in response.

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Added analyze checkbox in Thanos UI

An analyze checkbox is added to the thanos query api that requests operator telemetry, which includes CPU Time

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Add query explain API

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Hooked queryTelemetry data into UI

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* /query_explain and /query_range_explain for explain-tree

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* update promql-engine

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Execution time shows 0s

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Show execution time of operators

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Removing QueryExplainParam from query api

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* bad request format in Explain

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Showing Explain and Analyze Output

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Added tooltip and different endpoints for table and graph queries

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Linters pass

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* disable Explain when engine is 'prometheus'

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* passing query params to explain endpoints

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* fixed react test case failing

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* fix ui tests

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* fix some e2e test fails

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* added customised tooltip in place of Tooltip component

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* removed Tooltip from Panel

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* Linters pass

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* 4 arguments in QueryInstant

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* resolving conflicts -2

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* resolving conflicts in Panel.tsx

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* adding checkbox

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

* fixing linters fail

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>

---------

Signed-off-by: nishchay-veer <nishchayveer19@gmail.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Nishchay Veer <99465982+nishchay-veer@users.noreply.github.com>
Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* react-app/ListTree: only show symbol when analyze enabled (#6789)

No need to show the symbol if analyze is disabled. It looks weird. Let's
not do that.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* test/e2e: fix same environment names (#6790)

Two of the same names are used in e2e environment names. Fix this name
clash.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Add dialer_timeout field to HTTP TransportConfig (#6786)

* set dialer timeout to 5s in NewRoundTripperFromConfig

Signed-off-by: Walther Lee <walther.lee@reddit.com>

* add dialer_timeout field to HTTP TransportConfig

Signed-off-by: Walther Lee <walther.lee@reddit.com>

---------

Signed-off-by: Walther Lee <walther.lee@reddit.com>
Co-authored-by: Walther Lee <walther.lee@reddit.com>

* api/blocks: fix race between get/set (#6791)

Running tests with -race shows that there is a race between
bapi.blocks() and bapi.SetLoaded/SetGlobal() because the latter is
called continuously and asynchronously in a different thread. blocks()
is called through the HTTP API. Since block info is immutable, it is
enough to add a lock here to fix this problem.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
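
The locking approach described in this commit message can be sketched as follows. This is a minimal illustration, not the actual Thanos code: `blockInfo`, `blocksView`, and the method names are hypothetical stand-ins. Because block info is treated as immutable, guarding only the slice swap and the slice read is enough.

```go
package main

import (
	"fmt"
	"sync"
)

// blockInfo is a hypothetical stand-in for immutable block metadata.
type blockInfo struct {
	ID string
}

// blocksView is a hypothetical holder, not the real Thanos API type.
type blocksView struct {
	mtx    sync.RWMutex
	loaded []blockInfo
}

// SetLoaded replaces the stored slice under the write lock.
func (b *blocksView) SetLoaded(blocks []blockInfo) {
	b.mtx.Lock()
	defer b.mtx.Unlock()
	b.loaded = blocks
}

// Blocks returns the current slice under the read lock. The elements are
// never mutated after being stored, so no deep copy is made here.
func (b *blocksView) Blocks() []blockInfo {
	b.mtx.RLock()
	defer b.mtx.RUnlock()
	return b.loaded
}

func main() {
	v := &blocksView{}
	var wg sync.WaitGroup
	wg.Add(2)
	// Writer: simulates the asynchronous, repeated SetLoaded calls.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			v.SetLoaded([]blockInfo{{ID: fmt.Sprintf("block-%d", i)}})
		}
	}()
	// Reader: simulates HTTP handlers reading the blocks.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = v.Blocks()
		}
	}()
	wg.Wait()
	fmt.Println(len(v.Blocks()))
}
```

Running this sketch with `go run -race` should report no race, since every access to `loaded` goes through the mutex.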

* Bucket reader: Initialize new query stats struct at each goroutine (#6787)

* initialize new query stats struct at each goroutine

Signed-off-by: Ben Ye <benye@amazon.com>

* remove comment

Signed-off-by: Ben Ye <benye@amazon.com>

* address feedback

Signed-off-by: Ben Ye <benye@amazon.com>

* fix lint

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>

* use larger histogram bucket for thanos_bucket_store_series_result_series metric (#6792)

Signed-off-by: Ben Ye <benye@amazon.com>

* api/query: create engines once (#6793)

Fix a race where GetPrometheusEngine or GetThanosEngine is called twice
at the same time from multiple HTTP requests. This fixes the race:

```
10:29:50 querier-query: ==================
10:29:50 querier-query: WARNING: DATA RACE
10:29:50 querier-query: Write at 0x00c0005fa0f8 by goroutine 285:
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryEngineFactory).GetPrometheusEngine()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:105 +0x1f9
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).parseEngineParam()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:325 +0x109
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:626 +0x605
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query-fm()
...
10:29:50 querier-query: Previous read at 0x00c0005fa0f8 by goroutine 287:
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryEngineFactory).GetPrometheusEngine()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:101 +0x13d
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).parseEngineParam()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:325 +0x109
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query()
10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:626 +0x605
10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query-fm()
...
```

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
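
One common way to guarantee single construction under concurrent callers is `sync.Once`. The sketch below assumes that approach purely for illustration; the commit message does not say which mechanism the fix uses, and `engine`, `engineFactory`, and `Get` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// engine is a hypothetical stand-in for a query engine.
type engine struct{ name string }

// engineFactory lazily builds one engine and hands out the same instance.
type engineFactory struct {
	once    sync.Once
	engine  *engine
	created int // counts constructor runs, for illustration only
}

// Get builds the engine on the first call and returns the same pointer
// afterwards. sync.Once makes concurrent first calls safe.
func (f *engineFactory) Get() *engine {
	f.once.Do(func() {
		f.created++
		f.engine = &engine{name: "prometheus"}
	})
	return f.engine
}

func main() {
	f := &engineFactory{}
	var wg sync.WaitGroup
	// Simulate several HTTP requests asking for the engine at once.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = f.Get()
		}()
	}
	wg.Wait()
	fmt.Println(f.created)
}
```

An alternative with the same effect is to construct the engines eagerly when the factory is created, which removes the need for any synchronization on the read path.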

* store/proxy: fix label values span (#6795)

Each tracing.StartSpan() writes a value into the given context so
there's a race if we keep reusing the same context. Fix this by starting
a new span in each goroutine. This also makes logical sense. Fixes the
following race:

```
15:21:13 querier-1: WARNING: DATA RACE
15:21:13 querier-1: Read at 0x00c0009c5050 by goroutine 328:
15:21:13 querier-1: context.(*valueCtx).Value()
15:21:13 querier-1: /usr/local/go/src/context/context.go:751 +0x76
15:21:13 querier-1: github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/tracing.newClientSpanFromContext()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/v2@v2.0.0-rc.2.0.20201207153454-9f6bf00c00a7/interceptors/tracing/client.go:87 +0x241
15:21:13 querier-1: github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/tracing.(*opentracingClientReportable).ClientReporter()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/v2@v2.0.0-rc.2.0.20201207153454-9f6bf00c00a7/interceptors/tracing/client.go:51 +0x195
15:21:13 querier-1: github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/tracing.UnaryClientInterceptor.UnaryClientInterceptor.func1()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/v2@v2.0.0-rc.2.0.20201207153454-9f6bf00c00a7/interceptors/client.go:19 +0x1a9
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.ChainUnaryClient.func4.1.1()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/v2@v2.0.0-rc.2.0.20201207153454-9f6bf00c00a7/chain.go:74 +0x10a
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.(*ClientMetrics).UnaryClientInterceptor.func3()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-prometheus@v1.2.0/client_metrics.go:112 +0x126
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.ChainUnaryClient.func4.1.1()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/v2@v2.0.0-rc.2.0.20201207153454-9f6bf00c00a7/chain.go:74 +0x10a
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.ChainUnaryClient.func4()
15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/v2@v2.0.0-rc.2.0.20201207153454-9f6bf00c00a7/chain.go:83 +0x17b
15:21:13 querier-1: google.golang.org/grpc.(*ClientConn).Invoke()
15:21:13 querier-1: /go/pkg/mod/google.golang.org/grpc@v1.45.0/call.go:35 +0x25d
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/store/storepb.(*storeClient).LabelValues()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:1034 +0xe5
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/query.(*endpointRef).LabelValues()
15:21:13 querier-1: <autogenerated>:1 +0xa1
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/store.(*ProxyStore).LabelValues.func1()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/proxy.go:586 +0x323
15:21:13 querier-1: golang.org/x/sync/errgroup.(*Group).Go.func1()
15:21:13 querier-1: /go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75 +0x76
15:21:13 querier-1: Previous write at 0x00c0009c5050 by goroutine 325:
15:21:13 querier-1: context.WithValue()
15:21:13 querier-1: /usr/local/go/src/context/context.go:718 +0xce
15:21:13 querier-1: github.com/opentracing/opentracing-go.ContextWithSpan()
15:21:13 querier-1: /go/pkg/mod/github.com/opentracing/opentracing-go@v1.2.0/gocontext.go:17 +0xec
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/tracing.StartSpan()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/tracing/tracing.go:73 +0x238
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/store.(*ProxyStore).LabelValues()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/proxy.go:567 +0xb25
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/query.(*querier).LabelValues()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/query/querier.go:422 +0x3f5
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).labelValues()
15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:1092 +0x17d1
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).labelValues-fm()
15:21:13 querier-1: <autogenerated>:1 +0x45
15:21:13 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).Register.GetInstr.func1.1()
```

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* compact: return metas copy from syncer (#6801)

Return copy of the map because the compactor runs garbage collector
concurrently that deletes entries from the original map. Fixes race:

```
10:55:35 compact-working-dedup: ==================
10:55:35 compact-working-dedup: WARNING: DATA RACE
10:55:35 compact-working-dedup: Write at 0x00c001822150 by goroutine 220:
10:55:35 compact-working-dedup: runtime.mapdelete()
10:55:35 compact-working-dedup: /usr/local/go/src/runtime/map.go:696 +0x0
10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/compact.(*Syncer).GarbageCollect()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/compact/compact.go:201 +0x324
10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/compact.(*BucketCompactor).Compact()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/compact/compact.go:1422 +0x60f
10:55:35 compact-working-dedup: main.runCompact.func7()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:426 +0xfa
10:55:35 compact-working-dedup: main.runCompact.func8.1()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:481 +0x69
10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/runutil.Repeat()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/runutil/runutil.go:74 +0xc3
10:55:35 compact-working-dedup: main.runCompact.func8()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:480 +0x224
10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func1()
10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38 +0x39
10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func2()
10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:39 +0x4f
10:55:35 compact-working-dedup: Previous read at 0x00c001822150 by goroutine 223:
10:55:35 compact-working-dedup: runtime.mapiternext()
10:55:35 compact-working-dedup: /usr/local/go/src/runtime/map.go:867 +0x0
10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/compact.(*DefaultGrouper).Groups()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/compact/compact.go:289 +0xfd
10:55:35 compact-working-dedup: main.runCompact.func16.1()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:626 +0x4ae
10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/runutil.Repeat()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/runutil/runutil.go:74 +0xc3
10:55:35 compact-working-dedup: main.runCompact.func16()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:591 +0x3f9
10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func1()
10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38 +0x39
10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func2()
10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:39 +0x4f
10:55:35 compact-working-dedup: Goroutine 220 (running) created at:
10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run()
10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:37 +0xad
10:55:35 compact-working-dedup: main.main()
10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/main.go:159 +0x2964
```

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
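
Returning a copy of the map, as this commit message describes, can be sketched like this. The types and method names (`meta`, `syncer`, `Metas`, `GarbageCollect`) are simplified, hypothetical stand-ins rather than the real Thanos definitions.

```go
package main

import (
	"fmt"
	"sync"
)

// meta is a hypothetical stand-in for block metadata.
type meta struct{ MinTime, MaxTime int64 }

// syncer is a simplified stand-in that owns the block map.
type syncer struct {
	mtx    sync.Mutex
	blocks map[string]*meta
}

// Metas returns a shallow copy, so callers can iterate over the result
// while GarbageCollect deletes from the original map.
func (s *syncer) Metas() map[string]*meta {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	out := make(map[string]*meta, len(s.blocks))
	for id, m := range s.blocks {
		out[id] = m
	}
	return out
}

// GarbageCollect removes a block from the original map under the lock.
func (s *syncer) GarbageCollect(id string) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	delete(s.blocks, id)
}

func main() {
	s := &syncer{blocks: map[string]*meta{
		"a": {MinTime: 0, MaxTime: 10},
		"b": {MinTime: 10, MaxTime: 20},
	}}
	snapshot := s.Metas()
	s.GarbageCollect("a")
	// The snapshot still has both entries; the original has one left.
	fmt.Println(len(snapshot), len(s.Metas()))
}
```

The copy is shallow: the `*meta` values are shared, which is only safe if they are not mutated after being stored.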

* build(deps): bump golang.org/x/net from 0.14.0 to 0.17.0 (#6805)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.14.0 to 0.17.0.
- [Commits](https://github.com/golang/net/compare/v0.14.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Updates busybox SHA (#6808)

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: fpetkovski <fpetkovski@users.noreply.github.com>

* fix head series limiter trigger (#6802)

Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* preallocate series map size (#6807)

Signed-off-by: Ben Ye <benye@amazon.com>

* Fix matchersToPostingGroups vals variable shadow bug (#6817)

* fix matchersToPostingGroups vals variable shadow bug

Signed-off-by: Ben Ye <benye@amazon.com>

* update changelog

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>

* Store: fix prometheus store label values for matches on external labels (#6816)

External Labels should also be tested for matches against the matchers.

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* optimize inmemory index cache WithLabelValues call (#6806)

Signed-off-by: Ben Ye <benye@amazon.com>

* add keepalive to EndpointGroupGRPCOpts (#6810)

Signed-off-by: Walther Lee <walthere.lee@gmail.com>

* Cut patch release `v0.32.5` (#6820) (#6822)

* Build with Go 1.21 (#6615)

* Build with Go 1.21



* Update tools



---------



* update go alpine image to 3.18 (#6750)



* build(deps): bump golang.org/x/net from 0.14.0 to 0.17.0 (#6805)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.14.0 to 0.17.0.
- [Commits](https://github.com/golang/net/compare/v0.14.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
...




* Updates busybox SHA (#6808)




* Fix matchersToPostingGroups vals variable shadow bug (#6817)

* fix matchersToPostingGroups vals variable shadow bug



* update changelog



---------



* fix head series limiter trigger (#6802)



* Store: fix prometheus store label values for matches on external labels (#6816)

External Labels should also be tested for matches against the matchers.



* Cut patch release v0.32.5



* Revert "Fix matchersToPostingGroups vals variable shadow bug (#6817)"

This reverts commit 4ed9bb0317122e9dc31c2548581972c27d4e2e33.



---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Coleen Iona Quadros <coleen.quadros27@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Ben Ye <benye@amazon.com>
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Coleen Iona Quadros <coleen.quadros27@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: fpetkovski <fpetkovski@users.noreply.github.com>
Co-authored-by: Ben Ye <benye@amazon.com>
Co-authored-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
Co-authored-by: Michael Hoffmann <mhoffm@posteo.de>

* go.mod: update promql-engine (#6823)

Bring https://github.com/thanos-io/promql-engine/pull/320 into Thanos.
Fixes https://github.com/thanos-io/promql-engine/issues/312.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* receive/handler: fix label names/values race (#6825)

* receive/handler: fix label names/values race

There is a label name/value race in the current loop because
`labelpb.ReAllocZLabelsStrings(&t.Labels, r.opts.Intern)` might be
called which overwrites the original labels. At the same time, we might
also be forwarding the same request through gRPC to other Receive nodes.

Fixes the following race:

<details>
<summary>Trace of the race</summary>

10:53:51 receive-1: WARNING: DATA RACE
10:53:51 receive-1: Read at 0x00c001097b90 by goroutine 361:
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/labelpb.(*ZLabel).Size()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/labelpb/label.go:273 +0x35
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/storepb/prompb.(*TimeSeries).MarshalToSizedBuffer()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/prompb/types.pb.go:1499 +0x7c4
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/storepb.(*WriteRequest).MarshalToSizedBuffer()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:1318 +0x409
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/storepb.(*WriteRequest).Marshal()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:1286 +0x64
10:53:51 receive-1: google.golang.org/protobuf/internal/impl.legacyMarshal()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/legacy_message.go:402 +0xb1
10:53:51 receive-1: google.golang.org/protobuf/proto.MarshalOptions.marshal()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/protobuf@v1.31.0/proto/encode.go:166 +0x3a2
10:53:51 receive-1: google.golang.org/protobuf/proto.MarshalOptions.MarshalAppend()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/protobuf@v1.31.0/proto/encode.go:125 +0x96
10:53:51 receive-1: github.com/golang/protobuf/proto.marshalAppend()
10:53:51 receive-1: /go/pkg/mod/github.com/golang/protobuf@v1.5.3/proto/wire.go:40 +0xce
10:53:51 receive-1: github.com/golang/protobuf/proto.Marshal()
10:53:51 receive-1: /go/pkg/mod/github.com/golang/protobuf@v1.5.3/proto/wire.go:23 +0x65
10:53:51 receive-1: google.golang.org/grpc/encoding/proto.codec.Marshal()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/grpc@v1.45.0/encoding/proto/proto.go:45 +0x66
10:53:51 receive-1: google.golang.org/grpc/encoding/proto.(*codec).Marshal()
10:53:51 receive-1: <autogenerated>:1 +0x53
10:53:51 receive-1: google.golang.org/grpc.encode()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/grpc@v1.45.0/rpc_util.go:594 +0x64
10:53:51 receive-1: google.golang.org/grpc.prepareMsg()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/grpc@v1.45.0/stream.go:1610 +0x1a8
10:53:51 receive-1: google.golang.org/grpc.(*clientStream).SendMsg()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/grpc@v1.45.0/stream.go:791 +0x284
10:53:51 receive-1: google.golang.org/grpc.invoke()
10:53:51 receive-1: /go/pkg/mod/google.golang.org/grpc@v1.45.0/call.go:70 +0xf2

...
10:53:51 receive-1: Previous write at 0x00c001097b90 by goroutine 357:
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/labelpb.ReAllocZLabelsStrings()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/labelpb/label.go:69 +0x25e
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Writer).Write()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/writer.go:144 +0x13e4
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward.func2.1()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:672 +0x153
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/tracing.DoInSpan()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/tracing/tracing.go:95 +0x125
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward.func2()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:671 +0x1fd
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward.func6()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:682 +0x61
10:53:51 receive-1: Goroutine 361 (running) created at:
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:688 +0x9c7
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).forward()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:612 +0x53a
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).handleRequest()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:417 +0xca8
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).receiveHTTP()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:539 +0x1d89
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).receiveHTTP-fm()
10:53:51 receive-1: <autogenerated>:1 +0x51
10:53:51 receive-1: net/http.HandlerFunc.ServeHTTP()
10:53:51 receive-1: /usr/local/go/src/net/http/server.go:2136 +0x47
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.NewHandler.RequestID.func2()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/server/http/middleware/request_id.go:40 +0x191
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).testReady-fm.(*Handler).testReady.func1()
10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:263 +0x249
10:53:51 receive-1: net/http.HandlerFunc.ServeHTTP()
10:53:51 receive-1: /usr/local/go/src/net/http/server.go:2136 +0x47
10:53:51 receive-1: github.com/thanos-io/thanos/pkg/extprom/http.httpInstrumentationHandler.func1()

</details>

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* receive/handler: remove break

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

---------

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* fix devcontainer image (#6828)

Signed-off-by: Ben Ye <benye@amazon.com>

* Block: Expose fetcher and syncer metrics to be provided by depending projects (#6827)

* Expose fetcher and syncer metrics to be provided by depending projects.

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Updated CHANGELOG

Signed-off-by: Alex Le <leqiyue@amazon.com>

* Remove CHANGELOG change

Signed-off-by: Alex Le <leqiyue@amazon.com>

---------

Signed-off-by: Alex Le <leqiyue@amazon.com>

* receive: fix limits reloading race (#6826)

The limits configuration is reloaded periodically while other goroutines
read it at the same time, so access to it needs a lock. Make that struct
member private and add a getter that returns the limiter under a mutex
lock.

Fixes:

```
17:14:45 receive-i3: WARNING: DATA RACE
17:14:45 receive-i3: Read at 0x00c00090aec0 by goroutine 131:
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.(*headSeriesLimit).QueryMetaMonitoring()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/head_series_limiter.go:109 +0x2fb
17:14:45 receive-i3: main.runReceive.func9.1()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/cmd/thanos/receive.go:402 +0x9b
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/runutil.Repeat()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/runutil/runutil.go:74 +0xc3
17:14:45 receive-i3: Previous write at 0x00c00090aec0 by goroutine 138:
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.NewHeadSeriesLimit()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/head_series_limiter.go:41 +0x316
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.(*Limiter).loadConfig()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/limiter.go:168 +0xd0d
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.(*Limiter).StartConfigReloader.func1()
17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/limiter.go:111 +0x207
17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/extkingpin.(*pollingEngine).start.func1()
```
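
The locking approach described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the Thanos repository: the type `headSeriesLimiter`, the getter `HeadSeriesLimiter` and the method `reload` are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// headSeriesLimiter is a hypothetical stand-in for the value that is
// rebuilt on every configuration reload.
type headSeriesLimiter struct {
	limit int
}

// Limiter keeps the reloadable member private so that every access
// has to go through the mutex.
type Limiter struct {
	mtx               sync.RWMutex
	headSeriesLimiter *headSeriesLimiter
}

// HeadSeriesLimiter returns the current limiter under a read lock.
func (l *Limiter) HeadSeriesLimiter() *headSeriesLimiter {
	l.mtx.RLock()
	defer l.mtx.RUnlock()
	return l.headSeriesLimiter
}

// reload swaps in a freshly built limiter under the write lock.
// The old value is never mutated, only replaced.
func (l *Limiter) reload(limit int) {
	l.mtx.Lock()
	defer l.mtx.Unlock()
	l.headSeriesLimiter = &headSeriesLimiter{limit: limit}
}

func main() {
	l := &Limiter{}
	l.reload(0)

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // stands in for the periodic config reloader
		defer wg.Done()
		for i := 1; i <= 1000; i++ {
			l.reload(i)
		}
	}()
	go func() { // stands in for the periodic reader
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = l.HeadSeriesLimiter().limit
		}
	}()
	wg.Wait()
	fmt.Println(l.HeadSeriesLimiter().limit)
}
```

Running a sketch like this with `go run -race` should not report the read/write pair shown in the trace, because both the pointer swap and the pointer read happen while holding the same mutex.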

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* query: fix hints race (#6831)

Fix the following race:

```
12:36:39 querier-1: ==================
12:36:39 querier-1: WARNING: DATA RACE
12:36:39 querier-1: Read at 0x00c000159540 by goroutine 341:
12:36:39 querier-1: reflect.Value.String()
12:36:39 querier-1: /usr/local/go/src/reflect/value.go:2589 +0xd76
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeAny()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/text.go:563 +0xd86
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeStruct()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/text.go:325 +0x19db
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeAny()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/text.go:606 +0xb2a
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeStruct()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/text.go:453 +0xdd6
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeAny()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/text.go:606 +0xb2a
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeStruct()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/text.go:453 +0xdd6
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).Marshal()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/text.go:896 +0x5c8
12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).Text()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/text.go:908 +0x92
12:36:39 querier-1: github.com/gogo/protobuf/proto.CompactTextString()
12:36:39 querier-1: /go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/text.go:930 +0x8e
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/store/storepb.(*SeriesRequest).String()
12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:316 +0x7b
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/store.(*ProxyStore).Series()
12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/proxy.go:277 +0x8f
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/query.(*querier).selectFn()

12:36:39 querier-1: Previous write at 0x00c000159540 by goroutine 339:
12:36:39 querier-1: golang.org/x/exp/slices.insertionSortOrdered[go.shape.string]()
12:36:39 querier-1: /go/pkg/mod/golang.org/x/exp@v0.0.0-20230801115018-d63ba01acd4b/slices/zsortordered.go:15 +0x357
12:36:39 querier-1: golang.org/x/exp/slices.pdqsortOrdered[go.shape.string]()
12:36:39 querier-1: /go/pkg/mod/golang.org/x/exp@v0.0.0-20230801115018-d63ba01acd4b/slices/zsortordered.go:75 +0x72f
12:36:39 querier-1: golang.org/x/exp/slices.Sort[go.shape.[]string,go.shape.string]()
12:36:39 querier-1: /go/pkg/mod/golang.org/x/exp@v0.0.0-20230801115018-d63ba01acd4b/slices/sort.go:19 +0x45a
12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*evaluator).eval()
12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/prometheus@v0.47.2-0.20231009162353-f6d9c84fde6b/promql/engine.go:1352 +0x432
12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*evaluator).Eval()
12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/prometheus@v0.47.2-0.20231009162353-f6d9c84fde6b/promql/engine.go:1052 +0x105
12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*Engine).execEvalStmt()
12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/prometheus@v0.47.2-0.20231009162353-f6d9c84fde6b/promql/engine.go:708 +0xb15
12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*Engine).exec()
12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/prometheus@v0.47.2-0.20231009162353-f6d9c84fde6b/promql/engine.go:646 +0x4c8
12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*query).Exec()
12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/prometheus@v0.47.2-0.20231009162353-f6d9c84fde6b/promql/engine.go:235 +0x232
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query()
12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:681 +0xdfd
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query-fm()
12:36:39 querier-1: <autogenerated>:1 +0x45
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).Register.GetInstr.func1.1()
12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/api/api.go:212 +0x62
12:36:39 querier-1: net/http.HandlerFunc.ServeHTTP()
12:36:39 querier-1: /usr/local/go/src/net/http/server.go:2136 +0x47
12:36:39 querier-1: github.com/thanos-io/thanos/pkg/logging.(*HTTPServerMiddleware).HTTPMiddleware.func1()
```

The problem is that the PromQL engine sorts a slice in place while the
same hints slice can still be in use by other Select() calls, where
String() is called and reads those hints.
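
One common way to avoid this kind of race is to give each consumer its own copy of the slice before anything sorts it. The sketch below illustrates that idea only; it is not necessarily the change made in this commit, and `selectHints` and `cloneHints` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// selectHints is a hypothetical, trimmed-down stand-in for the hints
// passed to Select(); only the grouping labels matter here.
type selectHints struct {
	Grouping []string
}

// cloneHints returns a copy whose Grouping slice has its own backing
// array, so an in-place sort of the original cannot affect readers
// of the copy.
func cloneHints(h *selectHints) *selectHints {
	if h == nil {
		return nil
	}
	c := *h
	c.Grouping = append([]string(nil), h.Grouping...)
	return &c
}

func main() {
	shared := &selectHints{Grouping: []string{"pod", "instance", "job"}}
	private := cloneHints(shared)

	// Stands in for the in-place sort done by the engine.
	sort.Strings(shared.Grouping)

	fmt.Println(shared.Grouping)  // [instance job pod]
	fmt.Println(private.Grouping) // [pod instance job]
}
```

The copy costs one small allocation per call, which is usually acceptable for short label lists.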

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Adding Grupo Olx as user (#6832)

* Adding Grupo Olx as user

Signed-off-by: Nelson Almeida <nelsonmarcos@gmail.com>

* Adding Grupo OLX logo

Signed-off-by: Nelson Almeida <nelsonmarcos@gmail.com>

---------

Signed-off-by: Nelson Almeida <nelsonmarcos@gmail.com>

* Query: Add tenant label to exported metrics (#6794)

* Receive: Add default tenant to HTTP metrics

Previously, if the tenant header was empty or not supplied, the exported
metrics would have an empty string as the tenant. With this commit we
instead use the default tenant, which can be configured with
`--receive.default-tenant-id`.
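
The fallback described above can be sketched as follows. This is an illustration under assumptions, not the handler code from this commit: `tenantOrDefault` is a hypothetical helper, and the header name and default tenant value used here are assumed example values.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

const (
	tenantHeader  = "THANOS-TENANT"  // assumed header name for the example
	defaultTenant = "default-tenant" // assumed value of --receive.default-tenant-id
)

// tenantOrDefault returns the tenant to use as a metric label:
// the supplied tenant if it is non-empty, otherwise the fallback.
func tenantOrDefault(tenant, fallback string) string {
	if tenant != "" {
		return tenant
	}
	return fallback
}

func main() {
	withHeader := httptest.NewRequest(http.MethodPost, "/api/v1/receive", nil)
	withHeader.Header.Set(tenantHeader, "team-a")

	withoutHeader := httptest.NewRequest(http.MethodPost, "/api/v1/receive", nil)

	fmt.Println(tenantOrDefault(withHeader.Header.Get(tenantHeader), defaultTenant))    // team-a
	fmt.Println(tenantOrDefault(withoutHeader.Header.Get(tenantHeader), defaultTenant)) // default-tenant
}
```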

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>

* Query: Add tenant label to exported metrics

With this commit we now add the tenant label to relevant metrics
exported by the query component.

This includes the HTTP metrics handled by the InstrumentationMiddleware
and the query latency metrics.

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>

---------

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>

* Nit: allocate slice capacity correctly during intersection (#6819)

* Fix: Removes Deprecated ioutil (#6834)

* Fix: Removes Deprecated ioutil

In Go, io/ioutil has recently been deprecated in favor of drop-in replacements in "io" and "os". Except for generated code in files marked "DO NOT EDIT", this commit replaces those uses of ioutil with the respective functions.
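
For illustration, a few of the standard-library replacements available since Go 1.16 look like this. The helper `roundTrip` is a hypothetical example and is not taken from this PR.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes content to a temporary file and reads it back,
// using only the "io" and "os" replacements for io/ioutil.
func roundTrip(content string) (string, error) {
	dir, err := os.MkdirTemp("", "ioutil-example") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "data.txt")
	if err := os.WriteFile(path, []byte(content), 0o600); err != nil { // was ioutil.WriteFile
		return "", err
	}

	b, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}

	all, err := io.ReadAll(strings.NewReader(string(b))) // was ioutil.ReadAll
	if err != nil {
		return "", err
	}

	// io.Discard replaces ioutil.Discard.
	if _, err := io.Copy(io.Discard, strings.NewReader(content)); err != nil {
		return "", err
	}
	return string(all), nil
}

func main() {
	got, err := roundTrip("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got) // hello
}
```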

Happy Hacktoberfest! Thank you for taking a moment to review my PR!

Signed-off-by: donuts-are-good <96031819+donuts-are-good@users.noreply.github.com>

* Adds Changelog entry

Completing the request for a changelog entry.

Signed-off-by: donuts-are-good <96031819+donuts-are-good@users.noreply.github.com>

* Removes Changelog Entry

This commit removes the changelog entry for the ioutil changes in this PR, as they are not user-facing.

Signed-off-by: donuts-are-good <96031819+donuts-are-good@users.noreply.github.com>

---------

Signed-off-by: donuts-are-good <96031819+donuts-are-good@users.noreply.github.com>

* vertically shard queries by le if no histogram_quantile function (#6809)

Signed-off-by: Ben Ye <benye@amazon.com>

* Expose more overridable metrics from fetcher and default grouper (#6836)

* Expose more overridable metrics from fetcher and default grouper

Signed-off-by: Alex Le <leqiyue@amazon.com>

* fix test

Signed-off-by: Alex Le <leqiyue@amazon.com>

* rename new functions

Signed-off-by: …