reduce cost of pipeline stages by pooling maps we create for labels #11484

cstyan · 2023-12-14T01:22:25Z

Profiling of ingesters in some of our production deployments, during periods of high ingester load from querying, shows that we spend roughly 15-20% of our CPU time on GC-related work.

Looking at the different runtime memory-related calls, roughly 50% of all CPU time during Pipeline stage processing goes to creating the map for labels that is then passed to text template parsing. This change makes the pipeline processing itself ~15% faster and will hopefully also reduce (very slightly) the impact of GC under heavy query load.

goos: linux
goarch: amd64
pkg: github.com/grafana/loki/pkg/logql/log
cpu: AMD Ryzen 9 5950X 16-Core Processor
                                    │     base     │                mapPool                │
                                    │    sec/op    │    sec/op     vs base                 │
_Pipeline/pipeline_bytes-32           7.749µ ± ∞ ¹   6.464µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/pipeline_string-32          7.691µ ± ∞ ¹   6.549µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_bytes-32     7.922µ ± ∞ ¹   6.882µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_string-32    7.900µ ± ∞ ¹   6.878µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_bytes-32    7.952µ ± ∞ ¹   6.976µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_string-32   7.940µ ± ∞ ¹   7.102µ ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                               7.858µ         6.805µ        -13.41%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                    │     base      │                mapPool                 │
                                    │     B/op      │     B/op       vs base                 │
_Pipeline/pipeline_bytes-32           4.963Ki ± ∞ ¹   1.413Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/pipeline_string-32          5.025Ki ± ∞ ¹   1.476Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_bytes-32     5.030Ki ± ∞ ¹   1.478Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_string-32    5.028Ki ± ∞ ¹   1.478Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_bytes-32    5.027Ki ± ∞ ¹   1.478Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_string-32   5.027Ki ± ∞ ¹   1.478Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                               5.017Ki         1.466Ki        -70.77%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                    │    base     │               mapPool               │
                                    │  allocs/op  │  allocs/op   vs base                │
_Pipeline/pipeline_bytes-32           37.00 ± ∞ ¹   35.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Pipeline/pipeline_string-32          38.00 ± ∞ ¹   36.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_bytes-32     37.00 ± ∞ ¹   35.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_string-32    37.00 ± ∞ ¹   35.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_bytes-32    37.00 ± ∞ ¹   35.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_string-32   37.00 ± ∞ ¹   35.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                               37.16         35.16        -5.38%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
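For readers unfamiliar with the technique, the pooling idea described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `labelMapPool` and `renderLabels` are hypothetical names, and the `fmt.Sprintf` call only stands in for the text template execution. It assumes Go 1.21+ for the `clear` builtin.

```go
package main

import (
	"fmt"
	"sync"
)

// labelMapPool is a hypothetical pool of reusable label maps.
// New is only called when the pool has nothing to hand out.
var labelMapPool = sync.Pool{
	New: func() any { return make(map[string]string) },
}

// renderLabels fills a pooled map with the given labels, uses it, and
// returns it to the pool instead of leaving it for the garbage collector.
func renderLabels(names, values []string) string {
	m := labelMapPool.Get().(map[string]string)
	clear(m) // drop entries left over from a previous use
	defer labelMapPool.Put(m)

	for i, n := range names {
		m[n] = values[i]
	}
	// Stand-in for template execution that reads from the map.
	return fmt.Sprintf("%s/%s", m["app"], m["env"])
}

func main() {
	fmt.Println(renderLabels([]string{"app", "env"}, []string{"loki", "prod"}))
	// The second call has no "env" label; because the map is cleared on
	// Get, no stale value leaks through from the first call.
	fmt.Println(renderLabels([]string{"app"}, []string{"mimir"}))
}
```

The saving comes from reusing the map's already-allocated buckets across calls, which is consistent with the B/op drop in the benchmark above.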

Signed-off-by: Callum Styan <callumstyan@gmail.com>

github-actions · 2023-12-14T01:25:40Z

Trivy scan found the following vulnerabilities:

HIGH, Target: docker.io/grafana/loki:main-be71a80 (alpine 3.18.4), Type: alpine openssl: Incorrect cipher key and IV length processing in libcrypto3 v3.1.3-r0. Fixed in v3.1.4-r0
HIGH, Target: docker.io/grafana/loki:main-be71a80 (alpine 3.18.4), Type: alpine openssl: Incorrect cipher key and IV length processing in libssl3 v3.1.3-r0. Fixed in v3.1.4-r0
To see more details on these vulnerabilities, and how/where to fix them, please run
docker build -t grafana/loki:main-be71a80 -f cmd/loki/Dockerfile .
trivy i grafana/loki:main-be71a80
on your branch. If these were not introduced by your PR, please consider fixing them in a subsequent PR. Thanks!


cstyan · 2023-12-14T20:48:34Z

Benchmark for the latest commit, which additionally pools the map at the beginning of LabelsFormatter.Process. This requires introducing a LabelsBuilder.IntoMap function.

goos: linux
goarch: amd64
pkg: github.com/grafana/loki/pkg/logql/log
cpu: AMD Ryzen 9 5950X 16-Core Processor
                                    │     base     │                mapPool                │
                                    │    sec/op    │    sec/op     vs base                 │
_Pipeline/pipeline_bytes-32           7.749µ ± ∞ ¹   6.356µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/pipeline_string-32          7.691µ ± ∞ ¹   6.377µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_bytes-32     7.922µ ± ∞ ¹   6.676µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_string-32    7.900µ ± ∞ ¹   6.797µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_bytes-32    7.952µ ± ∞ ¹   6.954µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_string-32   7.940µ ± ∞ ¹   6.749µ ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                               7.858µ         6.648µ        -15.40%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                    │     base      │                mapPool                 │
                                    │     B/op      │     B/op       vs base                 │
_Pipeline/pipeline_bytes-32           4.963Ki ± ∞ ¹   1.389Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/pipeline_string-32          5.025Ki ± ∞ ¹   1.451Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_bytes-32     5.030Ki ± ∞ ¹   1.454Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_string-32    5.028Ki ± ∞ ¹   1.454Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_bytes-32    5.027Ki ± ∞ ¹   1.454Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_string-32   5.027Ki ± ∞ ¹   1.454Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                               5.017Ki         1.443Ki        -71.25%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                    │    base     │               mapPool                │
                                    │  allocs/op  │  allocs/op   vs base                 │
_Pipeline/pipeline_bytes-32           37.00 ± ∞ ¹   33.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/pipeline_string-32          38.00 ± ∞ ¹   34.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_bytes-32     37.00 ± ∞ ¹   33.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/line_extractor_string-32    37.00 ± ∞ ¹   33.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_bytes-32    37.00 ± ∞ ¹   33.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Pipeline/label_extractor_string-32   37.00 ± ∞ ¹   33.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                               37.16         33.16        -10.76%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
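The comment above does not show the LabelsBuilder.IntoMap function itself. As a rough sketch of the general shape such a method could take, the example below writes labels into a caller-supplied map rather than allocating a new one. The `label` and `builder` types are simplified stand-ins, not Loki's actual types, and the real signature may differ.

```go
package main

import "fmt"

// label and builder are simplified, hypothetical stand-ins.
type label struct{ Name, Value string }

type builder struct{ lbls []label }

// IntoMap writes the builder's labels into a map owned by the caller,
// so the caller decides where the map comes from (for example a pool).
func (b *builder) IntoMap(m map[string]string) {
	for _, l := range b.lbls {
		m[l.Name] = l.Value
	}
}

// Map is the allocating variant, shown only for comparison.
func (b *builder) Map() map[string]string {
	m := make(map[string]string, len(b.lbls))
	b.IntoMap(m)
	return m
}

func main() {
	b := &builder{lbls: []label{{"app", "loki"}, {"env", "prod"}}}

	// One map reused across iterations; clear requires Go 1.21+.
	reused := make(map[string]string)
	for i := 0; i < 3; i++ {
		clear(reused)
		b.IntoMap(reused)
	}
	fmt.Println(len(reused), reused["app"], reused["env"])
}
```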

MichelHollands · 2023-12-15T15:32:39Z

pkg/logql/log/fmt.go

@@ -380,7 +385,12 @@ func (lf *LabelsFormatter) Process(ts int64, l []byte, lbs *LabelsBuilder) ([]by
 	lf.currentLine = l
 	lf.currentTs = ts

-	var data interface{}
+	var data = stringMapPool.Get().(map[string]string)
+	clear(data)


(nit) Would it make a difference if the clear() is done in the defer?

Unless I am misunderstanding how some of this works under the hood, it seems that the text template execution unfortunately holds references to the data in the map, so if we clear on exit from this function rather than when we first grab a map from the pool, the tests fail.

I've modified things so that there's a wrapper struct for the pool that calls clear for us when we grab something from the pool, just to make things cleaner.
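A wrapper along the lines described here might look like the sketch below. It is an assumption about the general shape, not the code that was pushed: `clearingMapPool` and `newClearingMapPool` are hypothetical names. The map is cleared in Get rather than in Put, so data handed to a caller is left untouched until the map is next taken from the pool.

```go
package main

import (
	"fmt"
	"sync"
)

// clearingMapPool is a hypothetical typed wrapper around sync.Pool.
// It is used through a pointer so the embedded pool is never copied.
type clearingMapPool struct {
	pool sync.Pool
}

func newClearingMapPool() *clearingMapPool {
	return &clearingMapPool{
		pool: sync.Pool{New: func() any { return make(map[string]string) }},
	}
}

// Get returns an empty map. The type assertion and the clear call live
// here, so callers need neither.
func (p *clearingMapPool) Get() map[string]string {
	m := p.pool.Get().(map[string]string)
	clear(m) // Go 1.21+
	return m
}

// Put hands the map back without clearing it.
func (p *clearingMapPool) Put(m map[string]string) {
	p.pool.Put(m)
}

func main() {
	p := newClearingMapPool()

	m := p.Get()
	m["app"] = "loki"
	p.Put(m)

	// Whether the pool returns the same map or a fresh one, it is empty.
	again := p.Get()
	fmt.Println(len(again))
}
```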

dannykopping · 2023-12-15T20:52:43Z

Can you share a couple profiles showing this? (probably best to export the pprof files from pyroscope and attach here instead of linking, for the sake of the community)

MasslessParticle · 2023-12-15T21:05:44Z

pkg/logql/log/labels.go

@@ -437,6 +438,32 @@ func (b *LabelsBuilder) UnsortedLabels(buf labels.Labels, categories ...LabelCat
 	return buf
 }

+var stringMapPool = sync.Pool{


Would it make sense to wrap this in a specific type so we don't have to typecast when we decide to use it?

Funnily enough, I did exactly this already but forgot to push it.

MasslessParticle · 2023-12-15T21:14:59Z

This seems like a cool idea, in general. I've left a comment with a potential QOL improvement for someone using it.


cstyan · 2023-12-15T22:27:04Z

Can you share a couple profiles showing this? (probably best to export the pprof files from pyroscope and attach here instead of linking, for the sake of the community)

I actually still don't know how to export profiles from pyroscope, but here's a screenshot.


reduce cost of pipeline stages by pooling maps we create for labels

d3bf752

Signed-off-by: Callum Styan <callumstyan@gmail.com>

cstyan requested a review from a team as a code owner December 14, 2023 01:22

pull-request-size bot added the size/S label Dec 14, 2023

fix for broken tests

8df1b10

Signed-off-by: Callum Styan <callumstyan@gmail.com>

pull-request-size bot added size/M and removed size/S labels Dec 14, 2023

MichelHollands approved these changes Dec 15, 2023


MasslessParticle reviewed Dec 15, 2023


MasslessParticle approved these changes Dec 15, 2023


simplify new pool management with a wrapper

b4095f6

Signed-off-by: Callum Styan <callumstyan@gmail.com>

fix linting about lock copy

e135967

Signed-off-by: Callum Styan <callumstyan@gmail.com>

pull-request-size bot added size/L and removed size/M labels Dec 15, 2023

cstyan merged commit e93243f into main Dec 16, 2023
9 checks passed

cstyan deleted the pipeline-map-pool branch December 16, 2023 02:20

cstyan mentioned this pull request Dec 19, 2023

remove mistakenly committed files from 11484 #11527

Merged

cstyan added a commit that referenced this pull request Dec 19, 2023

remove mistakenly committed files from 11484 (#11527)

48c59ba

remove mistakenly committed files (benchmarking related) from #11484 Signed-off-by: Callum Styan <callumstyan@gmail.com>

JordanRushing mentioned this pull request Jan 4, 2024

[WIP] Convert LabelsBuilder baseMap to *sync.Map from map[string]string since we share the map now to reduce allocations #11582

Closed

8 tasks
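The follow-up referenced above proposes a *sync.Map in place of a plain map[string]string. As a generic illustration of why that type tolerates concurrent writers (this is not code from #11582, and `storeConcurrently` is a hypothetical helper), consider:

```go
package main

import (
	"fmt"
	"sync"
)

// storeConcurrently writes one distinct key per goroutine into a
// sync.Map and returns how many entries it holds afterwards. The same
// writes against a plain map without a lock would be a data race.
func storeConcurrently(workers int) int {
	var base sync.Map
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			base.Store(fmt.Sprintf("label_%d", i), "value")
		}(i)
	}
	wg.Wait()

	n := 0
	base.Range(func(_, _ any) bool {
		n++
		return true // keep iterating
	})
	return n
}

func main() {
	fmt.Println(storeConcurrently(8))
}
```

sync.Map trades some single-goroutine speed for this safety, so a mutex around a plain map is a common alternative.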

cstyan added a commit that referenced this pull request Jan 4, 2024

revert map pooling from #11484

0bd0f78

Signed-off-by: Callum Styan <callumstyan@gmail.com>

cstyan added a commit that referenced this pull request Jan 4, 2024

revert map pooling from #11484

3ea2fa4

Signed-off-by: Callum Styan <callumstyan@gmail.com>

cstyan added a commit that referenced this pull request Jan 4, 2024

revert map pooling from #11484 (#11585)

cac5b0a

reverts pooling that causes a panic for concurrent map writes --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>

cstyan mentioned this pull request Jan 5, 2024

builder baseMap concurrent write fix lock #11588

Closed

rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024

remove mistakenly committed files from 11484 (grafana#11527)

93bf032

remove mistakenly committed files (benchmarking related) from grafana#11484 Signed-off-by: Callum Styan <callumstyan@gmail.com>
