
sdk/trace: use slices.Grow() to avoid excessive runtime.growslice() #4818

Merged: 18 commits into open-telemetry:main on Jan 12, 2024

Conversation

@lizthegrey (Member) commented on Jan 9, 2024

Profiling of our otel load generator shows excessive time in growslice. This indicates that we need to expand cap(s.attributes) all at once, rather than one element at a time in the loop.

[Profiling flame graph showing excessive time in runtime.growslice]
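As a rough illustration of the pattern this change introduces (a minimal, self-contained sketch, not the SDK's actual span.go code; `keyValue` stands in for `attribute.KeyValue`): grow the destination slice to its final capacity once, before the loop, instead of letting every `append` potentially trigger `runtime.growslice`.

```go
package main

import (
	"fmt"
	"slices" // standard library, Go 1.21+
)

// keyValue stands in for attribute.KeyValue so the sketch is self-contained.
type keyValue struct {
	key, value string
}

// addAttrs appends src onto dst. Growing capacity once up front avoids a
// growslice reallocation on each append inside the loop.
func addAttrs(dst, src []keyValue) []keyValue {
	dst = slices.Grow(dst, len(src))
	for _, kv := range src {
		dst = append(dst, kv)
	}
	return dst
}

func main() {
	attrs := addAttrs(nil, []keyValue{{"service", "loadgen"}, {"region", "us-east-1"}})
	fmt.Println(len(attrs), cap(attrs))
}
```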

$ ~/hny/go/bin/benchstat /tmp/old /tmp/new
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/trace
cpu: AMD Ryzen 5 5600X 6-Core Processor             
                                              │  /tmp/old   │              /tmp/new               │
                                              │   sec/op    │   sec/op     vs base                │
RecordingSpanSetAttributes/WithLimit/false-12   3.850µ ± 0%   2.721µ ± 2%  -29.32% (p=0.000 n=20)
RecordingSpanSetAttributes/WithLimit/true-12    8.040µ ± 1%   7.362µ ± 0%   -8.44% (p=0.000 n=20)
geomean                                         5.563µ        4.476µ       -19.55%

                                              │   /tmp/old    │               /tmp/new               │
                                              │     B/op      │     B/op      vs base                │
RecordingSpanSetAttributes/WithLimit/false-12   16.406Ki ± 0%   6.844Ki ± 0%  -58.29% (p=0.000 n=20)
RecordingSpanSetAttributes/WithLimit/true-12     15.81Ki ± 0%   11.00Ki ± 0%  -30.43% (p=0.000 n=20)
geomean                                          16.11Ki        8.677Ki       -46.13%

                                              │  /tmp/old   │              /tmp/new              │
                                              │  allocs/op  │ allocs/op   vs base                │
RecordingSpanSetAttributes/WithLimit/false-12   13.000 ± 0%   6.000 ± 0%  -53.85% (p=0.000 n=20)
RecordingSpanSetAttributes/WithLimit/true-12     18.00 ± 0%   12.00 ± 0%  -33.33% (p=0.000 n=20)
geomean                                          15.30        8.485       -44.53%

codecov bot commented on Jan 9, 2024

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (19622d3) 82.3% compared to head (867aeb8) 82.3%.


@@           Coverage Diff           @@
##            main   #4818     +/-   ##
=======================================
- Coverage   82.3%   82.3%   -0.1%     
=======================================
  Files        226     226             
  Lines      18410   18421     +11     
=======================================
+ Hits       15157   15164      +7     
- Misses      2969    2971      +2     
- Partials     284     286      +2     
| Files | Coverage | Δ |
| --- | --- | --- |
| sdk/trace/span.go | 87.2% <63.6%> | -0.7% ⬇️ |

@pellared (Member) left a comment

Do we have some benchmarks that would show the reduced number of allocations? Can you add a benchstat result in the description?

(Review thread on sdk/trace/span.go: outdated, resolved)
@lizthegrey (Member, Author) commented on Jan 9, 2024

Interesting, adding the bench shows no change in allocs, which is very suspicious. Looking at my bench, it turns out I am calling SetAttributes() one attribute at a time, which isn't otel's fault; it's mine. I'll fix my loadgen binary.

@lizthegrey (Member, Author) commented on Jan 9, 2024

> Interesting, adding the bench shows no change in allocs, which is very suspicious. Looking at my bench, it turns out I am calling SetAttributes() one attribute at a time, which isn't otel's fault; it's mine. I'll fix my loadgen binary.

That's because I introduced a bug while inlining. It's fixed now. Description updated.
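For reference, a minimal sketch of the caller-side batching being described (hypothetical code; the package, function, and field source are invented for the example and are not the actual loadgen code): build the attribute slice once and hand it to `span.SetAttributes` in a single call, rather than calling it once per field.

```go
// Hypothetical caller-side batching, not the actual loadgen code.
package loadgen

import (
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/trace"
)

// setFields turns a map of fields into one SetAttributes call instead of
// calling SetAttributes once per field.
func setFields(span trace.Span, fields map[string]string) {
	attrs := make([]attribute.KeyValue, 0, len(fields))
	for k, v := range fields {
		attrs = append(attrs, attribute.String(k, v))
	}
	span.SetAttributes(attrs...)
}
```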

@lizthegrey requested a review from pellared on January 9, 2024, 22:59
lizthegrey added a commit to honeycombio/loadgen that referenced this pull request Jan 9, 2024
## Which problem is this PR solving?

- `fielder.AddFields()` is very inefficient, calling
`span.SetAttributes()` once per field set. We can optimise/batch this.


![image](https://github.com/honeycombio/loadgen/assets/614704/3b67f08f-de40-40c7-ae1d-f13c6c8b2ca3)

## Short description of the changes

- Batches calls to `span.SetAttributes()` to ensure efficient memory
allocations, decreasing garbage & growslice. Alternative to try first
before pushing open-telemetry/opentelemetry-go#4818 forward.
lizthegrey added a commit to honeycombio/loadgen that referenced this pull request Jan 9, 2024
## Which problem is this PR solving?

- `span.SetAttributes()` even with batching of the attributes in one
call from #34, will add them all to the inner `s.attributes` object one
at a time, causing excess `growslice` calls.

## Short description of the changes

- Cherry-picks in open-telemetry/opentelemetry-go#4818 for improved
performance.
@dmathieu (Member) left a comment


Out of curiosity, do you have access to profiling data after the change?

(Review threads on sdk/trace/span_test.go and sdk/trace/span.go: resolved; the span.go threads are outdated)
@lizthegrey (Member, Author) left a comment

Thanks for the feedback, I'll get that sorted out.

And yes, I have profiles before/after.

(Review threads on sdk/trace/span.go and sdk/trace/span_test.go: resolved)
Co-authored-by: Robert Pająk <pellared@hotmail.com>
Co-authored-by: Damien Mathieu <42@dmathieu.com>
@lizthegrey (Member, Author) commented on Jan 10, 2024

Profiles after: it's much better. The problem has now shifted to dedupeAttributes, a separate map-allocation issue; the code comment there says not to pre-allocate the map, yet it is causing problems in our bench.

[Two profiling screenshots after the change, captured 2024-01-10]
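To illustrate the trade-off being referenced (a generic sketch of map pre-sizing versus incremental growth, not the SDK's dedupeAttributes implementation; the types and function are invented for the example): passing a size hint to `make` lets the map allocate its buckets once, whereas omitting it can cause incremental map growth and extra allocations during insertion.

```go
// Generic illustration of map pre-allocation, not the SDK's dedupeAttributes code.
package main

import "fmt"

type keyValue struct {
	key   string
	value int
}

// dedupe keeps the last value seen for each key. The size hint to make avoids
// incremental bucket growth while inserting into the map.
func dedupe(attrs []keyValue) []keyValue {
	seen := make(map[string]int, len(attrs)) // size hint avoids regrowth
	out := make([]keyValue, 0, len(attrs))
	for _, kv := range attrs {
		if i, ok := seen[kv.key]; ok {
			out[i] = kv // overwrite the earlier duplicate
			continue
		}
		seen[kv.key] = len(out)
		out = append(out, kv)
	}
	return out
}

func main() {
	fmt.Println(dedupe([]keyValue{{"a", 1}, {"b", 2}, {"a", 3}}))
}
```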

@lizthegrey (Member, Author) commented
Benchmark updated and bug in the pre-sizing fixed!!

lizthegrey added a commit to honeycombio/opentelemetry-go that referenced this pull request Jan 11, 2024
lizthegrey added a commit to honeycombio/loadgen that referenced this pull request Jan 11, 2024
## Which problem is this PR solving?

- Review found a bug in `addOverCapAttrs()`

## Short description of the changes

- Import fixes from upstream open-telemetry/opentelemetry-go#4818
@lizthegrey (Member, Author) commented
@MadVikingGod any chance of squeezing this into v1.22? It's ready to go otherwise (2 approvals)

@MrAlias merged commit 4491b39 into open-telemetry:main on Jan 12, 2024
24 of 25 checks passed
@lizthegrey deleted the lizf.trace-slices-grow branch on January 12, 2024, 22:02
@XSAM added this to the Old Untracked PRs milestone on Nov 7, 2024