[prometheusremotewriteexporter] reduce allocations when serializing protobufs #35185
Conversation
Looks great! Are you able to add a benchmark to demonstrate the improvement and make sure it doesn't regress in the future? cc @jmichalek132 @ArthurSens feel free to review as well
LGTM, but could you please look at the failing compliance tests? It turns out that should be fixed now by #35119, so rebasing should hopefully make them pass.
By any chance, is there any visible improvement from this change in the collector's own metrics? If so, could you post them here?
This PR was marked stale due to lack of activity. It will be closed in 14 days.
Sorry for the staleness; I'd still like to get this PR merged (I was busy with work). I will provide a benchmark to get it moving along.
This PR was marked stale due to lack of activity. It will be closed in 14 days.
I'm taking a look at this today. I took the liberty of writing a benchmark for it:

```go
func BenchmarkExecute(b *testing.B) {
	for _, numSample := range []int{100, 1000, 10000} {
		b.Run(fmt.Sprintf("numSample=%d", numSample), func(b *testing.B) {
			benchmarkExecute(b, numSample)
		})
	}
}

func benchmarkExecute(b *testing.B, numSample int) {
	mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer mockServer.Close()

	endpointURL, err := url.Parse(mockServer.URL)
	require.NoError(b, err)

	// Create the prwExporter
	exporter := &prwExporter{
		endpointURL: endpointURL,
		client:      http.DefaultClient,
	}

	generateSamples := func(n int) []prompb.Sample {
		samples := make([]prompb.Sample, 0, n)
		for i := 0; i < n; i++ {
			samples = append(samples, prompb.Sample{
				Timestamp: int64(i),
				Value:     float64(i),
			})
		}
		return samples
	}

	reqs := make([]*prompb.WriteRequest, 0, b.N)
	const labelValue = "abcdefg'hijlmn234!@#$%^&*()_+~`\"{}[],./<>?hello0123hiOlá你好Dzieńdobry9Zd8ra765v4stvuyte"
	for n := 0; n < b.N; n++ {
		num := strings.Repeat(strconv.Itoa(n), 16)
		req := &prompb.WriteRequest{
			Timeseries: []prompb.TimeSeries{{
				Samples: generateSamples(numSample),
				Labels: []prompb.Label{
					{Name: "__name__", Value: "test_metric"},
					{Name: "test_label_name_" + num, Value: labelValue + num},
				},
			}},
		}
		reqs = append(reqs, req)
	}

	ctx := context.Background()
	b.ReportAllocs()
	b.ResetTimer()
	for _, req := range reqs {
		err := exporter.execute(ctx, req)
		require.NoError(b, err)
	}
}
```

Once you have the time, could you include it in your PR? Maybe the input could be improved as well: I'm only generating samples and labels, but we could include metadata and histograms as well for completeness.

The results are looking good :)

```
$ benchstat base=main.txt new=pr-35185.txt
goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusremotewriteexporter
cpu: Apple M2 Pro
                           │    base     │                new                 │
                           │   sec/op    │    sec/op     vs base              │
Execute/numSample=100-2      50.96µ ± 6%    50.40µ ± 8%       ~ (p=0.105 n=10)
Execute/numSample=1000-2     79.25µ ± 1%    77.20µ ± 1%  -2.59% (p=0.000 n=10)
Execute/numSample=10000-2    332.9µ ± 3%    309.8µ ± 1%  -6.95% (p=0.000 n=10)
geomean                      110.4µ         106.4µ       -3.58%

                           │     base      │                 new                  │
                           │     B/op      │     B/op       vs base               │
Execute/numSample=100-2      9.824Ki ± 0%     6.077Ki ± 0%   -38.15% (p=0.000 n=10)
Execute/numSample=1000-2     42.85Ki ± 0%     10.85Ki ± 0%   -74.68% (p=0.000 n=10)
Execute/numSample=10000-2    342.11Ki ± 0%    38.19Ki ± 0%   -88.84% (p=0.000 n=10)
geomean                      52.42Ki          13.61Ki        -74.04%

                           │    base    │               new                │
                           │ allocs/op  │ allocs/op   vs base              │
Execute/numSample=100-2      81.00 ± 0%   79.00 ± 0%  -2.47% (p=0.000 n=10)
Execute/numSample=1000-2     85.00 ± 0%   83.00 ± 0%  -2.35% (p=0.000 n=10)
Execute/numSample=10000-2    85.00 ± 0%   83.00 ± 0%  -2.35% (p=0.000 n=10)
geomean                      83.65        81.64       -2.39%
```
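Picking up on the metadata/histograms suggestion above, here is a rough sketch (not part of this PR) of how the benchmark input could also carry metadata. `generateMetadata` is a hypothetical helper mirroring `generateSamples`; the field names follow `prompb.MetricMetadata`.

```go
// Rough sketch (not part of this PR): a generator that could sit next to
// generateSamples above so the benchmarked WriteRequest also carries metadata.
package main

import (
	"fmt"
	"strconv"

	"github.com/prometheus/prometheus/prompb"
)

// generateMetadata is a hypothetical helper producing n metadata entries.
func generateMetadata(n int) []prompb.MetricMetadata {
	metadata := make([]prompb.MetricMetadata, 0, n)
	for i := 0; i < n; i++ {
		metadata = append(metadata, prompb.MetricMetadata{
			Type:             prompb.MetricMetadata_COUNTER,
			MetricFamilyName: "test_metric_" + strconv.Itoa(i),
			Help:             "synthetic metric family used only for benchmarking",
			Unit:             "seconds",
		})
	}
	return metadata
}

func main() {
	// In the benchmark, the generated request would set both fields:
	// Timeseries (as today) plus Metadata.
	req := &prompb.WriteRequest{Metadata: generateMetadata(3)}
	fmt.Println("metadata entries:", len(req.Metadata))
}
```

Native histograms could be added in a similar way through the `Histograms` field of `prompb.TimeSeries`, which would exercise a different part of the serialized payload.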
@ArthurSens thank you for the benchmark! I'll get on it this week.
Force-pushed from 38ca486 to f2b6aac, then from f2b6aac to f8f1d41.
@ArthurSens added the benchmark, PTAL!
LGTM!
We still need an approver from a codeowner though, maybe @dashpole?
Thanks!
…rotobufs (open-telemetry#35185)

Description:
There are two allocations that happen on every push:
1. Serializing the remote write protobuf to a byte array.
2. Compressing the byte array with snappy.

Since these buffers can be quite large, we can save some allocations with a `sync.Pool`.

Link to tracking Issue:

Testing:
Tests still pass. We have been running this successfully in production for a few months now.

```
                            │ /tmp/old.txt │         /tmp/new.txt              │
                            │    sec/op    │    sec/op     vs base             │
Execute/numSample=100-14      43.11µ ± 2%    43.09µ ± 2%       ~ (p=0.853 n=10)
Execute/numSample=1000-14     105.4µ ± 1%    102.2µ ± 1%  -3.04% (p=0.000 n=10)
Execute/numSample=10000-14    685.5µ ± 1%    663.6µ ± 5%  -3.19% (p=0.023 n=10)
geomean                       146.0µ         143.0µ       -2.10%

                            │ /tmp/old.txt  │          /tmp/new.txt               │
                            │     B/op      │     B/op       vs base              │
Execute/numSample=100-14      14.809Ki ± 0%    6.091Ki ± 0%  -58.87% (p=0.000 n=10)
Execute/numSample=1000-14     94.18Ki ± 0%    22.16Ki ± 0%   -76.47% (p=0.000 n=10)
Execute/numSample=10000-14    726.23Ki ± 0%   39.83Ki ± 0%   -94.52% (p=0.000 n=10)
geomean                       100.4Ki         17.52Ki        -82.56%

                            │ /tmp/old.txt │        /tmp/new.txt              │
                            │  allocs/op   │ allocs/op   vs base              │
Execute/numSample=100-14      81.00 ± 0%     79.00 ± 0%  -2.47% (p=0.000 n=10)
Execute/numSample=1000-14     85.00 ± 0%     83.00 ± 0%  -2.35% (p=0.000 n=10)
Execute/numSample=10000-14    85.00 ± 0%     83.00 ± 0%  -2.35% (p=0.000 n=10)
geomean                       83.65          81.64       -2.39%
```

Documentation:

Co-authored-by: David Ashpole <dashpole@google.com>
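For readers skimming the thread, below is a minimal sketch of the buffer-pooling idea the description refers to. It is an assumed shape rather than the exact code merged in this PR: the `buffer` struct, `bufferPool`, and `marshalCompressed` names are illustrative, and it assumes gogo protobuf's `proto.Buffer` and `github.com/golang/snappy` for serialization and compression.

```go
// Minimal sketch of the pooling approach described above; names are
// illustrative, not the exporter's actual identifiers.
package main

import (
	"fmt"
	"sync"

	"github.com/gogo/protobuf/proto"
	"github.com/golang/snappy"
	"github.com/prometheus/prometheus/prompb"
)

// buffer bundles the two per-push scratch areas so one pool Get covers both
// the protobuf serialization and the snappy compression.
type buffer struct {
	protobuf *proto.Buffer
	snappy   []byte
}

var bufferPool = sync.Pool{
	New: func() any { return &buffer{protobuf: proto.NewBuffer(nil)} },
}

// marshalCompressed serializes req and snappy-compresses it using pooled
// buffers. The returned bytes alias pooled memory, so the caller must finish
// with them (e.g. send the HTTP body) before returning buf to the pool.
func marshalCompressed(req *prompb.WriteRequest) ([]byte, *buffer, error) {
	buf := bufferPool.Get().(*buffer)
	buf.protobuf.Reset()

	if err := buf.protobuf.Marshal(req); err != nil {
		bufferPool.Put(buf)
		return nil, nil, err
	}

	// snappy.Encode reuses buf.snappy when it is long enough and allocates a
	// bigger slice otherwise; keep the larger backing array for the next push.
	compressed := snappy.Encode(buf.snappy, buf.protobuf.Bytes())
	if cap(compressed) > cap(buf.snappy) {
		buf.snappy = compressed[:cap(compressed)]
	}
	return compressed, buf, nil
}

func main() {
	body, buf, err := marshalCompressed(&prompb.WriteRequest{})
	if err != nil {
		panic(err)
	}
	fmt.Println("compressed payload bytes:", len(body)) // would become the remote-write request body
	bufferPool.Put(buf)
}
```

The point of pooling both slices in one struct is that a single Get/Put pair amortizes the two large per-push allocations the description mentions, while keeping ownership of the scratch memory explicit for the duration of the HTTP request.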