-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize SeriesGrouper & aggregators.merge #8391
Conversation
266b36a
to
dc0648c
Compare
I haven't a clue whats up with the test failure. macdeps:
... No it's not |
I took a look at the macdeps failure but I haven't figured out yet why it's failing. First I reran it and it failed again. I noticed when rerunning it that macdeps installs go version 1.13.6 when it intends to install the latest go package from homebrew. Maybe 'go mod tidy' fails only on go 1.13.6? (The other CI steps use 1.14 or 1.15.) |
dc0648c
to
de28732
Compare
Ah, that would explain it. maphash was introduced in 1.14. Given that |
I just noticed that hash/maphash is new in 1.14 too. Telegraf is currently supporting go 1.14 and 1.15 even though go.mod has the version set to 1.15. Fortunately the 1.14 CI steps are working even with the go.mod version set to 1.15. To fix this failure we'll have to fix the macdeps job to use a newer golang version. I think it's just a matter of updating .circleci/config.yml to run 'brew update' before running 'brew install go'. Could you make that change and try it out in this PR? |
It looks like that worked. Unfortunately 'brew update' took over four minutes to run. We may need to look for a faster way to get an up to date golang install, but you don't need to worry about that for this PR. Thanks for your help debugging CI! |
Yay.
Maybe it should use the official package/archive. I know I've seen issues on the golang issue tracker where they won't support people using the homebrew package, and tell these people to go reproduce issues using the official package. I'm unfamiliar with the background, but some googling suggests that homebrew does janky things with the package. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good. Maybe we could move the seed init to the NewSeriesGrouper()
call? This would get rid of the global variable and generates an individual seed per grouper if I'm not mistaken...
Could we also move the CircleCI fix out here? It's just a minor thing, but nobody will afterwards search this PR for that... :-)
I updated the golang version for mac ci in #8516. Could you remove the 'brew update' from .circleci/config.yml and rebase this PR? Thanks! |
@phemmer any news? |
The previous implementation of SeriesGrouper required breaking a metric object apart into its constituents, converting tags and keys into unoptimized maps, only to have it put them back together into another metric object. This resulted in a significant performance overhead. This overhead was further compounded when the number of fields was large. This change adds a new AddMetric method to SeriesGrouper which preserves the metric object and removes the back-and-forth conversion. Additionlly the method used for calculating the metric's hash was switched to use maphash, which is optimized for this case. ---- Benchmarks Before: BenchmarkMergeOne-16 106012 11790 ns/op BenchmarkMergeTwo-16 48529 24819 ns/op BenchmarkGroupID-16 780018 1608 ns/op After: BenchmarkMergeOne-16 907093 1173 ns/op BenchmarkMergeTwo-16 508321 2168 ns/op BenchmarkGroupID-16 11217788 99.4 ns/op
90cd3e8
to
3e1a2a7
Compare
Updated to address comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Great work, thanks! |
The previous implementation of SeriesGrouper required breaking a metric object apart into its constituents, converting tags and keys into unoptimized maps, only to have it put them back together into another metric object. This resulted in a significant performance overhead. This overhead was further compounded when the number of fields was large. This change adds a new AddMetric method to SeriesGrouper which preserves the metric object and removes the back-and-forth conversion. Additionlly the method used for calculating the metric's hash was switched to use maphash, which is optimized for this case. ---- Benchmarks Before: BenchmarkMergeOne-16 106012 11790 ns/op BenchmarkMergeTwo-16 48529 24819 ns/op BenchmarkGroupID-16 780018 1608 ns/op After: BenchmarkMergeOne-16 907093 1173 ns/op BenchmarkMergeTwo-16 508321 2168 ns/op BenchmarkGroupID-16 11217788 99.4 ns/op (cherry picked from commit 910b726)
The previous implementation of SeriesGrouper required breaking a metric object apart into its constituents, converting tags and keys into unoptimized maps, only to have it put them back together into another metric object. This resulted in a significant performance overhead. This overhead was further compounded when the number of fields was large. This change adds a new AddMetric method to SeriesGrouper which preserves the metric object and removes the back-and-forth conversion. Additionlly the method used for calculating the metric's hash was switched to use maphash, which is optimized for this case. ---- Benchmarks Before: BenchmarkMergeOne-16 106012 11790 ns/op BenchmarkMergeTwo-16 48529 24819 ns/op BenchmarkGroupID-16 780018 1608 ns/op After: BenchmarkMergeOne-16 907093 1173 ns/op BenchmarkMergeTwo-16 508321 2168 ns/op BenchmarkGroupID-16 11217788 99.4 ns/op
This introduces some optimizations for use with the aggregators.merge plugin. In my current usage of telegraf, it's heavily bottle-necking on this plugin.
The previous implementation of
SeriesGrouper
required breaking a metric object apart into its constituents, converting tags and keys into unoptimized maps, only to have it put them back together into another metric object. This resulted in a significant performance overhead. This overhead was further compounded when the number of fields was large.This change adds a new
AddMetric
method toSeriesGrouper
which preserves the metric object and removes the back-and-forth conversion.Additionlly the method used for calculating the metric's hash was switched to use
maphash
, which is optimized for this case.Benchmarks
Before:
After:
Note the
BenchmarkGroupID
isn't a direct apples-to-apples comparison, since the old method had the sort code present, and the new one does not. However earlier in my development, switching just the hash, while keeping all the other code resulted in about a 2x performance increase (831 ns/op).Also since the vast majority of metrics are likely not to result in any mergers, a further optimization might be to utilize copy-on-write. Meaning use the input metric without copying it, and if a second metric is merged into it, then copy it. However the current change is a good first step, and that can be investigated later.
Required for all PRs: