Conversation

Contributor

@dashpole dashpole commented Aug 12, 2025

Re-opening #5028 with new benchmarks. For cases with 10 attributes, this reduces the overhead of metric measurements by ~80-90% (depending on lock contention). It introduces a small probability of collision for attribute sets in the metrics SDK. For an instrument with 1 million different attribute sets, the probability of a collision is approximately 2 * 10^-8. For a more "normal" cardinality of 1000 on an instrument, it is approximately 2 * 10^-17.
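The collision figures quoted above are in the ballpark of the standard birthday-bound approximation for a 64-bit hash, p ≈ 1 − exp(−n(n−1)/(2·2^64)). A minimal sketch of that calculation (the function name is mine, not from the PR):

```go
package main

import (
	"fmt"
	"math"
)

// collisionProbability approximates the chance that at least two of n
// distinct attribute sets map to the same 64-bit hash value, using the
// birthday-bound approximation p ≈ 1 - exp(-n(n-1) / (2 * 2^64)).
func collisionProbability(n float64) float64 {
	return -math.Expm1(-n * (n - 1) / (2 * math.Exp2(64)))
}

func main() {
	// For one million distinct sets this yields roughly 2.7e-08,
	// in line with the ~2e-8 figure cited above.
	fmt.Printf("%.1e\n", collisionProbability(1e6)) // 2.7e-08
}
```

math.Expm1 keeps the result accurate for the tiny exponents involved, where 1 − exp(−x) would round to zero in float64.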

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/attribute
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
                                            │   main.txt    │              hash.txt               │
                                            │    sec/op     │    sec/op     vs base               │
EquivalentMapAccess/Empty-24                    32.01n ± 2%   10.12n ±  4%  -68.37% (p=0.002 n=6)
EquivalentMapAccess/1_string_attribute-24      106.25n ± 2%   10.01n ±  5%  -90.58% (p=0.002 n=6)
EquivalentMapAccess/10_string_attributes-24   826.250n ± 1%   9.982n ± 11%  -98.79% (p=0.002 n=6)
EquivalentMapAccess/1_int_attribute-24         106.65n ± 2%   10.13n ±  3%  -90.50% (p=0.002 n=6)
EquivalentMapAccess/10_int_attributes-24       833.25n ± 2%   10.04n ±  5%  -98.80% (p=0.002 n=6)
geomean                                         190.3n        10.06n        -94.72%

Parallel benchmarks:

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/metric
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
                                                         │  main24.txt   │              new24.txt              │
                                                         │    sec/op     │    sec/op     vs base               │
SyncMeasure/NoView/Int64Counter/Attributes/0-24             288.4n ± 13%   267.0n ± 16%        ~ (p=0.180 n=6)
SyncMeasure/NoView/Int64Counter/Attributes/1-24             372.7n ± 24%   303.3n ±  6%  -18.61% (p=0.002 n=6)
SyncMeasure/NoView/Int64Counter/Attributes/10-24           1862.5n ± 11%   302.2n ±  6%  -83.77% (p=0.002 n=6)
SyncMeasure/NoView/Float64Counter/Attributes/0-24           288.2n ±  5%   291.8n ± 14%        ~ (p=0.589 n=6)
SyncMeasure/NoView/Float64Counter/Attributes/1-24           374.8n ± 22%   326.2n ± 15%  -12.98% (p=0.002 n=6)
SyncMeasure/NoView/Float64Counter/Attributes/10-24         1984.0n ± 10%   277.9n ± 15%  -85.99% (p=0.002 n=6)
SyncMeasure/NoView/Int64UpDownCounter/Attributes/0-24       286.8n ± 13%   279.4n ± 14%        ~ (p=0.818 n=6)
SyncMeasure/NoView/Int64UpDownCounter/Attributes/1-24       415.4n ± 14%   309.5n ± 11%  -25.47% (p=0.002 n=6)
SyncMeasure/NoView/Int64UpDownCounter/Attributes/10-24     1923.0n ± 19%   294.1n ± 17%  -84.71% (p=0.002 n=6)
SyncMeasure/NoView/Float64UpDownCounter/Attributes/0-24     284.9n ±  5%   271.6n ± 11%        ~ (p=0.240 n=6)
SyncMeasure/NoView/Float64UpDownCounter/Attributes/1-24     382.9n ± 23%   295.7n ± 13%  -22.78% (p=0.002 n=6)
SyncMeasure/NoView/Float64UpDownCounter/Attributes/10-24   1787.0n ± 28%   289.2n ± 12%  -83.81% (p=0.002 n=6)
SyncMeasure/NoView/Int64Histogram/Attributes/0-24           283.4n ±  8%   269.9n ±  9%        ~ (p=0.589 n=6)
SyncMeasure/NoView/Int64Histogram/Attributes/1-24           300.7n ±  8%   270.1n ± 15%  -10.16% (p=0.026 n=6)
SyncMeasure/NoView/Int64Histogram/Attributes/10-24         1046.8n ± 24%   299.2n ± 16%  -71.42% (p=0.002 n=6)
SyncMeasure/NoView/Float64Histogram/Attributes/0-24         264.3n ± 12%   295.9n ±  5%  +11.93% (p=0.026 n=6)
SyncMeasure/NoView/Float64Histogram/Attributes/1-24         321.0n ±  8%   269.4n ± 11%  -16.09% (p=0.002 n=6)
SyncMeasure/NoView/Float64Histogram/Attributes/10-24       1052.2n ± 10%   274.6n ±  5%  -73.90% (p=0.002 n=6)
geomean                                                     540.0n         287.7n        -46.72%

Single-threaded benchmarks:

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/metric
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
                                                      │   main1.txt   │              new1.txt               │
                                                      │    sec/op     │    sec/op     vs base               │
SyncMeasure/NoView/Int64Counter/Attributes/0            130.95n ±  1%   97.99n ± 21%  -25.17% (p=0.002 n=6)
SyncMeasure/NoView/Int64Counter/Attributes/1             300.8n ±  7%   104.6n ±  3%  -65.21% (p=0.002 n=6)
SyncMeasure/NoView/Int64Counter/Attributes/10           1646.0n ±  2%   105.8n ±  2%  -93.58% (p=0.002 n=6)
SyncMeasure/NoView/Float64Counter/Attributes/0          132.65n ±  1%   99.28n ±  4%  -25.16% (p=0.002 n=6)
SyncMeasure/NoView/Float64Counter/Attributes/1           295.4n ±  3%   107.7n ±  3%  -63.54% (p=0.002 n=6)
SyncMeasure/NoView/Float64Counter/Attributes/10         1620.0n ±  1%   109.6n ±  4%  -93.23% (p=0.002 n=6)
SyncMeasure/NoView/Int64UpDownCounter/Attributes/0      132.85n ± 80%   99.34n ±  1%  -25.22% (p=0.002 n=6)
SyncMeasure/NoView/Int64UpDownCounter/Attributes/1       300.4n ±  1%   106.0n ±  1%  -64.71% (p=0.002 n=6)
SyncMeasure/NoView/Int64UpDownCounter/Attributes/10     1622.0n ±  1%   105.8n ±  1%  -93.48% (p=0.002 n=6)
SyncMeasure/NoView/Float64UpDownCounter/Attributes/0    134.90n ± 51%   99.16n ±  4%  -26.49% (p=0.002 n=6)
SyncMeasure/NoView/Float64UpDownCounter/Attributes/1     312.4n ± 34%   107.8n ±  2%  -65.51% (p=0.002 n=6)
SyncMeasure/NoView/Float64UpDownCounter/Attributes/10   1613.0n ± 23%   106.1n ±  1%  -93.43% (p=0.002 n=6)
SyncMeasure/NoView/Int64Histogram/Attributes/0          103.50n ± 17%   88.53n ±  1%  -14.46% (p=0.002 n=6)
SyncMeasure/NoView/Int64Histogram/Attributes/1          199.50n ± 16%   95.44n ±  2%  -52.16% (p=0.002 n=6)
SyncMeasure/NoView/Int64Histogram/Attributes/10         878.70n ±  2%   95.78n ±  2%  -89.10% (p=0.002 n=6)
SyncMeasure/NoView/Float64Histogram/Attributes/0        108.55n ± 54%   88.45n ±  1%  -18.51% (p=0.002 n=6)
SyncMeasure/NoView/Float64Histogram/Attributes/1        257.30n ± 14%   95.05n ±  2%  -63.06% (p=0.002 n=6)
SyncMeasure/NoView/Float64Histogram/Attributes/10       882.70n ± 18%   96.28n ±  1%  -89.09% (p=0.002 n=6)
geomean                                                  355.2n         100.3n        -71.77%


codecov bot commented Aug 12, 2025

Codecov Report

❌ Patch coverage is 90.90909% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.4%. Comparing base (666f95c) to head (41b0468).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| attribute/hash.go | 87.2% | 7 Missing ⚠️ |
| attribute/set.go | 89.1% | 3 Missing and 1 partial ⚠️ |
Additional details and impacted files


@@          Coverage Diff          @@
##            main   #7175   +/-   ##
=====================================
  Coverage   85.4%   85.4%           
=====================================
  Files        269     271    +2     
  Lines      24297   24391   +94     
=====================================
+ Hits       20762   20843   +81     
- Misses      3158    3170   +12     
- Partials     377     378    +1     
| Files with missing lines | Coverage Δ |
|---|---|
| attribute/internal/fnv/fnv.go | 100.0% <100.0%> (ø) |
| attribute/set.go | 83.0% <89.1%> (-1.2%) ⬇️ |
| attribute/hash.go | 87.2% <87.2%> (ø) |

... and 3 files with indirect coverage changes


Contributor

@MrAlias MrAlias left a comment


We need to consider what is an acceptable level of hash collision here. We may want to use a larger bit size for the hash or look into ways to avoid the collision if we determine the collision rate too high. We may also consider this approach acceptable.

Contributor

MrAlias commented Aug 14, 2025

For reference, Prometheus has used a similar-size hash for their implementation. There haven't been any known issues related to collisions there, and it has been out for ~10 years.

Contributor

juliusv commented Aug 15, 2025

Some input from the sidelines (and yay on the performance improvements! 🥳):

> For reference, Prometheus has used a similar-size hash for their implementation. There haven't been any known issues related to collisions there, and it has been out for ~10 years.

It looks like collision handling was added to client_golang around a decade ago: prometheus/client_golang#221

Besides the client library, we have seen some real collisions over the years in the Prometheus server that have all been addressed:

  • Over a decade ago, a server with millions of series at SoundCloud ran into hash collisions. I think we used an FNV-1a hash back then to differentiate series within the TSDB. All of this has completely changed since then, and the TSDB is now resistant to such collisions.
  • Another real hash collision happened more recently in the PromQL engine in Prometheus: hash collision in engine leads to incorrect "multiple matches for labels" error prometheus/prometheus#5724. But as far as I can tell, this was due to a bad custom way of how we hashed labels together.

We still use hashes without collision checking in some places, which are hopefully unlikely enough to ever cause a collision. See also prometheus/prometheus#12519 for more discussion on hash collision potential in Prometheus.


@bwplotka bwplotka left a comment


Amazing, thanks! Looks good modulo previous comments around runes/bytes on hash string and unmarshalling questions 👍🏽

@dashpole dashpole marked this pull request as ready for review August 19, 2025 17:56
Contributor

MrAlias commented Aug 26, 2025

@dashpole are we going to look into collision remediation techniques based on the Prometheus group's experience?

@dashpole
Contributor Author

Yes, I will look into what was done in prometheus/client_golang#221.

@dashpole dashpole marked this pull request as draft August 27, 2025 13:30
@dashpole
Contributor Author

The TL;DR is that handling collisions just means using a map[uint64][]metricWithLabelValues instead of a map[uint64]metricWithLabelValues to store aggregations (note the slice), and iterating over elements to find the exact one. Looks fairly straightforward. See https://github.com/prometheus/client_golang/blob/84a4734fe73aecc9ec36e1fbcb09c9a2d679f392/prometheus/vec.go#L319

I think the biggest difference for the attributes package is that we would need to ensure that Equals() correctly handles hash collisions (i.e. it should check more than that the sets' Distinct values are the same). Alternatively, we could switch to the "Add a new Hash() function" approach, since that keeps a correct Equals() implementation. I'm not sure if the current Distinct implementation has poor == performance or not...
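Following the client_golang pattern linked above, a collision-tolerant store keeps a slice per hash bucket and falls back to a full equality check before appending. A rough sketch; the types and names here are illustrative stand-ins, not the SDK's actual ones:

```go
package main

import "fmt"

// attrSet stands in for attribute.Set; Equals must compare the full
// attribute contents, not just the 64-bit hash.
type attrSet struct {
	hash  uint64
	attrs string // stand-in for the sorted key=value encoding
}

func (s attrSet) Equals(o attrSet) bool { return s.attrs == o.attrs }

type entry struct {
	set   attrSet
	value int64
}

// store buckets entries by hash; a slice per bucket absorbs collisions,
// mirroring map[uint64][]metricWithLabelValues in client_golang.
type store struct {
	buckets map[uint64][]entry
}

// add records a measurement, scanning the bucket for an exact match
// before appending a new entry.
func (st *store) add(set attrSet, v int64) {
	bucket := st.buckets[set.hash]
	for i := range bucket {
		if bucket[i].set.Equals(set) {
			bucket[i].value += v
			return
		}
	}
	st.buckets[set.hash] = append(bucket, entry{set: set, value: v})
}

func main() {
	st := &store{buckets: map[uint64][]entry{}}
	a := attrSet{hash: 42, attrs: "k=a"}
	b := attrSet{hash: 42, attrs: "k=b"} // deliberate hash collision
	st.add(a, 1)
	st.add(b, 2)
	st.add(a, 3)
	fmt.Println(len(st.buckets[42]), st.buckets[42][0].value) // 2 4
}
```

The linear scan only costs anything when a bucket actually holds more than one set, which at 64-bit hash sizes is vanishingly rare.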

dashpole added a commit that referenced this pull request Aug 28, 2025
Forked from #7175

```
$ go test -timeout 60s -run=xxxxxMatchNothingxxxxx -test.benchtime=10ms -count 6 -cpu 1 -bench=BenchmarkEquals ./...
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/attribute
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
BenchmarkEquals/Empty         	 1314884	         9.008 ns/op
BenchmarkEquals/Empty         	 1875915	         8.461 ns/op
BenchmarkEquals/Empty         	 1461613	         7.749 ns/op
BenchmarkEquals/Empty         	 1636912	         9.359 ns/op
BenchmarkEquals/Empty         	 1863039	         6.355 ns/op
BenchmarkEquals/Empty         	 1789053	         6.336 ns/op
BenchmarkEquals/1_string_attribute         	  674168	        16.92 ns/op
BenchmarkEquals/1_string_attribute         	  701983	        16.42 ns/op
BenchmarkEquals/1_string_attribute         	  692001	        16.52 ns/op
BenchmarkEquals/1_string_attribute         	  687970	        16.29 ns/op
BenchmarkEquals/1_string_attribute         	  751766	        16.58 ns/op
BenchmarkEquals/1_string_attribute         	  703534	        16.88 ns/op
BenchmarkEquals/10_string_attributes       	   85400	       137.1 ns/op
BenchmarkEquals/10_string_attributes       	   91045	       136.1 ns/op
BenchmarkEquals/10_string_attributes       	   90973	       150.7 ns/op
BenchmarkEquals/10_string_attributes       	   62877	       177.5 ns/op
BenchmarkEquals/10_string_attributes       	   90780	       194.2 ns/op
BenchmarkEquals/10_string_attributes       	   91058	       144.6 ns/op
BenchmarkEquals/1_int_attribute            	  624625	        18.72 ns/op
BenchmarkEquals/1_int_attribute            	  689478	        16.03 ns/op
BenchmarkEquals/1_int_attribute            	  719173	        15.68 ns/op
BenchmarkEquals/1_int_attribute            	  707005	        16.18 ns/op
BenchmarkEquals/1_int_attribute            	  752048	        15.94 ns/op
BenchmarkEquals/1_int_attribute            	  752034	        16.23 ns/op
BenchmarkEquals/10_int_attributes          	   90302	       132.5 ns/op
BenchmarkEquals/10_int_attributes          	   89929	       131.9 ns/op
BenchmarkEquals/10_int_attributes          	   86578	       135.2 ns/op
BenchmarkEquals/10_int_attributes          	   90482	       133.1 ns/op
BenchmarkEquals/10_int_attributes          	   90255	       132.0 ns/op
BenchmarkEquals/10_int_attributes          	   87615	       134.6 ns/op
PASS
ok  	go.opentelemetry.io/otel/attribute	0.578s
PASS
ok  	go.opentelemetry.io/otel/attribute/internal	0.017s
```

---------

Co-authored-by: Flc゛ <four_leaf_clover@foxmail.com>
Co-authored-by: Robert Pająk <pellared@hotmail.com>
@dashpole dashpole force-pushed the hash_distinct branch 3 times, most recently from a69ecb9 to 431c5ed on August 28, 2025 19:07
dashpole added a commit that referenced this pull request Aug 29, 2025
Forked from
#7175 (comment)

This adds a test for JSON marshaling of attribute sets.

---------

Co-authored-by: Flc゛ <four_leaf_clover@foxmail.com>
Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>
@dashpole dashpole force-pushed the hash_distinct branch 3 times, most recently from a3fb9c7 to ebbb889 on August 29, 2025 18:27
Contributor Author

dashpole commented Aug 29, 2025

There is a significant performance cost (the cost of set.Equals from #7262) to checking for duplicates, but it is still a huge improvement over the current state. I'm trying to add a unit test with a collision, but it is taking a while for the fuzzer to generate a collision for me... Even with a cardinality of 100 million attributes, I didn't hit one :). I updated the benchmark results in the description.

Contributor

bboreham commented Sep 11, 2025

> For reference, Prometheus has used a similar-size hash for their implementation. There haven't been any known issues related to collisions there, and it has been out for ~10 years.

Julius already cited this, but I too want to note there were issues:

Prometheus has generally moved from FNV to xxHash, which is much better behaved.

@dashpole
Contributor Author

I plan to investigate xxHash in a separate issue. From my brief research, it is primarily designed for efficient hashing of large amounts of input data (e.g. checksums for files). It should still be faster than 64-bit FNV-1a for our inputs, but the difference may or may not be significant. I've seen conflicting statements about which algorithm has a better distribution of hash values, although most seem to favor xxHash as well. I don't think that should block this PR, considering the substantial performance benefits it brings.

@dashpole dashpole merged commit 9d52bde into open-telemetry:main Sep 16, 2025
33 checks passed
@dashpole dashpole deleted the hash_distinct branch September 16, 2025 20:04