Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTEP 149: Base-2 exponential histogram mapping reference implementation #179

Closed
wants to merge 11 commits into from

Conversation

jmacd
Copy link
Contributor

@jmacd jmacd commented Sep 21, 2021

Adds two implementations of the base-2 exponential histogram mapping from the review thread of open-telemetry/opentelemetry-proto#322, both for posterity and to record a program that generates the table of constants used in the LookupTable implementation.

goos: darwin
goarch: amd64
pkg: github.com/open-telemetry/oteps/text/metrics/0149
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkHistogram/mapping_lookup_3-12         	100000000	        10.40 ns/op
BenchmarkHistogram/boundary_lookup_3-12        	80854910	        14.96 ns/op
BenchmarkHistogram/mapping_logarithm_3-12      	60637861	        20.04 ns/op
BenchmarkHistogram/boundary_logarithm_3-12     	137750319	         8.696 ns/op
BenchmarkHistogram/mapping_lookup_10-12        	100000000	        10.30 ns/op
BenchmarkHistogram/boundary_lookup_10-12       	80481817	        14.98 ns/op
BenchmarkHistogram/mapping_logarithm_10-12     	60081852	        19.97 ns/op
BenchmarkHistogram/boundary_logarithm_10-12    	138988699	         8.708 ns/op

cc: @yzhuge @oertl

@yzhuge
Copy link
Contributor

yzhuge commented Sep 22, 2021

NrSketch lookup method is slightly different. Its indices array has 2x entries as the boundary array. Because of the denser indices array, it only need to do one correction on "rough".
See code at https://github.com/newrelic-experimental/newrelic-sketch-java/blob/main/src/main/java/com/newrelic/nrsketch/indexer/SubBucketLookupIndexer.java#L106
and doc at https://github.com/newrelic-experimental/newrelic-sketch-java/blob/main/Indexer.md#how-the-lookup-indexer-works

In the "go" code context like https://github.com/open-telemetry/oteps/pull/179/files#diff-5d6d040c9f07b433ed472d22341f4f4b598113cd3ee2d08b3c21d655a8e1fb61R132, it would need to only check "mantissa >= lt.boundaries[rough+1]". It does not need to look at [rough+2].

This is a trade off of cpu vs memory. NrSketch saves cpu time, at the cost of memory.

@oertl
Copy link
Contributor

oertl commented Sep 23, 2021

@yzhuge

This is a trade off of cpu vs memory. NrSketch saves cpu time, at the cost of memory.

I am not sure if the NrSketch approach is really signifcantly faster. Since the accessed elements are likely in the same cache line the extra array lookup will be very cheap. Furthermore, I expect that modern compilers parallelize the comparisons using SIMD instructions and remove the branching. It might even be that the NrSketch is slower as the lookup array is larger, which increases the chance of cache misses.

@jmacd
Copy link
Contributor Author

jmacd commented Sep 23, 2021

Both of your methods are blazingly fast--even the method based on logarithm() is relatively speedy.
I clock @oertl's method at 10ns and the logarithm method at 20ns. We have diminishing returns here, it's a good thing!

@reyang
Copy link
Member

reyang commented Sep 23, 2021

I'm not good at Golang so I didn't look into the code, the description in README.md looks good to me (with some minor comments).

jmacd and others added 3 commits September 23, 2021 11:56
Co-authored-by: Reiley Yang <reyang@microsoft.com>
@yzhuge
Copy link
Contributor

yzhuge commented Sep 23, 2021

To monitor events 1ms or longer (like web server request), even the log method is fine. 1ms is 1M ns. So an extra 10ns from the log method is so tiny. Adding an actual histogram around the mapping method, it will cost around 20ns per insert (lookup method), and 30ns per insert (add "synchronized" on Java functions for concurrent access). With log method, these would be around 30 and 40ns per insert, still small for 1ms events. Benchmark data from NrSketch.

I think the majority of OTel applications are in the 1ms or longer event range (application monitoring). At 1 microsecond per event, then 30ns monitoring cost (3% of 1microsecond) will begin to matter. These will be less common use cases. One thing I can think of is tracking per message send/recv time in a network router. At those level, we are closer to hardware and custom (possibly hardware assisted) monitoring will likely be needed.

@jmacd
Copy link
Contributor Author

jmacd commented Sep 27, 2021

I've posted a link to the output of generate 16 i.e. a large precomputed lookup table that could be useful for validation.
https://gist.github.com/jmacd/64031dc003410db6eb3ae04a1db286fd

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very enlightening, thanks for this reference!!!

ExponentWidth = 11

// MantissaMask is the mask for the mantissa of an IEEE 754
// double-precision floating-point value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Peanut gallery comment: You know it's good when you start bit-shifting and referencing IEEE spec.

if f == 0 {
return 0
}
valueBits := math.Float64bits(f)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "grab floating point bits directly" a requirement for performance? I'm stale on JavaScript/other languages. Should we also document how to do this aspect per-language?

(also, I'm amused this method looks so similar in Java).

Copy link
Member

@reyang reyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. During the 09/28/2021 Metrics SIG Meeting we've discussed if the Golang code should be part of the OTEP or we need to move it to a different place, not blocking this PR though.

jmacd added a commit to jmacd/otlp-expo-histo that referenced this pull request Sep 29, 2021
@jmacd
Copy link
Contributor Author

jmacd commented Sep 29, 2021

I will re-open this when it is ready following the feedback. I plan to move the code to the otel-go-contrib repo and incorporate copies of it in some new text near the end of OTEP 149.

@jmacd jmacd closed this Sep 29, 2021
@jmacd
Copy link
Contributor Author

jmacd commented Oct 5, 2021

FYI I moved the contents of this OTEP into open-telemetry/opentelemetry-specification#1935

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants