-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTEP 149: Base-2 exponential histogram mapping reference implementation #179
Conversation
NrSketch lookup method is slightly different. Its indices array has 2x entries as the boundary array. Because of the denser indices array, it only need to do one correction on "rough". In the "go" code context like https://github.com/open-telemetry/oteps/pull/179/files#diff-5d6d040c9f07b433ed472d22341f4f4b598113cd3ee2d08b3c21d655a8e1fb61R132, it would need to only check "mantissa >= lt.boundaries[rough+1]". It does not need to look at [rough+2]. This is a trade off of cpu vs memory. NrSketch saves cpu time, at the cost of memory. |
I am not sure if the NrSketch approach is really signifcantly faster. Since the accessed elements are likely in the same cache line the extra array lookup will be very cheap. Furthermore, I expect that modern compilers parallelize the comparisons using SIMD instructions and remove the branching. It might even be that the NrSketch is slower as the lookup array is larger, which increases the chance of cache misses. |
Both of your methods are blazingly fast--even the method based on logarithm() is relatively speedy. |
I'm not good at Golang so I didn't look into the code, the description in |
Co-authored-by: Reiley Yang <reyang@microsoft.com>
To monitor events 1ms or longer (like web server request), even the log method is fine. 1ms is 1M ns. So an extra 10ns from the log method is so tiny. Adding an actual histogram around the mapping method, it will cost around 20ns per insert (lookup method), and 30ns per insert (add "synchronized" on Java functions for concurrent access). With log method, these would be around 30 and 40ns per insert, still small for 1ms events. Benchmark data from NrSketch. I think the majority of OTel applications are in the 1ms or longer event range (application monitoring). At 1 microsecond per event, then 30ns monitoring cost (3% of 1microsecond) will begin to matter. These will be less common use cases. One thing I can think of is tracking per message send/recv time in a network router. At those level, we are closer to hardware and custom (possibly hardware assisted) monitoring will likely be needed. |
I've posted a link to the output of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very enlightening, thanks for this reference!!!
ExponentWidth = 11 | ||
|
||
// MantissaMask is the mask for the mantissa of an IEEE 754 | ||
// double-precision floating-point value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Peanut gallery comment: You know it's good when you start bit-shifting and referencing IEEE spec.
if f == 0 { | ||
return 0 | ||
} | ||
valueBits := math.Float64bits(f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is "grab floating point bits directly" a requirement for performance? I'm stale on JavaScript/other languages. Should we also document how to do this aspect per-language?
(also, I'm amused this method looks so similar in Java).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. During the 09/28/2021 Metrics SIG Meeting we've discussed if the Golang code should be part of the OTEP or we need to move it to a different place, not blocking this PR though.
I will re-open this when it is ready following the feedback. I plan to move the code to the otel-go-contrib repo and incorporate copies of it in some new text near the end of OTEP 149. |
FYI I moved the contents of this OTEP into open-telemetry/opentelemetry-specification#1935 |
Adds two implementations of the base-2 exponential histogram mapping from the review thread of open-telemetry/opentelemetry-proto#322, both for posterity and to record a program that generates the table of constants used in the LookupTable implementation.
cc: @yzhuge @oertl