Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ExponentialHistogram to Metrics data model #1935

Merged
merged 26 commits into from
Oct 7, 2021
Merged
Changes from 1 commit
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
mapping methods
  • Loading branch information
jmacd committed Oct 4, 2021
commit 1d2579f6aaff392315881524b56dd9c95f15534f
101 changes: 91 additions & 10 deletions specification/metrics/datamodel.md
Original file line number Diff line number Diff line change
Expand Up @@ -545,19 +545,100 @@ limit the width of integers used in standard processing pipelines such
as the OpenTelemetry collector. The wire-level protocol could be
extended for 64-bit bucket indices in a future release.

Producers MAY use a built-in logarithm function to calculate the
bucket index of a value. The use of a built-in logarithm function
could lead to results that differ from the bucket index that would be
computed using arbitrary precision or a lookup table, however
producers are not required to perform an exact computation. As a
result, ExponentialHistogram exemplars could map into buckets with
zero count. We expect to find such values counted in the adjacent
bucket.
Producers use a mapping function to compute bucket indices. Producers
are presumed to support IEEE double-width floating-point numbers with
11-bit exponent and 52-bit significand. The pseudo-code below for
mapping values to exponents refers to the following constants:

```golang
const (
// SignificandWidth is the size of an IEEE 754 double-precision
// floating-point significand.
SignificandWidth = 52
// ExponentWidth is the size of an IEEE 754 double-precision
// floating-point exponent.
ExponentWidth = 11

// SignificandMask is the mask for the significand of an IEEE 754
// double-precision floating-point value: 0xFFFFFFFFFFFFF.
SignificandMask = 1<<SignificandWidth - 1

// ExponentBias is the exponent bias specified for encoding
// the IEEE 754 double-precision floating point exponent: 1023.
ExponentBias = 1<<(ExponentWidth-1) - 1

// ExponentMask are set to 1 for the bits of an IEEE 754
// floating point exponent: 0x7FF0000000000000.
ExponentMask = ((1 << ExponentWidth) - 1) << SignificandWidth
)
```

The following choices of mapping function have been validated through
reference implementations:

1. For scale zero, the index of a value equals its normalized base-2
exponent, meaning the value of _exponent_ in the base-2 fractional
representation `1._significand_ * 2**_exponent_`. Normal IEEE 754
double-width floating point values have indices in the range
[-1022,+1023] and subnormal values have indices in the range
[-1074,-1023]. This may be written as:

```golang
// GetExponent extracts the normalized base-2 fractional exponent.
// Let the value be represented as `1.significand x 2**exponent`,
// this returns `exponent`. Not defined for 0, Inf, or NaN values.
func GetExponent(value float64) int32 {
rawBits := math.Float64bits(value)
rawExponent := (int64(rawBits) & ExponentMask) >> SignificandWidth
rawSignificand := rawBits & SignificandMask
if rawExponent == 0 {
// Handle subnormal values: rawSignificand cannot be zero
// unless value is zero.
rawExponent -= int64(bits.LeadingZeros64(rawSignificand) - 12)
}
return int32(rawExponent - ExponentBias)
}
```

2. For negative scales, the index of a value equals the normalized
base-2 exponent (as by `GetExponent()` above) shifted to the right
by `-scale`. Note that because of sign extension, this shift performs
correct rounding for the negative indices. This may be written as:

```golang
return GetExponent(value) << -scale
jmacd marked this conversation as resolved.
Show resolved Hide resolved
```

3. For positive scales, use of the built-in natural logarithm
function. A multiplicative factor equal to `2**scale / ln(2)`
proves useful (where `ln()` is the natural logarithm), for example:

```golang
scaleFactor := math.Log2E * math.Exp2(1<<scale)
jmacd marked this conversation as resolved.
Show resolved Hide resolved
return int64(math.Floor(math.Log(value) * scaleFactor))
```

4. For positive scales, lookup table methods have been demonstrated
that are able to exactly compute the index in constant time from a
lookup table with `O(2**scale)` entries.

For positive scales, the logarithm method is preferred because it
requires very little code to validate and is nearly as fast and
accurate as the lookup table approach.

The use of a built-in logarithm function could lead to results that
differ from the bucket index that would be computed using arbitrary
precision or a lookup table, however producers are not required to
perform an exact computation. As a result, ExponentialHistogram
exemplars could map into buckets with zero count. We expect to find
such values counted in the adjacent buckets.

jmacd marked this conversation as resolved.
Show resolved Hide resolved
#### Consumer expectations

ExponentialHistogram buckets are expected to map into numbers that can
be represented using IEEE 754 double-width floating point values.
ExponentialHistogram bucket indices are expected to map into buckets
where both the uppwer and lower boundaries that can be represented
jmacd marked this conversation as resolved.
Show resolved Hide resolved
using IEEE 754 double-width floating point values.

Consumers SHOULD reject ExponentialHistogram data with `scale` and
bucket indices that overflow or underflow this representation.
Consumers that reject such data SHOULD warn the user through error
Expand Down