Add ExponentialHistogram to Metrics data model #1935

Merged
merged 26 commits into from
Oct 7, 2021
160 changes: 155 additions & 5 deletions specification/metrics/datamodel.md
@@ -19,6 +19,12 @@
* [Sums](#sums)
* [Gauge](#gauge)
* [Histogram](#histogram)
* [ExponentialHistogram](#exponentialhistogram)
+ [Exponential scale](#exponential-scale)
+ [Exponential buckets](#exponential-buckets)
+ [Zero count](#zero-count)
+ [Producer expectations](#producer-expectations)
+ [Consumer expectations](#consumer-expectations)
* [Summary (Legacy)](#summary-legacy)
- [Exemplars](#exemplars)
- [Single-Writer](#single-writer)
@@ -413,14 +419,158 @@ Changing the inclusivity and exclusivity of bounds is an example of
worst-case Histogram error; users should choose Histogram boundaries
so that worst-case error is within their error tolerance.

### ExponentialHistogram

**Status**: [Experimental](../document-status.md)

[ExponentialHistogram](https://github.com/open-telemetry/opentelemetry-proto/blob/cfbf9357c03bf4ac150a3ab3bcbe4cc4ed087362/opentelemetry/proto/metrics/v1/metrics.proto#L222)
data points are an alternate representation to the
[Histogram](#histogram) data point, used to convey a population of
recorded measurements in a compressed format. ExponentialHistogram
compresses bucket boundaries using an exponential formula, making it
suitable for conveying high dynamic range data with small relative
error, compared with alternative representations of similar size.

Statements about `Histogram` that refer to aggregation temporality,
attributes, and timestamps, as well as the `sum`, `count`, and
`exemplars` fields, apply equally to `ExponentialHistogram`. These
fields are interpreted identically in both types; only the bucket
structure differs.

#### Exponential scale

The resolution of the ExponentialHistogram is characterized by a
parameter known as `scale`, with larger values of `scale` offering
greater precision. Bucket boundaries of the ExponentialHistogram are
located at integer powers of the `base`, also known as the "growth
factor", where:

```
base = 2**(2**(-scale))
```

The symbol `**` in these formulas represents exponentiation, thus
`2**x` is read "Two to the power of `x`", typically computed by an
expression like `math.Pow(2.0, x)`. Calculated `base` values for
selected scales are shown below:

| Scale | Base | Expression |
| -- | -- | -- |
| 10 | 1.00068 | 2**(1/1024) |
| 9 | 1.00135 | 2**(1/512) |
| 8 | 1.00271 | 2**(1/256) |
| 7 | 1.00543 | 2**(1/128) |
| 6 | 1.01089 | 2**(1/64) |
| 5 | 1.02190 | 2**(1/32) |
| 4 | 1.04427 | 2**(1/16) |
| 3 | 1.09051 | 2**(1/8) |
| 2 | 1.18921 | 2**(1/4) |
| 1 | 1.41421 | 2**(1/2) |
| 0 | 2 | 2**1 |
| -1 | 4 | 2**2 |
| -2 | 16 | 2**4 |
| -3 | 256 | 2**8 |
| -4 | 65536 | 2**16 |
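
The table values can be reproduced with a few lines of Python. This is a minimal sketch; the function name `base_for_scale` is an illustrative assumption and is not part of the data model.

```python
def base_for_scale(scale: int) -> float:
    """Return the growth factor 2**(2**(-scale)) for the given scale."""
    # 2.0 ** -scale is the exponent applied to 2: 1/8 for scale=3, 4 for scale=-2.
    return 2.0 ** (2.0 ** -scale)


# Print a few rows in the same shape as the table above.
for scale in (3, 0, -2):
    print(scale, round(base_for_scale(scale), 5))
```

Non-positive scales yield exact integer bases (2, 4, 16, ...), while positive scales yield irrational bases that are only approximated in floating point.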

An important property of this design is described as "perfect
subsetting". Buckets of an exponential Histogram with a given scale
map exactly into buckets of exponential Histograms with lesser scales,
which allows consumers to lower the resolution of a histogram (i.e.,
downscale) without introducing error.
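
One way a consumer might downscale is sketched below. It assumes the dense `offset` plus counts-array layout used for each range of the data point, and relies on the fact that reducing the scale by `delta` maps bucket index `i` to `floor(i / 2**delta)`. The function name `downscale` is hypothetical.

```python
def downscale(offset, counts, delta):
    """Merge buckets so that the histogram scale is reduced by `delta`.

    Returns the new (offset, counts) pair. Index i at the old scale maps
    to i >> delta at the new scale (Python's >> floors negative values).
    """
    if delta < 0:
        raise ValueError("delta must be non-negative")
    if not counts:
        return offset >> delta, []
    new_offset = offset >> delta
    last = (offset + len(counts) - 1) >> delta
    merged = [0] * (last - new_offset + 1)
    for i, count in enumerate(counts):
        merged[((offset + i) >> delta) - new_offset] += count
    return new_offset, merged


# Eight scale=3 buckets covering [1, 2) collapse into four scale=2 buckets.
print(downscale(0, [1, 2, 3, 4, 5, 6, 7, 8], 1))
```

Because every old bucket falls entirely inside one new bucket, the total count is preserved and no measurement changes sides of a boundary.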

#### Exponential buckets

The ExponentialHistogram bucket identified by `index`, a signed
integer, represents values in the population that are greater than or
equal to `base**index` and less than `base**(index+1)`. Note that the
ExponentialHistogram specifies a lower-inclusive bound while the
explicit-boundary Histogram specifies an upper-inclusive bound.

The positive and negative ranges of the histogram are expressed
separately. Negative values are mapped by their absolute value
into the negative range using the same scale as the positive range.

Each range of the ExponentialHistogram data point uses a dense
representation of the buckets, where a range of buckets is expressed
as a single `offset` value, a signed integer, and an array of count
values, where array element `i` represents the bucket count for bucket
index `offset+i`.
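
The dense representation can be expanded back into explicit buckets as sketched below. The helper name `iter_buckets` and its argument names are assumptions made for illustration; the boundaries follow the lower-inclusive definition given above.

```python
def iter_buckets(scale, offset, counts):
    """Yield (index, lower, upper, count) for each bucket in one range.

    Array element i holds the count for bucket index offset + i, which
    covers values v with base**index <= v < base**(index + 1).
    """
    for i, count in enumerate(counts):
        index = offset + i
        # base**index == 2**(index * 2**(-scale))
        lower = 2.0 ** (index * 2.0 ** -scale)
        upper = 2.0 ** ((index + 1) * 2.0 ** -scale)
        yield index, lower, upper, count


for row in iter_buckets(scale=0, offset=-1, counts=[4, 0, 2]):
    print(row)
```

For the negative range, the same boundaries apply to the absolute value of each measurement.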

For a given range, positive or negative:

- Bucket index `0` counts measurements in the range `[1, base)`
- Positive indexes correspond with absolute values greater than or equal to `base`
- Negative indexes correspond with absolute values less than 1
- There are `2**scale` buckets between successive powers of 2.

For example, with `scale=3` there are `2**3` buckets between 1 and 2.
Note that the lower boundary for bucket index 4 in a `scale=3`
histogram maps into the lower boundary for bucket index 2 in a
`scale=2` histogram and maps into the lower boundary for bucket index
1 (i.e., the `base`) in a `scale=1` histogram—these are examples of
perfect subsetting.

| `scale=3` bucket index | lower boundary | equation |
| -- | -- | -- |
| 0 | 1 | 2**(0/8) |
| 1 | 1.09051 | 2**(1/8) |
| 2 | 1.18921 | 2**(2/8), 2**(1/4) |
| 3 | 1.29684 | 2**(3/8) |
| 4 | 1.41421 | 2**(4/8), 2**(2/4), 2**(1/2) |
| 5 | 1.54221 | 2**(5/8) |
| 6 | 1.68179 | 2**(6/8) |
| 7 | 1.83401 | 2**(7/8) |
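
The subsetting example can be checked numerically. The sketch below uses a hypothetical helper, `lower_boundary`, that evaluates `2**(index / 2**scale)`.

```python
def lower_boundary(scale, index):
    """Return base**index, the inclusive lower boundary of a bucket."""
    return 2.0 ** (index / 2.0 ** scale)


# Index 4 at scale=3, index 2 at scale=2, and index 1 at scale=1 share
# the exponent 1/2, so they describe the same boundary.
same = lower_boundary(3, 4) == lower_boundary(2, 2) == lower_boundary(1, 1)
print(same, round(lower_boundary(3, 4), 5))
```

The equality holds exactly here because `4/8`, `2/4`, and `1/2` are the same floating point number before exponentiation.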

#### Zero count

The ExponentialHistogram contains a special `zero_count` field
containing the count of values that are either exactly zero or within
the region considered zero by the instrumentation at the tolerated
level of precision. This bucket stores values that cannot be
expressed using the standard exponential formula as well as values
that have been rounded to zero.
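
As an illustration only, an instrumentation library might route measurements as sketched below. The `zero_threshold` parameter is a hypothetical stand-in for "the region considered zero by the instrumentation"; the text above does not define such a field.

```python
import math


def classify(value, zero_threshold=0.0):
    """Return 'zero', 'positive', or 'negative' for a finite measurement.

    zero_threshold is a hypothetical, instrumentation-chosen tolerance:
    values with absolute value at or below it are counted in zero_count.
    """
    if not math.isfinite(value):
        raise ValueError("only finite values are handled in this sketch")
    if abs(value) <= zero_threshold:
        return "zero"
    return "positive" if value > 0 else "negative"


print(classify(0.0), classify(1e-12, zero_threshold=1e-9), classify(-3.5))
```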

#### Producer expectations

The ExponentialHistogram design makes it possible to express values
that are too large or small to be represented in the 64-bit "double"
floating point format. Certain values for `scale`, while meaningful,
are not necessarily useful.

The range of data represented by an ExponentialHistogram determines
which scales can be usefully applied. Regardless of scale, producers
SHOULD ensure that the index of any encoded bucket falls within the
range of a signed 32-bit integer. This recommendation is applied to
limit the width of integers used in standard processing pipelines such
as the OpenTelemetry collector. The wire-level protocol could be
extended for 64-bit bucket indices in a future release.
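
A producer could guard this recommendation with a check like the following sketch. The function name and the `(offset, bucket_count)` argument shape are assumptions for illustration.

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def indices_fit_int32(offset, bucket_count):
    """Return True if every encoded bucket index fits in a signed 32-bit integer."""
    if bucket_count == 0:
        # No buckets are encoded; only the offset itself is transmitted.
        return INT32_MIN <= offset <= INT32_MAX
    last_index = offset + bucket_count - 1
    return INT32_MIN <= offset and last_index <= INT32_MAX


print(indices_fit_int32(-5, 160), indices_fit_int32(INT32_MAX, 2))
```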

Producers MAY use a built-in logarithm function to calculate the
bucket index of a value. The result could differ from the bucket
index that would be computed using arbitrary precision or a lookup
table; however, producers are not required to perform an exact
computation. As a result, ExponentialHistogram exemplars could map
into buckets with zero count. We expect to find such values counted
in the adjacent bucket.
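
A logarithm-based mapping could look like the sketch below. It assumes the lower-inclusive bucket definition, so the index is `floor(log2(value) * 2**scale)`. The function name `bucket_index` is hypothetical, and the result is approximate in the sense described above: values very close to a bucket boundary may land in the adjacent bucket.

```python
import math


def bucket_index(value, scale):
    """Approximate the bucket index of a positive, finite value.

    Uses the built-in base-2 logarithm, so results near a bucket
    boundary are subject to floating point rounding.
    """
    if not (value > 0 and math.isfinite(value)):
        raise ValueError("value must be positive and finite")
    return math.floor(math.log2(value) * 2.0 ** scale)


print(bucket_index(1.5, 3), bucket_index(0.75, 0), bucket_index(3.0, -1))
```

Negative measurements would be mapped by passing their absolute value and recording the result in the negative range.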

#### Consumer expectations

ExponentialHistogram buckets are expected to map into numbers that can
be represented using IEEE 754 double-precision floating point values.
Consumers SHOULD reject ExponentialHistogram data with `scale` and
bucket indices that overflow or underflow this representation.
Consumers that reject such data SHOULD warn the user through error
logging that out-of-range data was received.
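
A consumer-side check might be sketched as follows. Several choices here are assumptions rather than requirements of this document: only the lower boundary of the first and last bucket is tested, the accepted range is that of normal (not subnormal) doubles, and the function names are hypothetical.

```python
import logging
import math
import sys

logger = logging.getLogger(__name__)


def is_representable(scale, index):
    """Return True if base**index is a normal, finite double (assumed criterion)."""
    try:
        # base**index == 2**exponent, where exponent = index * 2**(-scale).
        exponent = math.ldexp(index, -scale)
    except OverflowError:
        return False
    # Normal doubles span [2**(min_exp - 1), 2**max_exp).
    return sys.float_info.min_exp - 1 <= exponent < sys.float_info.max_exp


def accept_range(scale, offset, bucket_count):
    """Return False and log an error if any bucket index is out of range."""
    if bucket_count == 0:
        return True
    last_index = offset + bucket_count - 1
    ok = is_representable(scale, offset) and is_representable(scale, last_index)
    if not ok:
        logger.error(
            "rejecting ExponentialHistogram: scale=%d indices [%d, %d] out of range",
            scale, offset, last_index,
        )
    return ok


print(accept_range(0, -10, 20), accept_range(-4, 60, 5))
```

Checking only the first and last index is sufficient under these assumptions because the boundary exponent grows monotonically with the index.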

### Summary (Legacy)

[Summary](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L268)
metric data points convey quantile summaries, for example, "What is the
99th-percentile latency of my HTTP server?" Unlike other point types in
OpenTelemetry, Summary points cannot always be merged in a meaningful
way. This point type is not recommended for new applications and
exists for compatibility with other formats.

## Exemplars
