Metrics: Add "Instantaneous" Temporality (Gauge Histogram) #274
Comments
It is worth making a detailed comparison to the Stackdriver protocol, which is both similar and different: https://cloud.google.com/monitoring/api/v3/kinds-and-types The table looks like:

At a high level, I would explain the differences between these protocols as follows. There are two coordinates in these tables, describing information the OTel group has referred to as "structure" and "temporality".

The OTel protocol has arranged this table so that all first-class semantic information is carried in the data point kind. Temporality is strictly second-class information about how data was collected or encoded, and does not change interpretation. In the OpenTelemetry model, temporality exists to allow flexibility in collection, not to describe semantics.

The Stackdriver protocol has arranged this table in a more compact form, allowing Gauges and Counters to share a Value Type but be distinguished by their conceptual temporality ("Metric Kind"). The Stackdriver model therefore includes semantic information in both dimensions of this table, whereas OTel has added more rows to the table in order to have a single semantic dimension.
I've had a chance to think this over, and I think it's showing some oddities in OTel's fragmentation of the world. Specifically, let's look at the goal of having "natural aggregation" methods for the data types.
To the extent that "Gauge is already instantaneous temporality", I think it makes more sense to fit a histogram as a point type within Gauge. If I read the OpenMetrics definition, it seems clearer that there are two different things going on here:
This leads me to a lot of questions:
Long-winded response aside, I don't think Instantaneous Temporality solves the semantics here, and it just makes dealing with Histograms a bit more odd. It's possible there's a lot I'm not seeing, so I wanted to kick off the discussion.
I've reconsidered how to represent GaugeHistogram-type metric data. Instantaneous is not a good concept for OTel metrics.
PR #236 calls for a GaugeHistogram, which is a form of instantaneous histogram. This histogram can logically be calculated in OTel's metric API model through the use of ValueObserver instruments and a label that is removed by the SDK. The result is a histogram of gauge values.

The concept of Temporality can be extended to include a third kind of temporality called Instantaneous, which will allow us to encode a series of GaugeHistogram values from Prometheus or to aggregate OTel Gauge points into histograms in an export pipeline.
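
To make the "histogram of gauge values" idea concrete, here is a minimal sketch in Python of what removing a label from a set of gauge observations could look like. It is not the SDK's implementation: the function name `gauge_histogram`, the `(labels, value)` point shape, and the label names in the usage note are all assumptions made for illustration. Buckets follow the usual explicit-bounds convention, where bucket `i` counts values up to and including `bounds[i]`, and the last bucket is the overflow.

```python
from bisect import bisect_left
from collections import defaultdict


def gauge_histogram(points, drop_label, bounds):
    """Collapse gauge observations into one histogram per remaining label set.

    points:     iterable of (labels: dict, value: float) taken at one instant
                (hypothetical shape, not an OTel SDK type)
    drop_label: the label whose values are folded into the distribution
    bounds:     sorted upper bounds of the explicit buckets
    """
    histograms = defaultdict(lambda: [0] * (len(bounds) + 1))
    for labels, value in points:
        # Identity of the output series: every label except the dropped one.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != drop_label))
        # bisect_left picks the first bound >= value; past the end is overflow.
        histograms[key][bisect_left(bounds, value)] += 1
    return dict(histograms)
```

For example, three gauge observations that differ only in a hypothetical `host` label, with bounds `[0.5, 0.8]` and values 0.2, 0.7 and 0.9, collapse into a single series whose bucket counts are `[1, 1, 1]`. Note that the result describes the distribution of values at one moment, not an accumulation over a time window.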
Both OTLP Sum and Histogram points include an aggregation_temporality field, so adding a new value of Temporality means defining what it means in both cases. The meaning for Histogram points of Instantaneous temporality is precisely what GaugeHistogram means (see #236 (comment)).

The meaning for Sum points with Instantaneous temporality is not a requirement for OTel Metrics. However, it is easy to define and reasonable to consider, because an "Instant Sum" is not different than a Sum with Delta temporality and an infinitesimally small window of time. We can simply define Instant Sum points to be deltas with a zero-width time range.
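
The "delta with a zero-width time range" definition can be sketched as below. This is an illustration under stated assumptions, not OTLP itself: `SumPoint`, `instant_sum` and `merge_deltas` are hypothetical names, and only the two timestamp field names are borrowed from OTLP data points. The point of the sketch is that an Instant Sum needs no special handling, since it merges like any other delta.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SumPoint:
    """Hypothetical stand-in for a Sum data point with Delta temporality."""
    start_time_unix_nano: int
    time_unix_nano: int
    value: float


def instant_sum(event_time_unix_nano, value):
    # An Instant Sum is a delta whose window has zero width: start == end.
    return SumPoint(event_time_unix_nano, event_time_unix_nano, value)


def merge_deltas(points):
    # Merging deltas adds the values and widens the window to cover all inputs.
    # (Assumes the caller passes non-overlapping deltas of the same series.)
    return SumPoint(
        start_time_unix_nano=min(p.start_time_unix_nano for p in points),
        time_unix_nano=max(p.time_unix_nano for p in points),
        value=sum(p.value for p in points),
    )
```

Merging `instant_sum(100, 3)` with `instant_sum(250, 4)` gives a delta covering 100 to 250 with value 7, which is the same result an ordinary delta aggregation over that window would report.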
The only problem left by the definitions above is that we now have a way to encode a raw, scalar-valued point corresponding to the API-level metric event that generates a Counter or a Gauge point. A raw Counter event translates into an Instant Sum. A raw Gauge event translates into a Gauge point (sort of), but we would have to use Histogram exemplars to encode a raw histogram value--at the very least this feels asymmetric. See #188.