-
Notifications
You must be signed in to change notification settings - Fork 888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add semantic convention for timing durations #1142
Add semantic convention for timing durations #1142
Conversation
|
||
### Value | ||
|
||
The duration of each operation **in seconds** should be recorded onto this value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why seconds?
Why not milliseconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm following the precedent set in our system metrics like system.cpu.time.
I see this as a placeholder until a PR lands for standardization of units, which I expect will define a units
label with one of the UCUM standard unit and prefix abbreviations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 @justinfoote
I was hoping that with standardization of units we would not have to specify which timing unit will be used. The semantics are the unchanged by the units, for data carried over OpenTelemetry or any protocol that can preserve units.
Exporters may want to convert time units, of course. Statsd exporters would use milliseconds, conventionally. If the stopwatch measures nanoseconds, it should be OK to record that precision too. @kenfinnigan does this address your concern? If so, could we add a TODO about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding standardization of units correctly, there would be a "units" label that has, for instance, "seconds", and the meter would just have a value of "5"?
For me, that aligns with how Micrometer does units, and I'd be good with that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think a new label is needed because the unit is already a part of the instrument definition. This could effect the value type (float or int) of the instruments though, should we recommend using an int of the original precision?
If the stopwatch measures nanoseconds, it should be OK to record that precision too.
Like in this case, prefer int nanoseconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point. It's part of the instrument, and it's a separate field in the proto.
I would support the idea of allowing int nanoseconds. But I think this is part of the larger can of worms that is #705.
operation being timed, like this: | ||
|
||
``` | ||
{category}.{subcategory}.duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the span name as prefix here?
{category}.{subcategory}.duration | |
{spanname}.duration |
It is supposed to have low cardinality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would have low cardinality, and it would support correlation between spans and metrics, which is certainly a goal of mine. But I see two reasons not to use the span name:
First, there may not be a span for the operation being timed here (though really, in this case, I think we should recommend both a span and a metric for an operation whose duration is important).
Second, I don't think the most meaningful aggregation is per-span-name. That is, I'd like to see http.server.duration
to aggregate response time across all of my service's endpoints, and I'd like to be able to facet that graph by the HTTP method and path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you feel about a convention to include span.name
as a label when there is a matching span?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, there may not be a span for the operation being timed here
That is not really relevant. You can still use the name that a span tracing that operation would have.
Second, I don't think the most meaningful aggregation is per-span-name.
This is something that was discussed a bit in #557, but I guess is still an open point. However, if you come up with some categorization scheme here, it would certainly also be desirable as span attribute (span.category
?) and warrant broader discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I think we agree about the first point, but in practice, an instrumentation writer may find a scenario in which there is no span, and the spec doesn't include any semantics for such a span, but they still want to create a timing metric for the operation. I think the correct approach would be to encourage that instrumentation author to provide semantics for the sort of operation they are timing.
I do indeed think it would be useful to have a span.category
attribute. We have an implied category already, with separate semantic conventions for database
, rpc
, faas
, etc... I think it would be worthwhile to add that explicitly to spans.
...but, I don't want to tackle that discussion in this PR. This is about generic timing metrics. It's intended to be guidance to spec writers about how to write category-specific specs, and guidance to instrumentation authors about how to construct a timing metric for something that isn't explicitly called out in one of our category-specific specs.
How about if I create an issue to have the span.category
discussion separately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #1161
|
||
### Value | ||
|
||
The duration of each operation **in seconds** should be recorded onto this value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think a new label is needed because the unit is already a part of the instrument definition. This could effect the value type (float or int) of the instruments though, should we recommend using an int of the original precision?
If the stopwatch measures nanoseconds, it should be OK to record that precision too.
Like in this case, prefer int nanoseconds?
Co-authored-by: Aaron Abbott <aaronabbott@google.com>
operation being timed, like this: | ||
|
||
``` | ||
{category}.{subcategory}.duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a situation where the subcategory
would be omitted? Should it be optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. I'd like to refer any remaining questions over units to #1177.
There are two remaining questions associated with the use of "seconds" here that are partly addressed in #1177:
|
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
@justinfoote and @jmacd, would one of you be able to reopen this PR? I don't think it should have been closed as inactive. 🤖 |
Fixes #1035
Changes
This PR adds semantic conventions for metrics that represent how long an operation takes to run. It includes guidance about how to incorporate tracing semantic conventions, but only in the case that the operation is already described there.
In addition to guiding the creation of metric instruments, I'm hoping that this is sort of the base class for all of the other duration metric semantic conventions, and it can serve as a guide to future spec writers about our general conventions around this sort of metric.
Question: Does this warrant a changelog update?
Related oteps #129