Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metrics: Make duplicate instrument registration invalid #1046

Closed
jmacd opened this issue Oct 1, 2020 · 20 comments · Fixed by #1557
Closed

Metrics: Make duplicate instrument registration invalid #1046

jmacd opened this issue Oct 1, 2020 · 20 comments · Fixed by #1557
Assignees
Labels
area:sdk Related to the SDK priority:p1 Highest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:metrics Related to the specification/metrics directory

Comments

@jmacd
Copy link
Contributor

jmacd commented Oct 1, 2020

What are you trying to achieve?

The OpenCensus Metric API made statements about when a duplicately-named instrument could be registered. This needs to be reconsidered in today's OTel API and SDK specification. Since we have added the concept of Named Meters, I believe we can remove any treatment of duplicate registration and make duplicate registration invalid behavior.

See this section in the current SDK specification: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/metrics/sdk.md#sdk-instrument-registration

This states that under no conditions can the same instrument name be registered in the same Named Meter. The assumption is that if you are a single instrumentation library, then you are able to arrange to construct each metric instrument that you will use once. If you are expecting to create a metric with the same name, you should use a different Named Meter.

It never made sense to support duplicate registration for asynchronous instruments, since it would imply some conflict in the number of callbacks. For synchronous instruments it we can devise rules to allow duplicate registration, but it's not clear that this is useful to the user.

Since this is already written into the current specification, we can close this issue if people agree.

@jkwatson
Copy link
Contributor

jkwatson commented Oct 1, 2020

So, in the Java implementation at the moment, if you ask for the same instrument a second time, we just return the same instance, rather than "rejecting" it. Would that be an acceptable implementation decision?

@kenfinnigan
Copy link
Member

+1 @jkwatson, this is the behavior of Micrometer as well.

It's a convenient approach to removing the need to retain instances at various places, while also preventing duplicate registration.

@jmacd
Copy link
Contributor Author

jmacd commented Oct 1, 2020

Question: @jkwatson under your scheme, are two descriptors with everything matching except with different descriptions the same or not?

Question: @kenfinnigan I'm trying to understand how this can happen. Aren't the "various places" different instrumentation libraries? I can imagine surprising results if two independent pieces of code get the same metric instrument and start using it, unaware of the other.

@kenfinnigan
Copy link
Member

@jmacd By "various places" I meant both framework code, or application code across different classes, etc.

It's very convenient to not hold onto an instance of a Metric and call the provider every time you need to access it.

As long as the name and tags/labels match, the existing metric is returned. If the name matches an existing metric, but the tags/labels don't, that would be a registration error.

@jkwatson
Copy link
Contributor

jkwatson commented Oct 1, 2020

Question: @jkwatson under your scheme, are two descriptors with everything matching except with different descriptions the same or not?

The registration is done by the lower-cased name and nothing else. If you ask for another instrument with the same name, but a different identity (name, description, type), then you do get an exception from the SDK. (type here is counter, valuerecorder, etc plus the type of values [double/long]).

@jmacd
Copy link
Contributor Author

jmacd commented Oct 2, 2020

but a different identity (name, description, type)

Yes, this is how the OTel-Go code works, for example, but I realize we have to include the units in this "identity". Should we also specify the behavior when multiple descriptions are provided?

I assume you mean to specify this behavior only for synchronous instruments (Counter, UpDownCounter, ValueRecorder). Would you agree that any duplicate Observer instrument yields a registration error?

It's these two cases that bother me about this behavior--that we have to address the question of descriptions not matching and that we have to explain why the same is not true for asynchronous instruments.

(@jkwatson I've been meaning to ask about the case-insensitive aspect you mentioned, is that somewhere in the API specification? I would expect to see it there. I'm not saying I disagree, but it's news to me.)

@jmacd
Copy link
Contributor Author

jmacd commented Oct 2, 2020

It's very convenient to not hold onto an instance of a Metric and call the provider every time you need to access it.

I wonder if we need to discuss the concurrency expectations around this kind of access. It's one thing to have a registry that has to be consulted once per instrument constructor when code is first initialized--for this I would not worry about a simple mutex. If this registry is going to be read every time an instrument is used, I would not expect a simple mutex to be good enough. We have not stated any requirements of this nature.

Related note: I've prepared a PR adding what I call "transient descriptor" support to the OTel-Go SDK. This is a sync.Map used to implement a concurrent registry for unique descriptors, but there the goal is to allow different descriptors to pass through, meaning that two nearly-identical instruments but with different description field will be treated as separate instruments: jmacd/opentelemetry-go#59

This is how a Prometheus server behaves: instruments with the same name and type but different HELP descriptions are treated as separate instruments.

@andrewhsu andrewhsu added the release:required-for-ga Must be resolved before GA release, or nice to have before GA label Oct 2, 2020
@andrewhsu
Copy link
Member

andrewhsu commented Oct 2, 2020

from the issue triage mtg today, marking initially as required for ga, @jmacd if you think this needs adjustment, feel free to update

@andrewhsu andrewhsu added area:api Cross language API specification issue priority:p1 Highest priority level area:sdk Related to the SDK and removed area:api Cross language API specification issue labels Oct 2, 2020
@andrewhsu
Copy link
Member

initially assigning to @jmacd but feel free to find more suitable if needed

@jkwatson
Copy link
Contributor

jkwatson commented Oct 2, 2020

I assume you mean to specify this behavior only for synchronous instruments (Counter, UpDownCounter, ValueRecorder). Would you agree that any duplicate Observer instrument yields a registration error?

No, it behaves the same, either way. Why would it be different for Observers?

(@jkwatson I've been meaning to ask about the case-insensitive aspect you mentioned, is that somewhere in the API specification? I would expect to see it there. I'm not saying I disagree, but it's news to me.)

Item #2 in the list here: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/metrics/api.md#instrument-naming-requirements

@jmacd
Copy link
Contributor Author

jmacd commented Oct 2, 2020

Would you agree that any duplicate Observer instrument yields a registration error?

No, it behaves the same, either way. Why would it be different for Observers?

Because there's a callback -- when you register an instrument the second time, are we going to require that the callback is identical? (this is perhaps tricky to define)

If not, I worry that you'll allow mulitple callbacks to be used which will lead to multiple measurements per observer per interval, but the API states that each observer instrument may produce at most one value per label set. I can't see how to support multiple callbacks, and practically speaking I don't think we should try to talk about callback equality. For this reason, I'd like to state that you can only register observers once.

@kenfinnigan
Copy link
Member

Couldn't it be the combination of "name" and "labels" that defines a registration, even for Observers?

@jmacd
Copy link
Contributor Author

jmacd commented Oct 2, 2020

When constructing an observer, there's a side effect: you're supposed to register a callback. There's no "use" in holding the instrument, so the fact that the same instance is returned doesn't matter. Do we overwrite the callback or do we return an error? I think those are the only options, and I prefer the error case.

@jkwatson
Copy link
Contributor

jkwatson commented Oct 2, 2020

When constructing an observer, there's a side effect: you're supposed to register a callback. There's no "use" in holding the instrument, so the fact that the same instance is returned doesn't matter. Do we overwrite the callback or do we return an error? I think those are the only options, and I prefer the error case.

In Java, currently, when you have a handle to an AsynchronousInstrument, you can set the callback on it. If there's one there already, we replace it.

@anuraaga
Copy link
Contributor

anuraaga commented Oct 3, 2020

If I'm not mistaken, there is a big difference between micrometer and here in that in micrometer meters have tags defined, and here instruments don't have labels and those are provided when recording. Correct me if I'm wrong in which case can ignore the rest of this :)

I generally have to query meter at recording side when using tags with micrometer, but if our API doesn't work like that I guess it's reasonable to expect most users to store the instrument when creating? In which case I probably wouldn't worry that much about concurrent access performance.

But I probably wouldn't flat out forbid it - there are probably situations where not storing is convenient for whatever reason. Also, I have used an internal framework before that did forbid duplicate meters with a flag to allow turning off the exception. Pretty much everyone turned off the exception - when your code base grows, you will find two framework libraries to register the same meter in static initialization and maybe not even use them. Having that stop server startup is not fun - I'd generally like the framework to best-effort provide me some metrics in this case rather than chasing down those framework authors in whatever other team they're on. I'd probably try chasing them down since there is likely some problem with the metrics but don't want it to be a blocking issue.

@kenfinnigan
Copy link
Member

@anuraaga there used to be the ability in OTeL to set "constant labels", which would be equivalent to Micrometer behavior. That behavior looks like it was removed recently, and I'm not sure the reasoning.

@jkwatson
Copy link
Contributor

jkwatson commented Oct 5, 2020

@anuraaga there used to be the ability in OTeL to set "constant labels", which would be equivalent to Micrometer behavior. That behavior looks like it was removed recently, and I'm not sure the reasoning.

IIRC, the reasoning was to keep the API very limited initially and add these concepts back in as needed. There had been some confusion, I think, about how "constant labels", bound-instrument usage, and non-bound-instrument usage interacted. If you want the effect of "constant labels", you can use the bound-instrument calling convention to get it, so I think it was seen as a bit redundant for the API initially.

@kenfinnigan
Copy link
Member

Thanks for the information @jkwatson. I don't think it will impact the way I'm approaching some things, but I will be preferring the micrometer approach of not failing registration but returning the same meter.

@srikanthccv
Copy link
Member

Hello @jmacd @jkwatson ,

Specification says

The OpenTelemetry SDK is responsible for ensuring that an individual Meter implementation cannot report multiple instruments with the same name and different definitions. To accomplish this, SDKs MUST reject duplicate registration of an instrument when another instrument has already been registered with same metric name to the same named Meter.

What does the term definition here mean?. And it also says

The requirement applies even for attempts to register an identical instrument definition.

Does this mean regardless of definition sdk must reject registration if another instrument has been registered with same name?

Should SDK reject the duplicate registrations or return the existing instrument? There is an issue in OTel-Python says meter implementation must return error. As this is discussion is still in progress there is some uncertainty whether sdk should throw error or return existing instrument. Did any further discussion happen about this in SIG mettings? Will there be chage in spec about dealing with duplicate registrations?

@jkwatson
Copy link
Contributor

jkwatson commented Dec 2, 2020

We haven't discussed this any further at this point. It seems like a good topic for the next SIG meeting, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:sdk Related to the SDK priority:p1 Highest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:metrics Related to the specification/metrics directory
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants