Implicit behavior for Counter metric #15
Comments
@connorjacobsen You are right, two metrics will be pushed when that event is fired, but this is intentional. The library explicitly says that for counter metrics, the measurement is discarded (somewhere here under the "Counter" section). In your example, two metrics are registered, and both are published when the corresponding event is emitted. What is the issue that you're facing? Do you think that this behaviour should be documented better? I understand that it might be confusing.
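For context, a minimal sketch of that documented behaviour (metric and event names invented): the measurement's value plays no part for a counter, which is simply bumped once per matching event.

```elixir
# The counter listens on [:db, :query]; the 132 in the measurements map is
# discarded and the counter is incremented by 1 for this event.
Telemetry.Metrics.counter("db.query.duration")
:telemetry.execute([:db, :query], %{duration: 132}, %{})
```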
I think the tricky thing is the example given in the docs. Discarding the measurement makes sense, but initially it felt like it meant "it doesn't matter what value you use, it will be replaced with 1". Does that make sense?
Yes, I see how that might be a problem. @bryannaegele does this issue also apply to the Prometheus reporter? I'm curious what your opinion on it is.
That's a good question. Because of the way Prometheus works, I keep references to everything, and they're all registered in ETS in some fashion at some point and permanently tracked for aggregation, so the names must be unique. If you attempt to register 2 metrics with the same name, the first one wins and the rest are rejected with an error. Registering two metric definitions with different names listening on the same event is perfectly fine and should be handled correctly. I don't know the behavior of trying that across metric types. I'll have to test that, but it should be possible. For ref, the Prometheus reporter does discard the measurement and increments by 1 in the case of a counter definition. In the case of a sum definition, it does take the value of the specified measurement and adds it. Under the hood, they're both considered a counter in Prometheus.
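A hedged illustration of that counter-vs-sum distinction (names invented):

```elixir
# On the same [:worker, :job] event, a counter definition is bumped by 1 (the
# 250 below is discarded), while a sum definition adds the 250. In Prometheus
# both end up exposed as counters.
Telemetry.Metrics.counter("worker.job.count")
Telemetry.Metrics.sum("worker.job.duration")

:telemetry.execute([:worker, :job], %{duration: 250}, %{})
```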
@arkgil One thing I do slightly differently is in the registration of the handler id. In the example given by @connorjacobsen, the handler id based on your implementation would be derived from the parsed event name. In my implementation, I use the metric definition's full name, which was provided, rather than the parsed event name. I'm pretty tired, so I might be missing something, but changing the handler id generation should prevent this from occurring.
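Purely as an illustrative sketch (not the reporter's actual code), the two handler-id strategies being compared might look like this:

```elixir
metric = Telemetry.Metrics.counter("http.request.success")

# Keyed on the parsed event name: every metric sharing [:http, :request]
# collides under one handler id.
handler_id_by_event = {:statsd_reporter, metric.event_name}
# => {:statsd_reporter, [:http, :request]}

# Keyed on the full metric name: each definition gets its own handler id.
handler_id_by_name = {:statsd_reporter, metric.name}
# => {:statsd_reporter, [:http, :request, :success]}
```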
@bryannaegele in the StatsD case, I group metrics by event name, so there is always a single handler for each unique event under the reporter. But thank you, it looks like Prometheus follows the same approach, i.e. no matter whether the measurement is there or not, the counter is updated. I'm wondering what can be done to make it less confusing - it is mentioned in the docs, so maybe they could use some rewording. What do you think, @connorjacobsen?
Some rewording of the docs would probably be sufficient for now?
I recently noticed that the reporter throws away the `measurement` field of Counter metrics. This can lead to some surprising outcomes. As an illustration, consider the following metric definitions:
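The original snippet did not survive the copy here; a hypothetical reconstruction of the shape being described (metric names and tags are invented) could look like this:

```elixir
alias Telemetry.Metrics

# Two counters whose names differ only in the last segment: both default to the
# same event name [:http, :request], with measurements :success and :failure.
metrics = [
  Metrics.counter("http.request.success", tags: [:route]),
  Metrics.counter("http.request.failure", tags: [:route, :reason])
]
```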
Based on my understanding of the current implementation of the `handle_event/4` function, the following code would trigger a metric to be published for both of the above metrics (one of which would have the incorrect tags and, I think, may actually cause the handler to detach):
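Again, the original code block was lost in this copy; a hypothetical reconstruction of the triggering call (assuming the definitions sketched above) might be:

```elixir
# Only the :success measurement is present, yet with the current handler both
# counters above would be published, and the failure counter would be missing
# its :reason tag.
:telemetry.execute([:http, :request], %{success: 1}, %{route: "/checkout"})
```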
This can be avoided if you explicitly pass the `event_name` option, or if you ensure that the default `event_name` of your metric is unique (i.e. using the `count` suffix on everything would do the trick).

I am struggling to quickly reproduce this in the tests for this project (without explicitly specifying the event name), but in that case it's simply not publishing anything, so it may be a separate problem. I can revisit and try to break things properly over the weekend.
But the offending code seems to be: https://github.com/arkgil/telemetry_metrics_statsd/blob/0f79e2af612caa834ac522da2f6209cd930f5967/lib/telemetry_metrics_statsd/event_handler.ex#L88-L91
The counter here will loop through every metric with the shared prefix and publish a payload without ensuring that the metric in question matches the measurement of the fired event. Contrast this with the `Telemetry.Metrics.ConsoleReporter` module, which will return `nil` in its `fetch_measurement/2` equivalent and not publish if the `event_name` is the same but the `measurement` is different: https://github.com/beam-telemetry/telemetry_metrics/blob/v0.3.0/lib/telemetry_metrics/console_reporter.ex#L98

This behavior can be mitigated by post-fixing all counter metrics with `.count`, but that is not explicitly mentioned in the documentation (that I have been able to find).
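For what it's worth, a rough sketch of that `.count` workaround (names invented): giving each counter a unique suffix keeps the derived event names distinct.

```elixir
# event_name derives to [:http, :request, :success] and [:http, :request, :failure]
# respectively, so only the matching counter fires for a given event.
Telemetry.Metrics.counter("http.request.success.count", tags: [:route])
Telemetry.Metrics.counter("http.request.failure.count", tags: [:route, :reason])
```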
If there's something going on that I'm just completely missing, I would be very happy to be wrong here 😄

PS. If you know how to get tests to work without explicitly specifying the `event_name` option to `given_counter`, it would help me reproduce this with a test faster. My initial discovery was from a test reporter module I wrote by copy-pasting the bones of this reporter. So while I am not certain this bug exists, I'm very suspicious.