Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eagerly convert value to float in *MetricFamily.add_metric() #527

Open
mhansen opened this issue Mar 12, 2020 · 7 comments
Open

Eagerly convert value to float in *MetricFamily.add_metric() #527

mhansen opened this issue Mar 12, 2020 · 7 comments

Comments

@mhansen
Copy link

mhansen commented Mar 12, 2020

Hi there, an experience report and improvement suggestion.

I have a pretty simple exporter which queries a backend URL for JSON, grabs the JSON, stuffs it into prometheus metric families, which it then yields.

Sometimes it errors with stacktraces only when the request is finishing, with a stack traces that points at finish_request rather than the code that added the bad float:

TypeError: ("float() argument must be a string or a number, not 'NoneType'", Metric(bom_wind_speed, Wind speed (km/h) from the Bureau of Meterology, gauge, , [Sample(name='bom_wind_speed', labels={'location': 'Sydney Airport'}, value=None, timestamp=None, exemplar=None), Sample(name='bom_wind_speed', labels={'location': 'Sydney - Observatory Hill'}, value=20, timestamp=None, exemplar=None)]))
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/socketserver.py", line 625, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/local/lib/python3.5/socketserver.py", line 354, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python3.5/socketserver.py", line 681, in __init__
    self.handle()
  File "/usr/local/lib/python3.5/http/server.py", line 422, in handle
    self.handle_one_request()
  File "/usr/local/lib/python3.5/http/server.py", line 410, in handle_one_request
    method()
  File "/usr/local/lib/python3.5/site-packages/prometheus_client/exposition.py", line 152, in do_GET
    output = encoder(registry)
  File "/usr/local/lib/python3.5/site-packages/prometheus_client/openmetrics/exposition.py", line 56, in generate_latest
    floatToGoString(s.value),
  File "/usr/local/lib/python3.5/site-packages/prometheus_client/utils.py", line 8, in floatToGoString
    d = float(d)
TypeError: ("float() argument must be a string or a number, not 'NoneType'", Metric(bom_wind_speed, Wind speed (km/h) from the Bureau of Meterology, gauge, , [Sample(name='bom_wind_speed', labels={'location': 'Sydney Airport'}, value=None, timestamp=None, exemplar=None), Sample(name='bom_wind_speed', labels={'location': 'Sydney - Observatory Hill'}, value=20, timestamp=None, exemplar=None)]))

I've hit this kind of bug a few times in different exporters (I guess it's to be expected to get type errors in Python sometimes).

How about eagerly converting the value passed to add_metric to a float? Then the stack trace would point at the exact cause.

This might be a breaking change - that'd be a reasonable reason to reject this. But any code doing this will likely fail soon after, as soon as an attempt is made to serialize the metric, so maybe it'd be worth the change for better debuggability? What do you think?

@brian-brazil
Copy link
Contributor

That sounds reasonable to me. I don't see how it'd be breaking, you're only changing when we're erroring on bad data.

@mhansen
Copy link
Author

mhansen commented Mar 12, 2020 via email

@brian-brazil
Copy link
Contributor

They're useless without being output at some point, and even if it did break someone I'd be okay with that.

@mhansen
Copy link
Author

mhansen commented Mar 19, 2020 via email

@mhansen
Copy link
Author

mhansen commented Mar 19, 2020 via email

@brian-brazil
Copy link
Contributor

The quickest win is probably the add_metric function itself, but wrapping the constructor should also work.

@mhansen
Copy link
Author

mhansen commented Mar 19, 2020 via email

mhansen added a commit to mhansen/client_python that referenced this issue Mar 19, 2020
Addresses prometheus#527.

There's a test setting GaugeHistogramMetricFamily.gsum_value = None, so
I've preserved that behaviour, by not appending and crashing if gsum_value
is None.

I was a bit unsure about this bit:
        assert isinstance(exception.args[-1], core.Metric)

I'm not sure what exception.args[-1] is, the python docs for TypeError and ValueError
don't explain. I've removed the assertion.
mhansen added a commit to mhansen/client_python that referenced this issue Mar 19, 2020
…s#527.There's a test setting GaugeHistogramMetricFamily.gsum_value = None, soI've preserved that behaviour, by not appending and crashing if gsum_valueis None.I was a bit unsure about this bit:        assert isinstance(exception.args[-1], core.Metric)I'm not sure what exception.args[-1] is, the python docs for TypeError and ValueErrordon't explain. I've removed the assertion.
mhansen added a commit to mhansen/client_python that referenced this issue Mar 19, 2020
Addresses prometheus#527.
There's a test setting GaugeHistogramMetricFamily.gsum_value = None,
so I've preserved that behaviour, by not appending and crashing if
gsum_value is None.

I was a bit unsure about this bit:

assert isinstance(exception.args[-1], core.Metric)

I'm not sure what exception.args[-1] is, the python docs for
TypeError and ValueErrordon't explain. I've removed the assertion.

Signed-off-by: Mark Hansen <mark@markhansen.co.nz>
mhansen added a commit to mhansen/client_python that referenced this issue Mar 20, 2020
Addresses prometheus#527.

There's a test setting GaugeHistogramMetricFamily.gsum_value = None,
so I've preserved that behaviour, by not appending and crashing if
gsum_value is None.

I was a bit unsure about this bit:

assert isinstance(exception.args[-1], core.Metric)

I'm not sure what exception.args[-1] is, the python docs for
TypeError and ValueErrordon't explain. I've removed the assertion.

Signed-off-by: Mark Hansen <mark@markhansen.co.nz>
mhansen added a commit to mhansen/client_python that referenced this issue Mar 21, 2020
Addresses prometheus#527.

There's a test setting GaugeHistogramMetricFamily.gsum_value = None,
so I've preserved that behaviour, by not appending and crashing if
gsum_value is None.

I was a bit unsure about this bit:

assert isinstance(exception.args[-1], core.Metric)

I'm not sure what exception.args[-1] is, the python docs for
TypeError and ValueErrordon't explain. I've removed the assertion.

Signed-off-by: Mark Hansen <mark@markhansen.co.nz>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants