[WIP] Update prometheus client to 0.6.0, handle counter metric name change #3443
What does this PR do?
Just as summary and histogram types have their _sum/_bucket/_count timeseries aggregated into a single metric under the trimmed name, versions 0.4.0+ of the Python prometheus/openmetrics client trim the _total suffix from counter metric names.
This is a breaking change for us: integrations and custom checks would have to change their configuration to refer to counter metrics by the new, shorter metric name.
This PR introduces a compatibility layer to make this transition transparent. When looking up a metric name in one of the scraper_config maps or lists, the following happens:
[...] _total suffix to the metric name.
If we go this route, the prometheus mixin should also be updated, as it uses the same client library.
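The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name lookup_metric, its signature, and the sample metrics_mapper contents are hypothetical, and it assumes the fallback is to retry counters with the _total suffix appended when the trimmed name is not found.

```python
def lookup_metric(name, metric_type, mapping):
    """Hypothetical helper: find `name` in a scraper_config-style mapping.

    Assumption: client versions 0.4.0+ report counters without the
    `_total` suffix, while existing configurations still use the
    suffixed name.
    """
    # First lookup: the name exactly as reported by the client.
    if name in mapping:
        return mapping[name]
    # Second lookup, counters only: retry with the legacy suffixed name.
    if metric_type == "counter":
        return mapping.get(name + "_total")
    return None


# Existing configuration written against the old, suffixed counter name.
metrics_mapper = {"http_requests_total": "http.requests"}

# The new client reports the counter as "http_requests".
matched = lookup_metric("http_requests", "counter", metrics_mapper)
# A gauge with the same trimmed name is not retried with the suffix.
unmatched = lookup_metric("http_requests", "gauge", metrics_mapper)
```

Because the metric type is known at lookup time, the second lookup only runs for counters, which is what keeps non-counter metrics from being matched by accident.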
Alternative route
We could patch create_scraper_configuration to scan config['label_joins'], config['metrics_mapper'], config['ignore_metrics'] and metrics_mapper, duplicating every entry that contains a _total suffix into a new entry without it.
This would avoid the double-lookup cost. However, since we do not know at that point whether a given metric is a counter, we might create false positives by matching non-counters that have a _total suffix (example: the kubelet's container_network_tcp_usage_total is a gauge).
Other changes:
Motivation
What inspired you to submit this pull request?
Additional Notes
Anything else we should know when reviewing?
Review checklist (to be filled by reviewers)
changelog/ and integration/ labels attached