Describe the bug
If the __name__ label is changed using prometheus relabel configs, the prometheus receiver fails to populate metric metadata.
The root cause is that the prometheus receiver looks up metric metadata using the final (post-relabel) metric name, while the prometheus server updates its metadata cache before applying relabel rules, so the cache is keyed by the original metric name. The lookup therefore misses, and the metadata is left empty.
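For illustration, here is a minimal Go sketch of that mismatch. The type and function names below are made up for the example; the real prometheusreceiver code is structured differently.

```go
// Simplified, hypothetical sketch of the metadata-cache mismatch described above.
package main

import "fmt"

// metadataCache stands in for the scrape cache that the prometheus server
// populates *before* metric_relabel_configs run, so it is keyed by the
// original metric name.
type metadataCache map[string]string

// lookupMetadata models the receiver's lookup, which uses the *final*
// (post-relabel) metric name.
func lookupMetadata(cache metadataCache, finalName string) (string, bool) {
	md, ok := cache[finalName]
	return md, ok
}

func main() {
	cache := metadataCache{
		// Stored under the pre-relabel name.
		"otelcol_process_cpu_seconds": "counter: total CPU seconds",
	}

	// After the replace rule, the series is seen as process_cpu_seconds,
	// so the lookup misses and the metric descriptor stays empty.
	if _, ok := lookupMetadata(cache, "process_cpu_seconds"); !ok {
		fmt.Println("metadata not found; descriptor left empty / UNSPECIFIED")
	}
}
```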
Steps to reproduce
Run a collector with the config below.
The collector produces the logs:
```
2020-12-16T21:04:44.840Z info internal/metrics_adjuster.go:357 Adjust - skipping unexpected point {"component_kind": "receiver", "component_type": "prometheus", "component_name": "prometheus", "type": "UNSPECIFIED"}
2020-12-16T21:04:45.035Z INFO loggingexporter/logging_exporter.go:361 MetricsExporter {"#metrics": 1}
2020-12-16T21:04:45.035Z DEBUG loggingexporter/logging_exporter.go:388 ResourceMetrics #0
Resource labels:
  -> service.name: STRING(prometheusreceiver)
  -> host.name: STRING(localhost)
  -> port: STRING(8888)
  -> scheme: STRING(http)
InstrumentationLibraryMetrics #0
InstrumentationLibrary
Metric #0
Descriptor:
  -> Name:
  -> Description:
  -> Unit:
  -> DataType: None
```
Note the empty Name, Description, and Unit, and the DataType of None.
What did you expect to see?
I expected the metric to be renamed and otherwise emitted normally.
What did you see instead?
Metrics are dropped during conversion from OpenCensus format to OpenTelemetry format because the metric descriptor type is Unknown.
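As a rough sketch (not the collector's actual translation code), the conversion only handles the known descriptor types, so an Unknown/UNSPECIFIED type falls through and the point is skipped, which matches the "skipping unexpected point" log above:

```go
// Illustrative only; the real OpenCensus -> OpenTelemetry conversion differs.
package main

import "fmt"

type metricType int

const (
	typeUnspecified metricType = iota
	typeGauge
	typeCumulative
)

// convert returns false for types it does not recognize, causing the
// metric to be dropped.
func convert(t metricType) (string, bool) {
	switch t {
	case typeGauge:
		return "gauge", true
	case typeCumulative:
		return "sum", true
	default:
		// Metadata lookup failed, so the type is UNSPECIFIED and the
		// point is skipped.
		return "", false
	}
}

func main() {
	if _, ok := convert(typeUnspecified); !ok {
		fmt.Println("metric dropped: descriptor type UNSPECIFIED")
	}
}
```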
What version did you use?
Version: otel/opentelemetry-collector-contrib-dev@sha256:0dbc61590cc04678997173fb378c875e2733ff2e443d75a7957a340d4b2bb9ef (latest)
What config did you use?
Config:
```yaml
receivers:
  prometheus:
    config:
      global:
        scrape_interval: 10s
      scrape_configs:
        - job_name: 'prometheusreceiver'
          static_configs:
            - targets: [localhost:8888]
          metric_relabel_configs:
            # filter out all metrics except otelcol_process_cpu_seconds so it is easier to read logs
            - source_labels: [ __name__ ]
              regex: "otelcol_process_cpu_seconds"
              action: keep
            # rename otelcol_process_cpu_seconds to process_cpu_seconds by replacing the __name__ label
            # (illustrated by the snippet after this config)
            - source_labels: [ __name__ ]
              regex: "otelcol_(.*)"
              action: replace
              target_label: __name__

exporters:
  logging:
    logLevel: debug

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [logging]
```
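For reference, a small Go snippet approximating what the replace rule above does to the __name__ label (Prometheus anchors relabel regexes and uses "$1" as the default replacement):

```go
// Approximation of the rename performed by the replace rule in the config.
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Prometheus fully anchors relabel regexes, roughly ^(?:...)$.
	re := regexp.MustCompile("^(?:otelcol_(.*))$")
	name := "otelcol_process_cpu_seconds"
	if m := re.FindStringSubmatch(name); m != nil {
		// __name__ becomes the captured group, i.e. process_cpu_seconds.
		fmt.Println(m[1])
	}
}
```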
Additional context
I was able to determine the root cause above by adding additional debug statements and rebuilding the collector.