
[receiver/prometheus] honor_labels set to true and scraping a prometheus pushgateway not working #33742

Closed
paebersold-tyro opened this issue Jun 25, 2024 · 6 comments
Labels
bug Something isn't working needs triage New item requiring triage receiver/prometheus Prometheus receiver

Comments

@paebersold-tyro
Contributor

Component(s)

receiver/prometheus

What happened?

Description

Scraping a Prometheus pushgateway with honor_labels: true results in a scrape failure for the endpoint. I suspect this is because the scraped metrics carry both instance and job labels (from #15239), but I would like confirmation that this is the problem. Also, is there any workaround other than setting honor_labels: false? I attempted to drop the labels with metric_relabel_configs, but that did not work.
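For context, a sketch of the kind of metric_relabel_configs label drop that was attempted (the exact config tried is not shown in the issue; this is one plausible form, and per the report it did not resolve the failure):

```yaml
# Hypothetical workaround attempt: drop the pushed instance label after
# the scrape, before ingestion. The issue reports this did not help.
metric_relabel_configs:
- action: labeldrop
  regex: instance
```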

Steps to Reproduce

Prometheus receiver config

          - job_name: test-pushgateway
            scrape_interval: 30s
            scrape_timeout: 10s
            honor_labels: true
            scheme: http
            kubernetes_sd_configs:
            - role: pod
              namespaces:
                names:
                - app-platform-monitoring
            relabel_configs:
            # and pod is running
            - source_labels: [__meta_kubernetes_pod_phase]
              regex: Running
              action: keep
            # and pod is ready
            - source_labels: [__meta_kubernetes_pod_ready]
              regex: true
              action: keep
            # and only metrics endpoints
            - source_labels: [__meta_kubernetes_pod_container_port_name]
              action: keep
              regex: metrics
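For background (standard Prometheus semantics, not specific to this receiver): with honor_labels: true, label values already present in the scraped series win over the target's own labels, which is why the pushgateway's job="cluster" and empty instance="" end up on the ingested series. With honor_labels: false, conflicting pushed labels are renamed instead:

```yaml
# With honor_labels: false, the target's job/instance labels are kept and
# conflicting pushed labels are renamed to exported_<label>, e.g.:
#   app_platform_attestation{job="test-pushgateway",
#                            instance="10.18.67.171:9091",
#                            exported_job="cluster", ...}
honor_labels: false
```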

Expected Result

Endpoint is scraped; the job and instance labels from the pushgateway are used.

Actual Result

The endpoint scrape fails (see the log output below).

Collector version

0.102.0

Environment information

Environment

OS: Kubernetes 1.29

OpenTelemetry Collector configuration

receivers:
  prometheus:
    config:
      scrape_configs:
      - job_name: test-pushgateway
        scrape_interval: 30s
        scrape_timeout: 10s
        honor_labels: true
        scheme: http
        kubernetes_sd_configs:
        - role: pod
          namespaces:
            names:
            - app-platform-monitoring
        relabel_configs:
        # keep only pods that are running
        - source_labels: [__meta_kubernetes_pod_phase]
          regex: Running
          action: keep
        # keep only pods that are ready
        - source_labels: [__meta_kubernetes_pod_ready]
          regex: true
          action: keep
        # keep only metrics endpoints
        - source_labels: [__meta_kubernetes_pod_container_port_name]
          action: keep
          regex: metrics
exporters:
  debug: {}
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: []
      exporters: [debug]

Log output

2024-06-24T06:20:36.193Z        warn    internal/transaction.go:125     Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "scrape_timestamp": 1719210036190, "target_labels": "{__name__=\"up\", instance=\"10.18.67.171:9091\", job=\"test-pushgateway\"}"}

Additional context

A sample of the metrics returned from the pushgateway:

app_platform_attestation{feature="coredns",instance="",job="cluster",team="bob",test="TestCoreDNSNameResolution"} 1
app_platform_attestation{feature="coredns",instance="",job="cluster",team="bob",test="TestIsCoreDNSDeployed"} 1
app_platform_attestation{feature="coredns",instance="",job="cluster",team="bob",test="TestIsCoreDNSServiceAvailable"} 1
push_failure_time_seconds{feature="coredns",instance="",job="cluster"} 0
push_time_seconds{feature="coredns",instance="",job="cluster"} 1.7192055849949868e+09
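The empty instance="" labels above appear to be what trips the receiver. A minimal sketch of the validation implied by the error message (a hypothetical helper written for illustration, not the receiver's actual Go code): with honor_labels: true the pushgateway's own labels win, so the receiver is left with no usable instance value.

```python
def job_and_instance(labels: dict) -> tuple:
    """Return (job, instance) from a label set, or raise if either is
    missing or empty, mirroring the receiver's error message."""
    job = labels.get("job", "")
    instance = labels.get("instance", "")
    if not job or not instance:
        raise ValueError("job or instance cannot be found from labels")
    return job, instance

# The pushed series from this issue fails the check because instance is "":
pushed = {"feature": "coredns", "instance": "", "job": "cluster"}
try:
    job_and_instance(pushed)
except ValueError as e:
    print(e)  # job or instance cannot be found from labels
```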
@paebersold-tyro added the bug and needs triage labels on Jun 25, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the receiver/prometheus Prometheus receiver label Jun 25, 2024
@dashpole
Contributor

Can you set the log level of the collector to debug to see the detailed error message for why the scrape failed?

@dashpole
Contributor

I think it should be:

service:
  telemetry:
    logs:
      level: debug

@paebersold-tyro
Contributor Author

Hello, here is the debug log output. It seems the empty instance label is the issue, as suspected.

2024-06-27T01:40:49.045Z	debug	scrape/scrape.go:1650	Unexpected error	{"kind": "receiver", "name": "prometheus", "data_type": "metrics", "scrape_pool": "test-pushgateway", "target": "http://10.18.67.95:9091/metrics", "series": "app_platform_attestation{feature=\"coredns\",instance=\"\",job=\"cluster\",team=\"bob\",test=\"TestCoreDNSNameResolution\"}", "error": "job or instance cannot be found from labels"}
2024-06-27T01:40:49.045Z	debug	scrape/scrape.go:1346	Append failed	{"kind": "receiver", "name": "prometheus", "data_type": "metrics", "scrape_pool": "test-pushgateway", "target": "http://10.18.67.95:9091/metrics", "error": "job or instance cannot be found from labels"}
2024-06-27T01:40:49.045Z	warn	internal/transaction.go:125	Failed to scrape Prometheus endpoint	{"kind": "receiver", "name": "prometheus", "data_type": "metrics", "scrape_timestamp": 1719452449041, "target_labels": "{__name__=\"up\", instance=\"10.18.67.95:9091\", job=\"test-pushgateway\"}"}

@dashpole
Contributor

This should've been fixed by #33565. Can you try upgrading to v0.103.0?

@paebersold-tyro
Contributor Author

Thank you for that, 0.103.0 fixed the issue.
