
Crash of otel collector when trying to use the __address__ target in the prometheus receiver with the help of the pod port annotation. #34230

Closed
lazyboson opened this issue Jul 23, 2024 · 2 comments
Labels
bug (Something isn't working), needs triage (New item requiring triage), receiver/prometheus (Prometheus receiver)

Comments

@lazyboson

Component(s)

receiver/prometheus

What happened?

Description

I am trying to deploy the otel collector in a k8s cluster (version 1.26), using the Helm chart to deploy it. I am scraping Prometheus metrics from pods that have the following annotations:

annotations:
  prometheus.io/scrape: 'true'
  prometheus.io/port: '8080'
  prometheus.io/path: '/metrics'

I am using the collector config shown in the configuration section below. The collector is crashing.

Expected Result

Collector should have been up and running.

Actual Result

The otel collector crashes.

Collector version

0.104.0

Environment information

Environment

OS: AWS Linux (Ubuntu)
Compiler: N/A

OpenTelemetry Collector configuration

config:
  receivers:
    otlp:
      protocols:
        http:
    fluentforward:
      endpoint: 0.0.0.0:8006
    prometheus:
      config:
        scrape_configs:
          - job_name: 'otel-node-exporter'
            scrape_interval: 20s
            honor_labels: true
            static_configs:
              - targets: ['${K8S_NODE_IP}:9100']
          - job_name: 'kubernetes-pods'
            scrape_interval: 20s
            kubernetes_sd_configs:
              - role: pod
            relabel_configs:
              - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
                action: keep
                regex: true
              - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
                action: replace
                target_label: __metrics_path__
                regex: (.+)
              - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
                action: replace
                regex: ([^:]+)(?::\d+)?;(\d+)
                replacement: ${1}:${2}
                target_label: __address__
              - action: labelmap
                regex: __meta_kubernetes_pod_label_(.+)
              - source_labels: [__meta_kubernetes_namespace]
                action: replace
                target_label: kubernetes_namespace
              - source_labels: [__meta_kubernetes_pod_name]
                action: replace
                target_label: kubernetes_pod_name
  exporters:
    otlphttp:
      endpoint: https://ABC@qry.gigapipe.io
      timeout: 30s
      compression: none
      encoding: proto
    loki:
      endpoint: https://ABC@qryn.gigapipe.io/loki/api/v1/push
      timeout: 30s
    prometheusremotewrite:
      endpoint: https://ABC@qryn.gigapipe.io/prom/remote/write
      timeout: 30s
  processors:
    attributes:
      actions:
        - action: insert
          key: loki.attribute.labels
          value: sender
    memory_limiter:
      check_interval: 1s
      limit_mib: 4000
      spike_limit_mib: 800
    batch:
      send_batch_max_size: 10000
      timeout: 20s
  connectors:
    servicegraph:
      latency_histogram_buckets: [ 100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms ]
      dimensions: [ cluster, namespace ]
      store:
        ttl: 2s
        max_items: 1000
      cache_loop: 2m
      store_expiration_loop: 2s
      virtual_node_peer_attributes:
        - db.name
        - rpc.service
    spanmetrics:
      namespace: traces.spanmetrics
      exemplars:
        enabled: false
      dimensions_cache_size: 1000
      aggregation_temporality: 'AGGREGATION_TEMPORALITY_CUMULATIVE'
      metrics_flush_interval: 30s
      events:
        enabled: false
  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [memory_limiter, batch]
        exporters: [otlphttp, spanmetrics, servicegraph]
      logs:
        receivers: [loki, fluentforward]
        processors: [batch]
        exporters: [loki]
      metrics:
        receivers: [prometheus, spanmetrics, servicegraph]
        processors: [batch]
        exporters: [prometheusremotewrite]

Log output

kubectl logs otelcollector-opentelemetry-collector-agent-22gd9 -n otel
Error: failed to resolve config: cannot resolve the configuration: environment variable "1" has invalid name: must match regex ^[a-zA-Z_][a-zA-Z0-9_]*$
2024/07/23 17:52:48 collector server run finished with error: failed to resolve config: cannot resolve the configuration: environment variable "1" has invalid name: must match regex ^[a-zA-Z_][a-zA-Z0-9_]*$

Additional context

No response

@lazyboson added the bug (Something isn't working) and needs triage (New item requiring triage) labels on Jul 23, 2024
@github-actions github-actions bot added the receiver/prometheus (Prometheus receiver) label on Jul 23, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@lazyboson changed the title from "Crash of otel collector when using making __address__ in prometheus receiver from pod port annotations." to "Crash of otel collector when try to use __address__ target in prometheus receiver with the help of pod port annotation." on Jul 23, 2024
@lazyboson
Author

Got it working by changing ${1} to $${1} in version 0.105.0.

- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
  action: replace
  regex: ([^:]+)(?::\d+)?;(\d+)
  replacement: $${1}:$${2}
  target_label: __address__

I didn't understand this change.
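
As far as I can tell, the error comes from the collector's configuration resolver (confmap) rather than from Prometheus itself: the resolver expands ${...} references as environment variables before the Prometheus receiver ever sees the scrape config, so a bare ${1} is read as a reference to an environment variable named "1", which fails the name check shown in the log. Writing $$ escapes the dollar sign, so after resolution Prometheus receives a literal ${1}:${2} and treats them as regex capture-group references. A minimal sketch of the fixed rule with that behaviour spelled out in comments (same names as in the config above, assuming a collector version where ${...} expansion is applied to the Prometheus receiver config):

```
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: kubernetes-pods
          kubernetes_sd_configs:
            - role: pod
          relabel_configs:
            - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
              action: replace
              regex: ([^:]+)(?::\d+)?;(\d+)
              # ${1} would be expanded at config-load time as an env var named "1" and rejected;
              # $${1} resolves to a literal ${1}, which Prometheus expands at relabel time
              # as the first capture group of the regex above.
              replacement: $${1}:$${2}
              target_label: __address__
```

By contrast, the ${K8S_NODE_IP} target in the otel-node-exporter job is a valid environment variable name, so the resolver expands it there as intended.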
