regex wildcards via target allocator not matching #3051

Closed
fredrikgh opened this issue Jun 18, 2024 · 1 comment
Labels: bug (Something isn't working), needs triage

Comments

@fredrikgh

Component(s)

collector, target allocator

What happened?

Description

There appears to be a discrepancy between how the otel-collector ultimately filters Prometheus metrics based on a ServiceMonitor via the Target Allocator (TA) and how a vanilla Prometheus instance does. Truthfully, I don't know whether this issue belongs here or in the collector; more on that in the additional context below.

Steps to Reproduce

  1. Set up a workload to monitor, in my case Mimir.
  2. Set up a simple ServiceMonitor and specify a drop action using wildcards, e.g.:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mimir-service-monitor
spec:
  endpoints:
    - interval: 30s
      port: http
      path: /metrics
      metricRelabelings:
        - action: drop
          sourceLabels: [__name__]
          regex: .*bucket.*
  selector:
    matchLabels:
      app.kubernetes.io/name: mimir
  3. Set up two Prometheus instances (a sketch of B follows this list):
    A: written to by an otel-collector that fetches targets from the above ServiceMonitor via the TA.
    B: fetching targets from the ServiceMonitor through the serviceMonitorSelector field of the Prometheus CRD, with remoteWriteReceiver disabled.
  4. Open the Prometheus UI of both and search for cortex_bucket.
  5. Compare the results.
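For instance B, a minimal sketch of a Prometheus CR selecting the ServiceMonitor directly; the resource name and the selector label are illustrative and assume the ServiceMonitor carries a matching label:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus-b                    # illustrative name
spec:
  serviceMonitorSelector:               # selects ServiceMonitors by their labels
    matchLabels:
      team: monitoring                  # assumes the ServiceMonitor is labelled team: monitoring
  serviceMonitorNamespaceSelector: {}   # match ServiceMonitors in any namespace
  # enableRemoteWriteReceiver is left at its default (false)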

Expected Result

[Screenshot: Prometheus UI query for cortex_bucket returning no results]

This is the result I'm getting from B, i.e. no metrics containing bucket were fetched.

Actual Result

[Screenshot: Prometheus UI query for cortex_bucket returning matching series]

This is the result I'm getting from A, i.e. all metrics were fetched.

Kubernetes Version

v1.29.2

Operator version

v0.102.0

Collector version

v0.102.1

Environment information

Environment

OS: (e.g., "Ubuntu 22.04")

Log output

No response

Additional context

Some additional notes:

  1. The TA appears to respond with the correct values on /scrape_configs, so whatever goes wrong may happen after that point.

[Screenshot: the TA's /scrape_configs response]

  2. We see other regex wildcards working, which is especially odd; e.g. this correctly keeps kube_daemonset.* metrics (see the sketch after these notes):

[Screenshot: metricRelabelings keep rule for kube_daemonset.*]
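The keep rule referenced in note 2 has roughly this shape (an illustrative reconstruction; the exact rule in the screenshot may differ):

metricRelabelings:
  - action: keep
    sourceLabels: [__name__]
    regex: kube_daemonset.*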

fredrikgh added the bug and needs triage labels on Jun 18, 2024
@fredrikgh (Author)

Okay, this was quite elusive, but it comes down to what Prometheus and OTel consider a metric.

In Prometheus, you can filter the metric cortex_query_frontend_retries_bucket by that name.

In OTel, cortex_query_frontend_retries_bucket is a datapoint of the metric cortex_query_frontend_retries, which explains why bucket, specifically, didn't work. I found this by reading this thread. This is pretty flawed, since there's no reliable way of removing these metrics (although you can filter by type in OTel itself; see the sketch below).
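A minimal sketch of that type-based filtering, using the collector's filter processor with OTTL conditions; the processor name, the regex, and the pipeline component names are illustrative, not taken from this issue:

processors:
  filter/drop_histograms:
    error_mode: ignore
    metrics:
      metric:
        # drops the whole histogram metric, buckets and counts included
        - 'type == METRIC_DATA_TYPE_HISTOGRAM and IsMatch(name, "cortex_query_frontend_retries")'
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [filter/drop_histograms]
      exporters: [prometheusremotewrite]   # illustrative exporter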
