An AWS EKS cluster has the OpenTelemetry Collector deployed as a DaemonSet and uses the TargetAllocator to discover metrics endpoints from ServiceMonitors.
Shortened list of metrics I'm trying to ingest:
# HELP tekton_pipelines_controller_pipelinerun_duration_seconds The pipelinerun execution time in seconds
# TYPE tekton_pipelines_controller_pipelinerun_duration_seconds histogram
tekton_pipelines_controller_pipelinerun_duration_seconds_bucket{namespace="tekton-verification",pipeline="tekton-verification",status="success",le="43200"} 1
tekton_pipelines_controller_pipelinerun_duration_seconds_bucket{namespace="tekton-verification",pipeline="tekton-verification",status="success",le="86400"} 1
tekton_pipelines_controller_pipelinerun_duration_seconds_bucket{namespace="tekton-verification",pipeline="tekton-verification",status="success",le="+Inf"} 1
tekton_pipelines_controller_pipelinerun_duration_seconds_sum{namespace="tekton-verification",pipeline="tekton-verification",status="success"} 13.087762487
tekton_pipelines_controller_pipelinerun_duration_seconds_count{namespace="tekton-verification",pipeline="tekton-verification",status="success"} 1
# HELP tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds The pipelinerun's taskrun execution time in seconds
# TYPE tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds histogram
tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket{namespace="tekton-verification",pipeline="tekton-verification",status="success",task="anonymous",le="43200"} 1
tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket{namespace="tekton-verification",pipeline="tekton-verification",status="success",task="anonymous",le="86400"} 1
tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket{namespace="tekton-verification",pipeline="tekton-verification",status="success",task="anonymous",le="+Inf"} 1
tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_sum{namespace="tekton-verification",pipeline="tekton-verification",status="success",task="anonymous"} 13.06821713
tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_count{namespace="tekton-verification",pipeline="tekton-verification",status="success",task="anonymous"} 1
Service Monitor configuration used to whitelist specific metrics:
In Otel agent configuration there are no additional filters - it's just a direct passthrough from TargetAllocator created scrape_configs.
Expected Result
Both tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_sum and tekton_pipelines_controller_pipelinerun_duration_seconds_sum metrics are ingested and other metrics are discarded
Actual Result
tekton_pipelines_controller_pipelinerun_duration_seconds_sum - ingested
tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_sum - not ingested
When the regex is changed to tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_(.*), then tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_sum, _bucket, and _count are all ingested.
Using the wildcard to ingest a bit more and then additionally dropping the _bucket and _count metrics also doesn't work.
It may be related to the length of the metric name, since adding a wildcard at the end allows the metric to be ingested. An additional observation: none of the metrics with names longer than tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_sum can be whitelisted without a wildcard at the end.
First, note that you have "keep" for the action, so the other series should be discarded, and the ones that match the regex will be kept.
These are histogram metrics, so the resulting metric should be named tekton_pipelines_controller_pipelinerun_duration_seconds or tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds
If you only keep the _sum series, the collector may drop your histogram entirely, as it won't be a valid Histogram. I haven't tested it, but you might get strange behavior doing this.
It shouldn't have anything to do with the length of the regex or the length of the metric.
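To illustrate the "keep" semantics described in this comment, here is a minimal Python sketch of the relabel-stage matching only. It assumes the regex is applied to the metric name and anchored at both ends, as Prometheus relabeling does, and it approximates that with re.fullmatch. The helper keep_series, the PREFIX constant, and the pattern are hypothetical and are not the reporter's actual ServiceMonitor configuration. The sketch does not model what the receiver does with the series afterwards.

```python
import re


def keep_series(series_names, pattern):
    """Mimic a 'keep' relabel action applied to the metric name.

    Relabel regexes are fully anchored, which re.fullmatch approximates
    (assumption: the pattern only uses syntax that RE2 and Python's re
    module share).
    """
    regex = re.compile(pattern)
    return [name for name in series_names if regex.fullmatch(name)]


PREFIX = "tekton_pipelines_controller_pipelinerun_"
scraped = [
    PREFIX + "duration_seconds_bucket",
    PREFIX + "duration_seconds_sum",
    PREFIX + "duration_seconds_count",
    PREFIX + "taskrun_duration_seconds_bucket",
    PREFIX + "taskrun_duration_seconds_sum",
    PREFIX + "taskrun_duration_seconds_count",
]

# Hypothetical pattern that targets only the two _sum series.
pattern = PREFIX + "(taskrun_)?duration_seconds_sum"
kept = keep_series(scraped, pattern)
print(kept)
```

At this stage the longer taskrun name matches just like the shorter one, which is consistent with the point that name length is not the cause.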
I figured out what was different in my regex between the two histogram metrics: for tekton_pipelines_controller_pipelinerun_duration_seconds I had whitelisted both _sum and _count. When I did the same for tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds, it worked.
To sum up: to ingest a histogram metric, you need to whitelist both the _sum and _count series; otherwise the metric is rejected.
This solved my case, because the _bucket series was the real problem: it produced lots of data that was not used.
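The conclusion above can be sketched as a small Python check. It is only a model of the behaviour reported in this thread (a histogram survives when both its _sum and _count series pass the keep regex), not the receiver's actual implementation. The helpers split_family and ingestible_histograms, the BASE constant, and both patterns are hypothetical.

```python
import re
from collections import defaultdict

SUFFIXES = ("_bucket", "_sum", "_count")


def split_family(name):
    """Split a series name into (histogram family, suffix)."""
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)], suffix
    return name, ""


def ingestible_histograms(series_names, keep_pattern):
    """Return families whose _sum and _count both pass the keep regex.

    Assumption taken from this thread: without both series the
    histogram is rejected. _bucket is treated as optional here.
    """
    regex = re.compile(keep_pattern)
    kept = defaultdict(set)
    for name in series_names:
        if regex.fullmatch(name):  # relabel regexes are fully anchored
            family, suffix = split_family(name)
            kept[family].add(suffix)
    return sorted(f for f, s in kept.items() if {"_sum", "_count"} <= s)


BASE = "tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds"
scraped = [BASE + suffix for suffix in SUFFIXES]

sum_only = ingestible_histograms(scraped, BASE + "_sum")
sum_and_count = ingestible_histograms(scraped, BASE + "_(sum|count)")
print(sum_only, sum_and_count)
```

Under this model the _sum-only pattern yields no ingestible histogram, while the pattern that keeps _sum and _count yields the taskrun histogram without its _bucket series.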
Component(s)
receiver/prometheus
Collector version
otel/opentelemetry-collector-contrib:0.96.0
Environment information
Environment
Cloud, AWS EKS DaemonSet
OpenTelemetry Collector configuration
Log output
No response
Additional context
No response