
PrometheusDuplicateTimestamps errors with log_to_metrics filter starting in fluent-bit 3.1.5 #9413

reneeckstein opened this issue Sep 23, 2024 · 2 comments

Bug Report

Describe the bug
After upgrading fluent-bit from 3.1.4 to 3.1.5, all our k8s clusters started reporting PrometheusDuplicateTimestamps alerts:
the Prometheus metric rate(prometheus_target_scrapes_sample_duplicate_timestamp_total[5m]) is above 0 and keeps increasing.
Prometheus is logging a lot of warnings like this:

ts=2024-09-23T16:22:32.820Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/platform-logging/fluent-bit/1 target=http://10.67.3.197:2021/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=32
ts=2024-09-23T16:22:38.237Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/platform-logging/fluent-bit/1 target=http://10.67.4.81:2021/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=1876
ts=2024-09-23T16:22:39.697Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/platform-logging/fluent-bit/1 target=http://10.67.13.208:2021/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=4
ts=2024-09-23T16:22:41.643Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/platform-logging/fluent-bit/1 target=http://10.67.3.110:2021/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=7
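
For context, the alert that fires here is the stock PrometheusDuplicateTimestamps rule shipped with kube-prometheus / kubernetes-mixin. A minimal sketch of an equivalent rule (the job selector, for duration, and severity are assumptions and vary per install):

groups:
  - name: prometheus
    rules:
      - alert: PrometheusDuplicateTimestamps
        # fires while Prometheus keeps dropping samples that share a timestamp
        # but carry different values for the same series
        expr: rate(prometheus_target_scrapes_sample_duplicate_timestamp_total{job="prometheus"}[5m]) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: Prometheus is dropping samples with duplicate timestamps.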

To Reproduce

  • Steps to reproduce the problem:
    • Deploy fluent-bit 3.1.4 as a DaemonSet into a k8s cluster with the tail input config below and confirm that container logs and metrics show up as expected.
    • Update the fluent-bit image to 3.1.5 (or newer, up to and including 3.1.8) and inspect the /metrics endpoint on port 2021 (a quick duplicate check is sketched below).
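
A quick way to confirm the duplicates is to scrape the log_to_metrics exporter once and look for the same series (metric name plus label set) appearing more than once. A rough sketch, assuming the pod is reachable on port 2021 (127.0.0.1 below is a placeholder, e.g. via kubectl port-forward) and that label values contain no spaces:

# 127.0.0.1:2021 is a placeholder; reach the pod e.g. via kubectl port-forward <pod> 2021:2021
curl -s http://127.0.0.1:2021/metrics \
  | grep -v '^#' \
  | sed 's/} .*/}/' \
  | sort | uniq -cd
# any output means the same metric name + label set occurs more than once in a single scrape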

Expected behavior
No duplicate metrics on the additional /metrics endpoint exposed by the log_to_metrics feature (usually on port 2021), no warnings in the Prometheus logs, and no PrometheusDuplicateTimestamps alerts.


Your Environment

  • Version used: 3.1.5 and newer; tested up to 3.1.8, and the issue is still present.
  • Configuration: (Helm chart values)
serviceMonitor:
  enabled: true
  interval: 10s
  scrapeTimeout: 10s
  additionalEndpoints:
  - port: log-metrics
    path: /metrics
    interval: 10s
    scrapeTimeout: 10s

extraPorts:
  - port: 2021
    containerPort: 2021
    protocol: TCP
    name: log-metrics

config:
  service: |
    [SERVICE]
        Flush 1
        Daemon Off
        Log_Level info
        Parsers_File parsers.conf
        Parsers_File custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port {{ .Values.service.port }}

  inputs: |
    [INPUT]
        Name tail
        Tag kube.*
        Alias tail_container_logs
        Path /var/log/containers/*.log
        multiline.parser docker, cri
        DB /var/log/flb_kube.db
        DB.locking true
        Mem_Buf_Limit 32MB
        Skip_Long_Lines On

  filters: |
    [FILTER]
        Name kubernetes
        Alias kubernetes_all
        Match kube.*
        Merge_Log On
        Keep_Log Off
        K8S-Logging.Parser On
        K8S-Logging.Exclude On
        Annotations Off
        Buffer_Size 1MB
        Use_Kubelet true

    [FILTER]
        name               log_to_metrics
        match              kube.*
        tag                log_counter_metric
        metric_mode        counter
        metric_name        kubernetes_messages
        metric_description This metric counts Kubernetes messages
        kubernetes_mode    true

  outputs: |
    [OUTPUT]
        name               prometheus_exporter
        match              log_counter_metric
        host               0.0.0.0
        port               2021
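
With this configuration the exporter on port 2021 should expose each log_to_metrics series exactly once per scrape; the Prometheus warnings above indicate that the same series is exposed more than once with different values. Purely as a hypothetical illustration (the log_metric_counter_ prefix and the label set are assumptions derived from metric_name and kubernetes_mode, not copied from a real scrape):

# expected: one sample per label set in a scrape
log_metric_counter_kubernetes_messages{namespace_name="default",pod_name="app-abc",container_name="app"} 42

# observed symptom: the same series repeated with a different value in the same scrape,
# which Prometheus rejects as "different value but same timestamp"
log_metric_counter_kubernetes_messages{namespace_name="default",pod_name="app-abc",container_name="app"} 42
log_metric_counter_kubernetes_messages{namespace_name="default",pod_name="app-abc",container_name="app"} 57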

  • Environment name and version (e.g. Kubernetes? What version?):
    • EKS; Kubernetes 1.30
  • Server type and version:
  • Operating System and version:
    • EKS on Bottlerocket OS 1.22.0 (aws-k8s-1.30) Kernel version 6.1.106 containerd://1.7.20+bottlerocket
  • Filters and plugins:
    • kubernetes, log_to_metrics

Additional context
It is very annoying that every k8s cluster with this common configuration reports PrometheusDuplicateTimestamps alerts.

edsiper (Member) commented Sep 26, 2024

@reneeckstein are you facing the same issue with v3.1.8? (We have some fixes in place for a similar problem.)

reneeckstein (Author) commented

@edsiper Yes, we are facing the same issue with fluent-bit v3.1.8. I'm looking forward to v3.1.9; I noticed two metrics-related commits on the master branch.
