Skip to content

[BUG] Logs unable to reach OpenSearch #5275

Closed
@zandernelson

Description

Bug Report

Describe the bug
We have Fluentbit sidecars, the logs are unable to reach OpenSearch. We see no errors in Fluentbit logs. We do not understand what is happening because we see no errors in the Fluentbit container logs.

Fluentbit Sidecar Log Message:

Fluent Bit v1.9.9
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/11/16 00:34:42] [ info] Configuration:
[2022/11/16 00:34:42] [ info]  flush time     | 1.000000 seconds
[2022/11/16 00:34:42] [ info]  grace          | 5 seconds
[2022/11/16 00:34:42] [ info]  daemon         | 0
[2022/11/16 00:34:42] [ info] ___________
[2022/11/16 00:34:42] [ info]  inputs:
[2022/11/16 00:34:42] [ info]      tail
[2022/11/16 00:34:42] [ info]      tail
[2022/11/16 00:34:42] [ info] ___________
[2022/11/16 00:34:42] [ info]  filters:
[2022/11/16 00:34:42] [ info]      modify.0
[2022/11/16 00:34:42] [ info]      modify.1
[2022/11/16 00:34:42] [ info]      modify.2
[2022/11/16 00:34:42] [ info] ___________
[2022/11/16 00:34:42] [ info]  outputs:
[2022/11/16 00:34:42] [ info]      es.0
[2022/11/16 00:34:42] [ info]      splunk.1
[2022/11/16 00:34:42] [ info] ___________
[2022/11/16 00:34:42] [ info]  collectors:
[2022/11/16 00:34:42] [ info] [fluent bit] version=1.9.9, commit=, pid=89
[2022/11/16 00:34:42] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2022/11/16 00:34:42] [ info] [storage] version=1.3.0, type=memory-only, sync=normal, checksum=disabled, max_chunks_up=128
[2022/11/16 00:34:42] [ info] [cmetrics] version=0.3.7
[2022/11/16 00:34:42] [debug] [tail:tail.0] created event channels: read=21 write=22
[2022/11/16 00:34:42] [debug] [input:tail:tail.0] flb_tail_fs_inotify_init() initializing inotify tail input
[2022/11/16 00:34:42] [debug] [input:tail:tail.0] inotify watch fd=27
[2022/11/16 00:34:42] [debug] [input:tail:tail.0] scanning path /var/log/iserver/ServerAudit*.log
[2022/11/16 00:34:42] [debug] [input:tail:tail.0] cannot read info from: /var/log/iserver/ServerAudit*.log
[2022/11/16 00:34:42] [debug] [input:tail:tail.0] 0 new files found on path '/var/log/iserver/ServerAudit*.log'
[2022/11/16 00:34:42] [debug] [tail:tail.1] created event channels: read=28 write=29
[2022/11/16 00:34:42] [debug] [input:tail:tail.1] flb_tail_fs_inotify_init() initializing inotify tail input
[2022/11/16 00:34:42] [debug] [input:tail:tail.1] inotify watch fd=34
[2022/11/16 00:34:42] [debug] [input:tail:tail.1] scanning path /var/log/iserver/DSSErrors*.log
[2022/11/16 00:34:42] [debug] [input:tail:tail.1] cannot read info from: /var/log/iserver/DSSErrors*.log
[2022/11/16 00:34:42] [debug] [input:tail:tail.1] 0 new files found on path '/var/log/iserver/DSSErrors*.log'
[2022/11/16 00:34:42] [debug] [filter:modify:modify.0] Initialized modify filter with 0 conditions and 3 rules
[2022/11/16 00:34:42] [debug] [filter:modify:modify.1] Initialized modify filter with 0 conditions and 1 rules
[2022/11/16 00:34:42] [debug] [filter:modify:modify.2] Initialized modify filter with 0 conditions and 1 rules
[2022/11/16 00:34:42] [debug] [es:es.0] created event channels: read=35 write=36
[2022/11/16 00:34:42] [debug] [output:es:es.0] host=vpc-elasticsearchcl-redacted.es.amazonaws.com port=443 uri=/_bulk index=iserver-logs type=doc
[2022/11/16 00:34:42] [debug] [splunk:splunk.1] created event channels: read=47 write=48
[2022/11/16 00:34:42] [ info] [output:es:es.0] worker #0 started
[2022/11/16 00:34:42] [ info] [output:splunk:splunk.1] worker #1 started
[2022/11/16 00:34:42] [ info] [output:splunk:splunk.1] worker #0 started
[2022/11/16 00:34:42] [ info] [output:es:es.0] worker #1 started
[2022/11/16 00:34:42] [debug] [router] match rule tail.0:es.0
[2022/11/16 00:34:42] [debug] [router] match rule tail.0:splunk.1
[2022/11/16 00:34:42] [debug] [router] match rule tail.1:es.0
[2022/11/16 00:34:42] [ info] [sp] stream processor started
[2022/11/16 00:34:42] [debug] [input:tail:tail.0] [static files] processed 0b, done
[2022/11/16 00:34:42] [debug] [input:tail:tail.1] [static files] processed 0b, done
[2022/11/16 00:35:42] [debug] [input:tail:tail.0] scanning path /var/log/iserver/ServerAudit*.log
[2022/11/16 00:35:42] [debug] [input:tail:tail.0] inode=5342925928760031785 with offset=0 appended as /var/log/iserver/ServerAudit-2022-11-16.log
[2022/11/16 00:35:42] [debug] [input:tail:tail.0] scan_glob add(): /var/log/iserver/ServerAudit-2022-11-16.log, inode 5342925928760031785
[2022/11/16 00:35:42] [debug] [input:tail:tail.0] 1 new files found on path '/var/log/iserver/ServerAudit*.log'
[2022/11/16 00:35:42] [debug] [input:tail:tail.1] scanning path /var/log/iserver/DSSErrors*.log
[2022/11/16 00:35:42] [debug] [input:tail:tail.1] inode=10747040000057398530 with offset=0 appended as /var/log/iserver/DSSErrors.log
[2022/11/16 00:35:42] [debug] [input:tail:tail.1] scan_glob add(): /var/log/iserver/DSSErrors.log, inode -7699704073652153086
[2022/11/16 00:35:42] [debug] [input:tail:tail.1] 1 new files found on path '/var/log/iserver/DSSErrors*.log'
[2022/11/16 00:35:42] [debug] [input chunk] update output instances with new chunk size diff=10262
[2022/11/16 00:35:42] [debug] [input:tail:tail.0] [static files] processed 6.8K
[2022/11/16 00:35:42] [debug] [input:tail:tail.0] inode=5342925928760031785 file=/var/log/iserver/ServerAudit-2022-11-16.log promote to TAIL_EVENT
[2022/11/16 00:35:42] [ info] [input:tail:tail.0] inotify_fs_add(): inode=5342925928760031785 watch_fd=1 name=/var/log/iserver/ServerAudit-2022-11-16.log
[2022/11/16 00:35:42] [debug] [input:tail:tail.0] [static files] processed 0b, done
[2022/11/16 00:35:42] [debug] [input chunk] update output instances with new chunk size diff=53586
[2022/11/16 00:35:42] [debug] [input:tail:tail.1] [static files] processed 31.8K
[2022/11/16 00:35:42] [debug] [input chunk] update output instances with new chunk size diff=53269
[2022/11/16 00:35:42] [debug] [input:tail:tail.1] [static files] processed 31.9K
[2022/11/16 00:35:42] [debug] [input chunk] update output instances with new chunk size diff=3941
[2022/11/16 00:35:42] [debug] [input:tail:tail.1] [static files] processed 2.5K
[2022/11/16 00:35:42] [debug] [input:tail:tail.1] inode=10747040000057398530 file=/var/log/iserver/DSSErrors.log promote to TAIL_EVENT
[2022/11/16 00:35:42] [ info] [input:tail:tail.1] inotify_fs_add(): inode=10747040000057398530 watch_fd=1 name=/var/log/iserver/DSSErrors.log
[2022/11/16 00:35:42] [debug] [input:tail:tail.1] [static files] processed 0b, done
[2022/11/16 00:35:43] [debug] [task] created task=0x7fddc8e42d20 id=0 OK
[2022/11/16 00:35:43] [debug] [output:es:es.0] task_id=0 assigned to thread #0
[2022/11/16 00:35:43] [debug] [output:splunk:splunk.1] task_id=0 assigned to thread #0
[2022/11/16 00:35:43] [debug] [task] created task=0x7fddc8e42d90 id=1 OK
[2022/11/16 00:35:43] [debug] [output:es:es.0] task_id=1 assigned to thread #1
[2022/11/16 00:35:43] [debug] [http_client] not using http_proxy for header
[2022/11/16 00:35:43] [debug] [http_client] not using http_proxy for header
[2022/11/16 00:35:43] [debug] [upstream] KA connection #86 to ip-10-0-0-0.redacted.compute.internal:8088 is now available
[2022/11/16 00:35:43] [debug] [out flush] cb_destroy coro_id=0
[2022/11/16 00:35:43] [debug] [http_client] not using http_proxy for header
[2022/11/16 00:35:43] [debug] [output:es:es.0] HTTP Status=200 URI=/_bulk
[2022/11/16 00:35:43] [debug] [upstream] KA connection #87 to vpc-elasticsearchcl-redacted.es.amazonaws.com:443 is now available
[2022/11/16 00:35:43] [debug] [out flush] cb_destroy coro_id=0
[2022/11/16 00:35:43] [debug] [retry] new retry created for task_id=0 attempts=1
[2022/11/16 00:35:43] [ warn] [engine] failed to flush chunk '89-1668558942.65916959.flb', retry in 11 seconds: task_id=0, input=tail.0 > output=es.0 (out_id=0)
[2022/11/16 00:35:43] [debug] [output:es:es.0] HTTP Status=200 URI=/_bulk
[2022/11/16 00:35:43] [debug] [upstream] KA connection #88 to vpc-elasticsearchcl-redacted.es.amazonaws.com:443 is now available
[2022/11/16 00:35:43] [debug] [out flush] cb_destroy coro_id=0
[2022/11/16 00:35:43] [debug] [retry] new retry created for task_id=1 attempts=1
[2022/11/16 00:35:43] [ warn] [engine] failed to flush chunk '89-1668558942.70974100.flb', retry in 7 seconds: task_id=1, input=tail.1 > output=es.0 (out_id=0)
[2022/11/16 00:35:50] [debug] [output:es:es.0] task_id=1 assigned to thread #0
[2022/11/16 00:35:50] [debug] [upstream] KA connection #87 to vpc-elasticsearchcl-redacted.es.amazonaws.com:443 has been assigned (recycled)
[2022/11/16 00:35:50] [debug] [http_client] not using http_proxy for header
[2022/11/16 00:35:50] [debug] [output:es:es.0] HTTP Status=200 URI=/_bulk
[2022/11/16 00:35:50] [debug] [upstream] KA connection #87 to vpc-elasticsearchcl-redacted.es.amazonaws.com:443 is now available
[2022/11/16 00:35:50] [debug] [out flush] cb_destroy coro_id=1
[2022/11/16 00:35:50] [debug] [task] task_id=1 reached retry-attempts limit 1/1
[2022/11/16 00:35:50] [ warn] [engine] chunk '89-1668558942.70974100.flb' cannot be retried: task_id=1, input=tail.1 > output=es.0
[2022/11/16 00:35:50] [debug] [task] destroy task=0x7fddc8e42d90 (task_id=1)
[2022/11/16 00:35:54] [debug] [output:es:es.0] task_id=0 assigned to thread #1
[2022/11/16 00:35:54] [debug] [upstream] KA connection #88 to vpc-elasticsearchcl-redacted.es.amazonaws.com:443 has been assigned (recycled)
[2022/11/16 00:35:54] [debug] [http_client] not using http_proxy for header
[2022/11/16 00:35:54] [debug] [output:es:es.0] HTTP Status=200 URI=/_bulk
[2022/11/16 00:35:54] [debug] [upstream] KA connection #88 to vpc-elasticsearchcl-redacted.es.amazonaws.com:443 is now available
[2022/11/16 00:35:54] [debug] [out flush] cb_destroy coro_id=1
[2022/11/16 00:35:54] [debug] [task] task_id=0 reached retry-attempts limit 1/1
[2022/11/16 00:35:54] [ warn] [engine] chunk '89-1668558942.65916959.flb' cannot be retried: task_id=0, input=tail.0 > output=es.0
[2022/11/16 00:35:54] [debug] [task] destroy task=0x7fddc8e42d20 (task_id=0)
[2022/11/16 00:36:13] [debug] [upstream] drop keepalive connection #-1 to ip-10-0-0-0.redacted.compute.internal:8088 (keepalive idle timeout)
[2022/11/16 00:36:13] [debug] [socket] could not validate socket status for #86 (don't worry)
[2022/11/16 00:36:20] [debug] [upstream] drop keepalive connection #-1 to vpc-elasticsearchcl-redacted.es.amazonaws.com:443 (keepalive idle timeout)
[2022/11/16 00:36:20] [debug] [socket] could not validate socket status for #87 (don't worry)
[2022/11/16 00:36:25] [debug] [upstream] drop keepalive connection #-1 to vpc-elasticsearchcl-redacted.es.amazonaws.com:443 (keepalive idle timeout)
[2022/11/16 00:36:25] [debug] [socket] could not validate socket status for #88 (don't worry)

Fluentbit Configmap:

apiVersion: v1
metadata:
  name: iserver-fluentbit
data:
  fluent-bit.conf: """
  
    [SERVICE]
        Flush         1
        Log_Level     debug
        Daemon        off
        Parsers_File  parsers.conf

    [INPUT]
        Name tail
        Tag  iserver.audit.log
        Multiline Off
        Path /var/log/iserver/ServerAudit*.log

    [INPUT]
        Name tail
        Tag  iserver.dsserrors.log
        Multiline Off
        Path /var/log/iserver/DSSErrors*.log

    [FILTER]
        Name      modify
        Match     *
        Add       POD_NAME      {{ POD NAME }}
        Add       ENV_NAME      {{ RELEASE NAME }}
        Add       CLUSTER_NAME  {{ CLUSTER NAME }}

    [FILTER]
        Name modify
        Match iserver.dsserrors.log       
        Add    LOGNAME      dsserror

    [OUTPUT]
        Name    es
        Match   iserver.*.*
        Host    {{ OPENSEARCH HOST }}
        Port    {{ OPENSEARCH PORT }}
        Logstash_Format  True
        Logstash_Prefix  iserver-logs
        Index   iserver-logs
        Type    doc
        AWS_Auth Off
        AWS_Region {{ AWS REGION }}
        tls     On
        tls.verify Off

  """

  parsers.conf: |2

    [PARSER]
        Name   apache2
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>.*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache_error
        Format regex
        Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

    [PARSER]
        Name   nginx
        Format regex
        Regex  ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On

    [PARSER]
        Name        syslog
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S

    [PARSER]
        Name    kube-custom
        Format  regex
        Regex   (?<tag>[^.]+)?\.?(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$

    [PARSER]
        Name dsserror
        Format  regex
        Time_key  time
        Regex  ^(?<time>[^\[]+)\s(\[HOST:(?<host>[^\]]+)\]\[SERVER:(?<server>[^\]]+)\]\[PID:(?<processid>[^\]]+)\]\[THR:(?<threadid>[^\]]+)\]\[([^\]]+)\]\[([^\]]+)\](\[([^\]]+)\])?\[UID:([^\]]+)\]\[SID:([^\]]+)\]\[OID:([^\]]+)\])\s*(?<message>.*)\s*(?<message>.*)\s*(?<message>.*)\s*(?<message>.*)\s*(?<message>.*)\s*(?<message>.*)\s*(?<message>.*)$
        Time_Format %Y-%m-%d %H:%M:%S.%L
        Time_Keep On

Expected behavior
Ability to send logs to OpenSearch, errors produced when unable to send logs to OpenSearch

Your Environment

  • Version used: fluent-bit:1.9.9
  • Configuration: config map provided
  • Environment name and version: Kubernetes 1.22

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions