Unable to install Fluent Bit 3.2.2: error "cannot open chunk: etc/machine-id" #9801
Description
Bug Report
Describe the bug
After upgrading the Fluent Bit image from version 2.2.2 to 3.2.2, Fluent Bit fails at startup with a "Permission denied" error (errno=13) while opening storage backlog chunks. The engine then cannot segregate the backlog chunks, pauses every input plugin, and log processing stops.
Below are the error messages observed during runtime:
[2025/01/06 11:29:14] [ info] [input:storage_backlog:storage_backlog.7] register tail.2/1-1733390141.793171810.flb
[/src/fluent-bit/lib/chunkio/src/cio_file_unix.c:410 errno=13] Permission denied
[2025/01/06 11:29:14] [ info] [input:storage_backlog:storage_backlog.7] register tail.2/1-1733390437.65627129.flb
[2025/01/06 11:29:14] [error] [storage] [cio file] cannot open chunk: etc/machine-id
[2025/01/06 11:29:14] [error] [engine] could not segregate backlog chunks
[2025/01/06 11:29:14] [ info] [input] pausing container_logs
[2025/01/06 11:29:14] [ info] [input] pausing audit_logs
[2025/01/06 11:29:14] [ info] [input] pausing core_kubernetes_logs
[2025/01/06 11:29:14] [ info] [input] pausing core_kubernetes_logs
[2025/01/06 11:29:14] [ info] [input] pausing systemd.4
[2025/01/06 11:29:14] [ info] [input] pausing systemd.5
[2025/01/06 11:29:14] [ info] [input] pausing fluentbit_metrics.6
[2025/01/06 11:29:14] [ info] [input] pausing storage_backlog.7
[2025/01/06 11:29:14] [ info] [input] pausing emitter_for_log_to_metrics.0
[2025/01/06 11:29:14] [ info] [input] pausing emitter_for_log_to_metrics.1
[2025/01/06 11:29:14] [ info] [output:cloudwatch_logs:cloudwatch_logs.0] thread worker #0 stopping...
Key indicators of failure:
- Fluent Bit reports Permission denied while attempting to access chunk files.
- Logs indicate a failure in segregating storage backlog chunks.
- As a result, Fluent Bit pauses all input plugins (e.g., container_logs, audit_logs, systemd).
- The cloudwatch_logs output worker then stops, so no logs are delivered.
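
The indicators above can be pulled out of a log excerpt mechanically. The following is a minimal Python sketch, not part of Fluent Bit: the `summarize` helper and its regular expressions are assumptions based only on the line format visible in the log excerpt in this report.

```python
import errno
import re

# A few lines copied from the log excerpt in this report.
SAMPLE_LOG = """\
[2025/01/06 11:29:14] [ info] [input:storage_backlog:storage_backlog.7] register tail.2/1-1733390141.793171810.flb
[/src/fluent-bit/lib/chunkio/src/cio_file_unix.c:410 errno=13] Permission denied
[2025/01/06 11:29:14] [error] [storage] [cio file] cannot open chunk: etc/machine-id
[2025/01/06 11:29:14] [error] [engine] could not segregate backlog chunks
[2025/01/06 11:29:14] [ info] [input] pausing container_logs
[2025/01/06 11:29:14] [ info] [input] pausing systemd.4
"""

# "[timestamp] [level] message" as seen in the excerpt; the level may be padded ("[ info]").
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[\s*(?P<level>\w+)\] (?P<msg>.*)$")
PAUSE_RE = re.compile(r"^\[input\] pausing (?P<name>\S+)$")
# The chunkio line has the form "[file:line errno=N] text" and carries no level field.
ERRNO_RE = re.compile(r"errno=(?P<code>\d+)\]")


def summarize(log_text):
    """Collect error messages, paused inputs, and errno codes (hypothetical helper)."""
    summary = {"errors": [], "paused": [], "errnos": []}
    for line in log_text.splitlines():
        errno_match = ERRNO_RE.search(line)
        if errno_match:
            summary["errnos"].append(int(errno_match.group("code")))
            continue
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a "[timestamp] [level] message" line
        if m.group("level") == "error":
            summary["errors"].append(m.group("msg"))
        pause = PAUSE_RE.match(m.group("msg"))
        if pause:
            summary["paused"].append(pause.group("name"))
    return summary


if __name__ == "__main__":
    result = summarize(SAMPLE_LOG)
    for code in result["errnos"]:
        # Map the numeric errno to its symbolic name on this platform.
        print(code, errno.errorcode.get(code, "unknown"))
    print(result["errors"])
    print(result["paused"])
```

Running the same helper over the full startup log would list every paused input next to the two `[error]` lines that precede them.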
To Reproduce
- Use Fluent Bit version 2.2.2 and ensure log ingestion is functioning correctly.
- Upgrade to Fluent Bit version 3.2.2 (or any 3.1.* version).
- Observe the logs during startup.
- Check for the error messages related to Permission denied and input plugin pausing.
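
Because the failure is errno=13 on a chunk file, one check that can accompany the steps above is whether the user the 3.2.2 container runs as can open the existing `.flb` chunks. A minimal Python sketch follows. The storage directory is an assumption, since the `[SERVICE]` section (and its `storage.path`) is not included in this report, and `blocked_chunks` is a hypothetical helper, not a Fluent Bit tool.

```python
import os
import sys


def blocked_chunks(storage_dir):
    """Return *.flb files under storage_dir that the current user cannot read and write."""
    blocked = []
    for root, _dirs, files in os.walk(storage_dir):
        for name in files:
            if not name.endswith(".flb"):
                continue  # only chunk files such as tail.2/1-1733390141.793171810.flb
            path = os.path.join(root, name)
            # os.access evaluates permissions for the real uid/gid of this process.
            if not os.access(path, os.R_OK | os.W_OK):
                blocked.append(path)
    return sorted(blocked)


if __name__ == "__main__":
    # Pass the directory configured as storage.path (not shown in this report).
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in blocked_chunks(target):
        print("cannot open:", path)
```

The script only reports something useful when run as the same user (and with the same mounts) as the Fluent Bit container. Run as root, it will typically report nothing, because root bypasses the permission bits.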
**Your Environment**
* Version used: 3.2.2
* Configuration:
* Environment name and version (e.g. Kubernetes? What version?): 1.3.0
* Server type and version:
* Operating System and version: suse- sle-micro-iso/5.5:2.0.4
* Filters and plugins:
inputs: |
  [INPUT]
      Name tail
      Tag application.*
      Path /var/log/containers/*.log
      DB /var/fluent-bit/state/flb_container.db
      Exclude_Path /var/log/containers/etcd*.log, /var/log/containers/kube-apiserver*.log, /var/log/containers/kube-controller-manager*.log, /var/log/containers/kube-proxy*.log, /var/log/containers/kube-scheduler*.log
      Parser docker
      Docker_Mode On
      Skip_Long_Lines On
      Refresh_Interval 10
      Docker_Mode_Flush 5
      Docker_Mode_Parser container_firstline
      Rotate_Wait 30
      storage.type filesystem
      Alias container_logs
      Read_from_Head Off
  [INPUT]
      Name tail
      Alias audit_logs
      Tag kubeaudit.*
      Path /var/lib/rancher/rke2/server/logs/audit.log
      Parser docker
      DB /var/fluent-bit/state/audit_log.db
      Skip_Long_Lines On
      Refresh_Interval 10
      Read_from_Head Off
      Rotate_Wait 30
      storage.type filesystem
  [INPUT]
      Name tail
      Alias core_kubernetes_logs
      Tag kubernetes.components.core.*
      Path /var/log/containers/etcd*.log, /var/log/containers/kube-apiserver*.log, /var/log/containers/kube-controller-manager*.log, /var/log/containers/kube-proxy*.log, /var/log/containers/kube-scheduler*.log
      Parser docker
      DB /var/fluent-bit/state/core_kubernetes_logs.db
      Skip_Long_Lines On
      Refresh_Interval 10
      Read_from_Head Off
      storage.type filesystem
  [INPUT]
      Name tail
      Alias core_kubernetes_logs
      Tag kubernetes.components.kubelet.*
      Path /var/lib/rancher/rke2/agent/logs/kubelet.log
      Parser docker
      DB /var/fluent-bit/state/kubelet_logs.db
      Skip_Long_Lines On
      Refresh_Interval 10
      Read_from_Head Off
      storage.type filesystem
  [INPUT]
      Name systemd
      Tag sysd.auth
      Systemd_Filter SYSLOG_FACILITY=4
      Systemd_Filter SYSLOG_FACILITY=10
      Systemd_Filter_Type Or
      DB /var/fluent-bit/state/authsysd.db
      Path /var/log/journal
      storage.type filesystem
      Read_from_Tail On
  [INPUT]
      Name systemd
      Tag sysd.generic
      DB /var/fluent-bit/state/genericsysd.db
      Path /var/log/journal
      storage.type filesystem
      Read_from_Tail On
  [INPUT]
      Name fluentbit_metrics
      Tag internal_metrics
# -- https://docs.fluentbit.io/manual/pipeline/filters
filters: |
  [FILTER]
      Name log_to_metrics
      Match sysd.auth
      Tag login_failure_metrics
      Metric_mode counter
      Metric_name os_login_failures
      Metric_description This metric counts all OS login failures
      Regex MESSAGE .*authentication failure.*
      Label_field MESSAGE
  [FILTER]
      Name log_to_metrics
      Match sysd.auth
      Tag login_success_metrics
      Metric_mode counter
      Metric_name os_login_successes
      Metric_description This metric counts all successful OS logins
      Regex MESSAGE .*New session.*
      Label_field MESSAGE
  [FILTER]
      Name parser
      Match application.*
      Key_name log
      Parser crio
  [FILTER]
      Name grep
      Match sysd.generic
      Exclude SYSLOG_FACILITY (4|10)$
      Regex PRIORITY [0-4]$
  [FILTER]
      Name kubernetes
      Match application.*
      Kube_URL https://kubernetes.default.svc:443
      Merge_Log On
      Merge_Log_Key log_processed
      Keep_Log false
      K8S-Logging.Parser On
      K8S-Logging.Exclude false
      Buffer_Size 0
      Kube_Tag_Prefix application.var.log.containers.
      Labels Off
      Annotations Off
      Use_Kubelet On
      Kubelet_Port 10250
  [FILTER]
      Name kubernetes
      Match kubernetes.components.core.*
      Kube_URL https://kubernetes.default.svc:443
      Merge_Log On
      Merge_Log_Key log_processed
      Keep_Log false
      K8S-Logging.Parser On
      K8S-Logging.Exclude false
      Buffer_Size 0
      Kube_Tag_Prefix kubernetes.components.core.var.log.containers.
      Labels Off
      Annotations Off
      Use_Kubelet On
      Kubelet_Port 10250
  [FILTER]
      Name modify
      Match *
      Add cluster_id ${CLUSTER_ID}
  [FILTER]
      Name modify
      Match kubeaudit.*
      Add host_name ${HOST_NAME}
  [FILTER]
      Name modify
      Match kubernetes.components.kubelet.*
      Add host_name ${HOST_NAME}
# -- https://docs.fluentbit.io/manual/pipeline/outputs
outputs: |
  [OUTPUT]
      Name cloudwatch_logs
      Match application.*
      region {{ .Values.awsRegion }}
      log_group_name /aws/containerinsights/${CLUSTER_ID}/application_logs
      log_stream_prefix ${HOST_NAME}-
      log_retention_days {{ .Values.logRetentionDays }}
      auto_create_group true
      Retry_Limit {{ .Values.retryLimit }}
      storage.total_limit_size {{ .Values.containerLogsFileBufferLimit }}
  [OUTPUT]
      Name cloudwatch_logs
      Match kubeaudit.*
      region {{ .Values.awsRegion }}
      log_group_name /aws/containerinsights/${CLUSTER_ID}/kubernetes_audit_logs
      log_stream_prefix ${HOST_NAME}-
      log_retention_days {{ .Values.auditLogRetentionDays }}
      auto_create_group true
      Retry_Limit {{ .Values.retryLimit }}
      storage.total_limit_size {{ .Values.auditLogsFileBufferLimit }}
  [OUTPUT]
      Name cloudwatch_logs
      Match kubernetes.components.*
      region {{ .Values.awsRegion }}
      log_group_name /aws/containerinsights/${CLUSTER_ID}/core_kubernetes_logs
      log_stream_prefix ${HOST_NAME}-
      log_retention_days {{ .Values.logRetentionDays }}
      auto_create_group true
      Retry_Limit {{ .Values.retryLimit }}
      storage.total_limit_size {{ .Values.coreKubernetesLogsFileBufferLimit }}
  [OUTPUT]
      Name cloudwatch_logs
      Match sysd.*
      region {{ .Values.awsRegion }}
      log_group_name /aws/containerinsights/${CLUSTER_ID}/operating_system_logs
      log_stream_prefix ${HOST_NAME}-
      log_retention_days {{ .Values.logRetentionDays }}
      auto_create_group true
      Retry_Limit {{ .Values.retryLimit }}
      storage.total_limit_size {{ .Values.osLogsFileBufferLimit }}
  [OUTPUT]
      Name prometheus_exporter
      Match *_metrics
# -- https://docs.fluentbit.io/manual/pipeline/parsers
customParsers: |
  [PARSER]
      Name docker
      Format json
      Time_Key time
      Time_Format %Y-%m-%dT%H:%M:%S.%LZ
  [PARSER]
      Name crio
      Format Regex
      Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>P|F) (?<log>.*)$
      Time_Key time
      Time_Format %Y-%m-%dT%H:%M:%S.%L%z
  [PARSER]
      Name container_firstline
      Format regex
      Regex (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
      Time_Key time
      Time_Format %Y-%m-%dT%H:%M:%S.%LZ