Description
Describe the bug
When processing a string that contains an invalid byte sequence, fluentd logs an info-level line that includes the original problematic string. If fluentd is also forwarding its own logs, this leads to recursive logging.
To Reproduce
- Run Splunk Connect for Kubernetes in a Kubernetes cluster
- Produce an application log with a non-UTF-8 character
  a. Fluentd picks it up as a part of SCK and produces the "invalid byte sequence is replaced" log line.
  b. That log line is picked up by fluentd.
  c. Repeats indefinitely.
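For the second step, a minimal sketch of one way to emit such a line from a test application. The helper name `invalid_log_line` is hypothetical, and the JSON fields are borrowed from the error log in this report; any process that writes a byte such as 0xFF to stdout should have the same effect.

```ruby
# Hypothetical helper: builds a JSON-looking log line whose uri_path
# contains the single byte 0xFF, which is never valid in UTF-8.
def invalid_log_line
  prefix = '{ "status": "404", "uri_path": "/'
  suffix = '", "uri_query": ""}'
  (prefix.b + [0xFF].pack("C") + suffix.b).force_encoding(Encoding::UTF_8)
end

# Write the raw bytes to stdout, where the container runtime captures them.
$stdout.binmode
$stdout.puts(invalid_log_line)
```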
Expected behavior
The "invalid byte sequence is replaced" log line is produced once.
Your Environment
- Fluentd version: 1.0 (from here)
- note: this is inferred from whatever Splunk are doing here
- TD Agent version: not sure
- Operating system: GKE 1.20.11-gke.1300
- Kernel version: COS-5.4.144
This is run as a part of Splunk Connect for Kubernetes v1.4.10
Your Configuration
Default Helm installation for Splunk Connect for Kubernetes.
ConfigMap from SCK: https://github.com/splunk/splunk-connect-for-kubernetes/blob/1.4.10/manifests/splunk-kubernetes-logging/configMap.yaml
Your Error Log
2022-01-05 14:55:38 +0000 [info]: #0 invalid byte sequence is replaced in `2022-01-05 14:55:37 +0000 [info]: #0 invalid byte sequence is replaced in `2022-01-05 14:55:36 +0000 [info]: #0 invalid byte sequence is replaced in `2022-01-05 14:55:35 +0000 [info]: #0 invalid byte sequence is replaced in `2022-01-05 14:55:34 +0000 [info]: #0 invalid byte sequence is replaced in `2022-01-05 14:55:33 +0000 [info]: #0 invalid byte sequence is replaced in `2022-01-05 14:55:32 +0000 [info]: #0 invalid byte sequence is replaced in `2022-01-05 14:55:31 +0000 [info]: #0 invalid byte sequence is replaced in `2022-01-05 14:55:30 +0000 [info]: #0 invalid byte sequence is replaced in ............ `{ "bytes_in": "483", "bytes_out": "233", "http_method": "GET", "status": "404", "uri_path": "/�", "uri_query": ""}`````````````````````````````````````````````````````````````````````````````````````````````````````````````````````
Additional context
The function that seems to be producing the issue is string_safe_encoding. The log line is produced before the invalid character is replaced in the string.
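The ordering problem can be illustrated with a simplified sketch. This is not the actual fluentd source: `log_before_scrub` and `log_after_scrub` are hypothetical stand-ins, and a plain array stands in for the logger. It assumes the replacement is done with Ruby's `String#scrub`.

```ruby
# Hypothetical stand-in for the current behaviour: the raw string,
# invalid bytes included, is interpolated into the log line.
def log_before_scrub(str, log)
  unless str.valid_encoding?
    log << "invalid byte sequence is replaced in `#{str}`"
    str = str.scrub('?')
  end
  str
end

# Hypothetical stand-in for one possible fix: replace first, then log
# the already-cleaned string.
def log_after_scrub(str, log)
  unless str.valid_encoding?
    str = str.scrub('?')
    log << "invalid byte sequence is replaced in `#{str}`"
  end
  str
end

# A string tagged as UTF-8 that contains the invalid byte 0xFF.
bad = ("uri_path: /" + [0xFF].pack("C")).force_encoding(Encoding::UTF_8)

before_log = []
log_before_scrub(bad, before_log)
# The emitted line is itself invalid, so re-ingesting it logs again.
puts before_log.first.valid_encoding?

after_log = []
log_after_scrub(bad, after_log)
# The emitted line is valid, so re-ingesting it produces no new line.
puts after_log.first.valid_encoding?
```

With the second ordering, the log line fluentd emits no longer carries the invalid bytes, so picking it up again should not trigger another warning.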
I am also raising a PR with the simple fix.