Component(s)
pkg/ottl, pkg/stanza, processor/transform, receiver/filelog
What happened?
Description
We have a pipeline that uses a filelog receiver with operators that set certain fields in attributes. We then have a transform processor that reads those fields and uses set() to populate some resource attributes. The values that end up in a log's resource attributes appear to come from a different log, as if the transform were reading from one log and writing to another.
Steps to Reproduce
I do not currently have a minimal reproducible example for this issue, but I will include a stripped and sanitized config that contains the core logic/entities around the bug we are seeing.
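For context, the relevant service wiring looks roughly like the sketch below. The exporter name is a stand-in (our real pipeline ships to Elasticsearch and its exporter config is omitted here), and any other processors in the real pipeline are left out:

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [transform/attribute_moves]   # other processors omitted
      exporters: [elasticsearch]                # stand-in; real exporter config not shown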
Configuration
receivers:
  filelog:
    start_at: beginning
    operators:
      # route and parse log based on formatting
      - id: get-format
        type: router
        routes:
          - expr: body matches "^\\{"
            output: parser-docker
          - expr: body matches "^[^ Z]+ "
            output: parser-crio
          - expr: body matches "^[^ Z]+Z"
            output: parser-containerd
      - id: parser-crio
        type: regex_parser
        regex: ^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
        timestamp:
          layout: 2006-01-02T15:04:05.999999999Z07:00
          layout_type: gotime
          parse_from: attributes.time
      - id: crio-recombine
        type: recombine
        combine_field: attributes.log
        combine_with: ""
        is_last_entry: attributes.logtag == 'F'
        output: extract_metadata_from_filepath
        source_identifier: attributes["log.file.path"]
      - id: parser-containerd
        type: regex_parser
        regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
        timestamp:
          layout: "%Y-%m-%dT%H:%M:%S.%LZ"
          parse_from: attributes.time
      - id: containerd-recombine
        type: recombine
        combine_field: attributes.log
        combine_with: ""
        is_last_entry: attributes.logtag == 'F'
        output: extract_metadata_from_filepath
        source_identifier: attributes["log.file.path"]
      - id: parser-docker
        type: json_parser
        output: extract_metadata_from_filepath
        timestamp:
          layout: "%Y-%m-%dT%H:%M:%S.%LZ"
          parse_from: attributes.time
      - id: extract_metadata_from_filepath
        parse_from: attributes["log.file.path"]
        regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
        type: regex_parser
      # move log to body, parse to attributes
      - type: move
        from: attributes.log
        to: body
      - type: json_parser
        parse_from: body
        parse_to: attributes

processors:
  transform/attribute_moves:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          - set(attributes["log.iostream"], attributes["stream"])
          - set(resource.attributes["k8s.container.name"], attributes["container_name"])
          - set(resource.attributes["k8s.namespace.name"], attributes["namespace"])
          - set(resource.attributes["k8s.pod.name"], attributes["pod_name"])
          - set(resource.attributes["k8s.container.restart_count"], attributes["restart_count"])
          - set(resource.attributes["k8s.pod.uid"], attributes["uid"])
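For anyone trying to reproduce this, swapping the real exporter for the debug exporter makes it easy to eyeball record attributes next to resource attributes straight from the collector's own output. A minimal sketch (verbosity: detailed prints every attribute of every exported record):

exporters:
  debug:
    verbosity: detailed

Reference it from the logs pipeline in place of the normal exporter.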
Expected Result
Logs where attributes["pod_name"] == resource.attributes["k8s.pod.name"] and attributes["namespace"] == resource.attributes["k8s.namespace.name"].
This is the case for most of the logs emitted from our pipeline.
Actual Result
For some small percentage of our logs, the values of
resource.attributes["k8s.container.name"]
resource.attributes["k8s.namespace.name"]
resource.attributes["k8s.pod.name"]
resource.attributes["k8s.container.restart_count"]
resource.attributes["k8s.pod.uid"]
do not match their attribute counterparts, but instead appear to come from a different log event. The wrong values are internally consistent, as if they all came from the same (other) log event. For example, if resource.attributes["k8s.pod.name"] is wrong and holds the name of some other pod, then resource.attributes["k8s.namespace.name"] will hold the namespace of that other pod.
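To make affected records easier to find while debugging, a second transform processor placed after transform/attribute_moves could tag any record whose own attributes no longer match the resource it ended up on. This is only a sketch; the processor name and the debug.pod_name_mismatch attribute are made up for illustration and are not part of our config:

processors:
  transform/flag_mismatches:   # hypothetical, debugging-only processor
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          # flag records whose own pod_name disagrees with the resource they were grouped under
          - set(attributes["debug.pod_name_mismatch"], true) where attributes["pod_name"] != resource.attributes["k8s.pod.name"]

It likely has to run as a separate processor after transform/attribute_moves; inside the same statement list the resource attribute would have just been overwritten from the record being checked, so no mismatch would be visible there.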
Here is a sanitized example log JSON pulled from our Elasticsearch. In this event the record attributes identify pod cilium-hlkmw, namespace kube-system, container cilium-agent, while the resource attributes claim pod my-app-65b4cf8769-bb66g, namespace my-app-namespace, container my-app.
Sanitized incorrect log
{
"_index": ".ds-otel-logs-stg-my-app-2024.03.07-000022",
"_id": "Pz1km44B8ceE1n7Gbbpq",
"_version": 1,
"_score": 0,
"_source": {
"@timestamp": "2024-04-01T20:40:10.983479227Z",
"Attributes": {
"OTEL_collector_type": "daemonset",
"canonical": {
"owner": {
"team_name": "unknown"
}
},
"cloud": "aws",
"cluster": "ci-stg",
"container_name": "cilium-agent",
"elasticsearch": {
"index": {
"suffix": "my-app"
}
},
"environment": "stg",
"log": {
"file": {
"path": "/var/log/pods/kube-system_cilium-hlkmw_fb7fae7c-d3ac-4308-885d-5d1426c96bcc/cilium-agent/0.log"
},
"iostream": "stderr"
},
"logtag": "F",
"namespace": "kube-system",
"pod_name": "cilium-hlkmw",
"restart_count": "0",
"stream": "stderr",
"time": "2024-04-01T20:40:10.983479227Z",
"uid": "fb7fae7c-d3ac-4308-885d-5d1426c96bcc"
},
"Body": "level=info msg=\"Delete endpoint request\" containerID=593b7f5fe7 subsys=daemon",
"Resource": {
"k8s": {
"container": {
"name": "my-app",
"restart_count": "0"
},
"namespace": {
"name": "my-app-namespace"
},
"pod": {
"name": "my-app-65b4cf8769-bb66g",
"uid": "33b38675-15a7-4f64-a3f7-f38a97455ce6"
}
}
},
"Scope": {
"name": "",
"version": ""
},
"SeverityNumber": 0,
"TraceFlags": 0
},
"fields": {
"Attributes.uid": [
"fb7fae7c-d3ac-4308-885d-5d1426c96bcc"
],
"Attributes.stream": [
"stderr"
],
"Attributes.environment": [
"stg"
],
"Attributes.canonical.owner.team_name": [
"unknown"
],
"Resource.k8s.container.restart_count": [
"0"
],
"Attributes.log.iostream": [
"stderr"
],
"Attributes.restart_count": [
"0"
],
"Attributes.pod_name": [
"cilium-hlkmw"
],
"Resource.k8s.namespace.name": [
"my-app-namespace"
],
"TraceFlags": [
0
],
"Resource.k8s.pod.name": [
"my-app-65b4cf8769-bb66g"
],
"Scope.name": [
""
],
"Attributes.logtag": [
"F"
],
"Resource.k8s.pod.uid": [
"33b38675-15a7-4f64-a3f7-f38a97455ce6"
],
"Attributes.elasticsearch.index.suffix": [
"my-app"
],
"Attributes.cloud": [
"aws"
],
"SeverityNumber": [
0
],
"Attributes.log.file.path": [
"/var/log/pods/kube-system_cilium-hlkmw_fb7fae7c-d3ac-4308-885d-5d1426c96bcc/cilium-agent/0.log"
],
"Body": [
"level=info msg=\"Delete endpoint request\" containerID=593b7f5fe7 subsys=daemon"
],
"Attributes.time": [
"2024-04-01T20:40:10.983Z"
],
"Attributes.namespace": [
"kube-system"
],
"Resource.k8s.container.name": [
"my-app"
],
"@timestamp": [
"2024-04-01T20:40:10.983Z"
],
"Attributes.container_name": [
"cilium-agent"
]
}
}
Collector version
v0.96.0
Environment information
Environment
OS: Amazon Linux 2023 (Amazon's v1.25.16-eks AMI)
Compiler (if manually compiled): Collector Helm Chart v0.84.0
OpenTelemetry Collector configuration
See Steps to Reproduce
Log output
No response
Additional context
We never experience this issue when these same moves to resource attributes are done via operators on the filelog receiver:
All-operator solution
receivers:
  filelog:
    start_at: beginning
    operators:
      - id: get-format
        routes:
          - expr: body matches "^\\{"
            output: parser-docker
          - expr: body matches "^[^ Z]+ "
            output: parser-crio
          - expr: body matches "^[^ Z]+Z"
            output: parser-containerd
        type: router
      - id: parser-crio
        regex: ^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
        timestamp:
          layout: 2006-01-02T15:04:05.999999999Z07:00
          layout_type: gotime
          parse_from: attributes.time
        type: regex_parser
      - combine_field: attributes.log
        combine_with: ""
        id: crio-recombine
        is_last_entry: attributes.logtag == 'F'
        output: extract_metadata_from_filepath
        source_identifier: attributes["log.file.path"]
        type: recombine
      - id: parser-containerd
        regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
        timestamp:
          layout: "%Y-%m-%dT%H:%M:%S.%LZ"
          parse_from: attributes.time
        type: regex_parser
      - combine_field: attributes.log
        combine_with: ""
        id: containerd-recombine
        is_last_entry: attributes.logtag == 'F'
        output: extract_metadata_from_filepath
        source_identifier: attributes["log.file.path"]
        type: recombine
      - id: parser-docker
        output: extract_metadata_from_filepath
        timestamp:
          layout: "%Y-%m-%dT%H:%M:%S.%LZ"
          parse_from: attributes.time
        type: json_parser
      - id: extract_metadata_from_filepath
        parse_from: attributes["log.file.path"]
        regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
        type: regex_parser
      - from: attributes.stream
        to: attributes["log.iostream"]
        type: move
      - from: attributes.container_name
        to: resource["k8s.container.name"]
        type: move
      - from: attributes.namespace
        to: resource["k8s.namespace.name"]
        type: move
      - from: attributes.pod_name
        to: resource["k8s.pod.name"]
        type: move
      - from: attributes.restart_count
        to: resource["k8s.container.restart_count"]
        type: move
      - from: attributes.uid
        to: resource["k8s.pod.uid"]
        type: move
      - from: attributes.log
        to: body
        type: move
      - type: json_parser
        parse_from: body
        parse_to: attributes