whitespaces are always trimmed #18242

Closed
newly12 opened this issue Feb 2, 2023 · 3 comments · Fixed by #18266
Labels
bug Something isn't working receiver/filelog

Comments

@newly12
Contributor

newly12 commented Feb 2, 2023

Component(s)

receiver/filelog

What happened?

Description

Whitespace is always trimmed from each log line, which can cause problems when working with other operators.

Steps to Reproduce

Given the CRI logs below, the 2nd line ends with a whitespace character:

cat -evt /tmp/container.log
2023-02-01T19:59:48.437970951-07:00 stderr F foo$
2023-02-01T19:59:48.437953934-07:00 stderr F $
$

Expected Result

Whitespace is not trimmed, so the regexp for CRI logs works properly.

Actual Result

An error occurs because the trailing whitespace on the 2nd line is trimmed, so the regexp no longer matches.
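
The mismatch can be reproduced outside the collector. The following is a minimal sketch, assuming Go's standard `regexp` package behaves like the `regex_parser` operator for this pattern; `matchesCRI` is a hypothetical helper name, and the regex and log line are copied from this issue.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// criPattern is the regex used by the regex_parser operator in this issue's configuration.
var criPattern = regexp.MustCompile(`^(?P<go_timestamp_field>[^ Z]+) (?P<filename>stdout|stderr) (?P<logtag>[^ ]*) (?P<log>.*)$`)

// matchesCRI (hypothetical helper) reports whether a single line satisfies the CRI pattern.
func matchesCRI(line string) bool {
	return criPattern.MatchString(line)
}

func main() {
	// The 2nd line of /tmp/container.log as written on disk: it ends with one space.
	raw := "2023-02-01T19:59:48.437953934-07:00 stderr F "
	// The same line after trailing whitespace has been removed.
	trimmed := strings.TrimRight(raw, " ")

	fmt.Println(matchesCRI(raw))     // true: the empty log message still follows a space
	fmt.Println(matchesCRI(trimmed)) // false: the space before the log group is gone
}
```

The pattern requires three literal spaces, so once the final one is trimmed there is nothing left to separate the log tag from the (empty) log message.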

Collector version

v0.70.0

Environment information

n/a

OpenTelemetry Collector configuration

receivers:
  filelog:
    include: [ /tmp/container.log ]
    include_file_path: true
    include_file_name: true
    start_at: beginning
    operators:
      - type: regex_parser
        id: parser-crio
        regex: ^(?P<go_timestamp_field>[^ Z]+) (?P<filename>stdout|stderr) (?P<logtag>[^ ]*) (?P<log>.*)$
        timestamp:
          parse_from: attributes.go_timestamp_field
          layout_type: gotime
          layout: 2006-01-02T15:04:05.999999999Z07:00
      - type: move
        id: move_body
        from: attributes.log
        to: body

Log output

2023-02-02T15:28:04.832+0800    error   helper/transformer.go:110       Failed to process entry {"kind": "receiver", "name": "filelog/1.log", "pipeline": "logs", "operator_id": "parser-crio", "operator_type": "regex_parser", "error": "regex pattern does not match", "action": "send", "entry": {"observed_timestamp":"2023-02-02T15:28:04.832791+08:00","timestamp":"0001-01-01T00:00:00Z","body":"2023-02-01T19:59:48.437953934-07:00 stderr F","attributes":{"log.file.name":"container.log","log.file.path":"/tmp/container.log"},"severity":0,"scope_name":""}}
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*TransformerOperator).HandleEntryError
        /Users/yundeng/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.70.0/operator/helper/transformer.go:110
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*ParserOperator).ParseWith
        /Users/yundeng/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.70.0/operator/helper/parser.go:151
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*ParserOperator).ProcessWithCallback
        /Users/yundeng/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.70.0/operator/helper/parser.go:123
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*ParserOperator).ProcessWith
        /Users/yundeng/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.70.0/operator/helper/parser.go:109
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/regex.(*Parser).Process
        /Users/yundeng/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.70.0/operator/parser/regex/regex.go:110
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*WriterOperator).Write
        /Users/yundeng/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.70.0/operator/helper/writer.go:64
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/file.(*Input).emit
        /Users/yundeng/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.70.0/operator/input/file/file.go:65
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*Reader).ReadToEnd
        /Users/yundeng/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.70.0/fileconsumer/reader.go:87
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*Manager).consume.func1
        /Users/yundeng/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.70.0/fileconsumer/file.go:138

Additional context

https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/operator/helper/multiline.go#L201-L206

@newly12 newly12 added bug Something isn't working needs triage New item requiring triage labels Feb 2, 2023
@github-actions
Contributor

github-actions bot commented Feb 2, 2023

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@djaglowski djaglowski removed the needs triage New item requiring triage label Feb 2, 2023
@djaglowski
Member

I think the answer here is that we should add a preserve_whitespace setting to pkg/stanza/fileconsumer.Config. This setting would need to be passed into NewNewlineSplitFunc and cause us to skip calls to trimWhitespace.
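
To illustrate the idea, here is a rough sketch built only on the standard library's `bufio.ScanLines`. It is not the actual `pkg/stanza` code: `newlineSplitFunc`, `splitLines`, and the `preserveWhitespace` parameter are hypothetical names, and the real `NewNewlineSplitFunc` and `trimWhitespace` may have different signatures and trimming rules.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// newlineSplitFunc (hypothetical) wraps bufio.ScanLines and trims surrounding
// whitespace from each line unless preserveWhitespace is set.
func newlineSplitFunc(preserveWhitespace bool) bufio.SplitFunc {
	return func(data []byte, atEOF bool) (int, []byte, error) {
		advance, token, err := bufio.ScanLines(data, atEOF)
		if err != nil || token == nil || preserveWhitespace {
			return advance, token, err
		}
		trimmed := bytes.TrimSpace(token)
		if trimmed == nil {
			// Keep whitespace-only lines as empty tokens; a nil token would
			// tell the scanner that no token was produced.
			trimmed = []byte{}
		}
		return advance, trimmed, nil
	}
}

// splitLines (hypothetical) collects every token the split function emits.
func splitLines(input string, preserveWhitespace bool) []string {
	scanner := bufio.NewScanner(strings.NewReader(input))
	scanner.Split(newlineSplitFunc(preserveWhitespace))
	var lines []string
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	return lines
}

func main() {
	input := "ts stderr F foo\nts stderr F \n"
	fmt.Printf("%q\n", splitLines(input, false)) // second entry loses its trailing space
	fmt.Printf("%q\n", splitLines(input, true))  // second entry keeps its trailing space
}
```

Making the behavior opt-in through a flag keeps the existing trimming as the default, so current configurations would be unaffected.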

@newly12
Contributor Author

newly12 commented Feb 3, 2023

Thanks @djaglowski, I filed a PR for this. I have only added one unit test so far; please have a look at the initial implementation and let me know what you think.
