[pkg/stanza] Option for setting max log size in syslog parser (#33777)
**Description:** add `MaxLogSize` parameter to `syslog` parser. Note
that for this option to be available, `enable_octet_counting` needs to
be set to `true`, as this is an option that is exclusive to the
`octetcounting` parser in the [syslog
library](github.com/leodido/go-syslog).

One aspect I'm not sure about yet is the placement of the
`max_log_size` option: right now, this option is also set within the
`TCP` input configuration, whereas this new option would be one layer
above, i.e. in the syslog base config. This would mean that the option
could be set to different values in the parser and tcp input config,
for example:

```
receivers:
  syslog:
    protocol: rfc5424
    enable_octet_counting: true
    max_log_size: 200000000 # 200 MB
    tcp:
      listen_address: :4278
      max_log_size: 100000000 # 100 MB
exporters:
  debug:
service:
  pipelines:
    logs:
      receivers: [syslog]
      exporters: [debug]
```

For now I have implemented this so that, if nothing is set in the
tcp input config, the max_log_size value from the syslog base config
is used. If it is set in the tcp config, the tcp input uses that more
specific value. To me this makes the most sense right now, but I
appreciate any feedback on this.
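
The precedence described above can be sketched as follows. This is a minimal illustration and not the code in this commit: the helper name `resolveMaxLogSize` and the treatment of `0` as "not set" are assumptions made for the example.

```go
package main

import "fmt"

// resolveMaxLogSize is a hypothetical helper. It returns the limit the TCP
// input would use under the precedence described above: the TCP-level value
// wins when it is set, otherwise the syslog base config value is used.
// Treating 0 as "not set" is an assumption of this sketch.
func resolveMaxLogSize(baseMax, tcpMax int64) int64 {
	if tcpMax > 0 {
		return tcpMax
	}
	return baseMax
}

func main() {
	// Both levels set: the more specific TCP value is used.
	fmt.Println(resolveMaxLogSize(200000000, 100000000))
	// Only the base config is set: the TCP input inherits it.
	fmt.Println(resolveMaxLogSize(200000000, 0))
}
```

With the example configuration above, the first call yields the TCP value (100000000) and the second falls back to the base value (200000000).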

**Link to tracking Issue:** #33182

**Testing:** so far added unit test for the syslog parser, will also add
some tests for the syslog input config to test the behavior described
above.

**Documentation:** TODO, will add once we have figured out all open
questions

---------

Signed-off-by: Florian Bacher <florian.bacher@dynatrace.com>
bacherfl authored Jul 8, 2024
1 parent 3e5c046 commit 0385b21
Showing 6 changed files with 91 additions and 19 deletions.
27 changes: 27 additions & 0 deletions .chloggen/syslog_receiver_max_length.yaml
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: bug_fix

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: syslogreceiver

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: "Allow to define `max_octets` for octet counting RFC5424 syslog parser"

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [33182]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
1 change: 1 addition & 0 deletions pkg/stanza/operator/input/syslog/config.go
@@ -51,6 +51,7 @@ func (c Config) Build(set component.TelemetrySettings) (operator.Operator, error
 	syslogParserCfg.BaseConfig = c.BaseConfig
 	syslogParserCfg.SetID(inputBase.ID() + "_internal_parser")
 	syslogParserCfg.OutputIDs = c.OutputIDs
+	syslogParserCfg.MaxOctets = c.MaxOctets
 	syslogParser, err := syslogParserCfg.Build(set)
 	if err != nil {
 		return nil, fmt.Errorf("failed to resolve syslog config: %w", err)
2 changes: 2 additions & 0 deletions pkg/stanza/operator/parser/syslog/config.go
@@ -55,6 +55,7 @@ type BaseConfig struct {
 	EnableOctetCounting bool `mapstructure:"enable_octet_counting,omitempty"`
 	AllowSkipPriHeader bool `mapstructure:"allow_skip_pri_header,omitempty"`
 	NonTransparentFramingTrailer *string `mapstructure:"non_transparent_framing_trailer,omitempty"`
+	MaxOctets int `mapstructure:"max_octets,omitempty"`
 }
 
 // Build will build a JSON parser operator.
@@ -105,5 +106,6 @@ func (c Config) Build(set component.TelemetrySettings) (operator.Operator, error
 		enableOctetCounting: c.EnableOctetCounting,
 		allowSkipPriHeader: c.AllowSkipPriHeader,
 		nonTransparentFramingTrailer: c.NonTransparentFramingTrailer,
+		maxOctets: c.MaxOctets,
 	}, nil
 }
17 changes: 14 additions & 3 deletions pkg/stanza/operator/parser/syslog/parser.go
@@ -33,6 +33,7 @@ type Parser struct {
 	enableOctetCounting bool
 	allowSkipPriHeader bool
 	nonTransparentFramingTrailer *string
+	maxOctets int
 }
 
 // Process will parse an entry field as syslog.
@@ -96,7 +97,7 @@ func (p *Parser) buildParseFunc() (parseFunc, error) {
 	switch {
 	// Octet Counting Parsing RFC6587
 	case p.enableOctetCounting:
-		return newOctetCountingParseFunc(), nil
+		return newOctetCountingParseFunc(p.maxOctets), nil
 	// Non-Transparent-Framing Parsing RFC6587
 	case p.nonTransparentFramingTrailer != nil && *p.nonTransparentFramingTrailer == LFTrailer:
 		return newNonTransparentFramingParseFunc(nontransparent.LF), nil
@@ -291,13 +292,23 @@ func postprocess(e *entry.Entry) error {
 	return cleanupTimestamp(e)
 }
 
-func newOctetCountingParseFunc() parseFunc {
+func newOctetCountingParseFunc(maxOctets int) parseFunc {
 	return func(input []byte) (message sl.Message, err error) {
 		listener := func(res *sl.Result) {
 			message = res.Message
 			err = res.Error
 		}
-		parser := octetcounting.NewParser(sl.WithBestEffort(), sl.WithListener(listener))
+
+		parserOpts := []sl.ParserOption{
+			sl.WithBestEffort(),
+			sl.WithListener(listener),
+		}
+
+		if maxOctets > 0 {
+			parserOpts = append(parserOpts, sl.WithMaxMessageLength(maxOctets))
+		}
+
+		parser := octetcounting.NewParser(parserOpts...)
 		reader := bytes.NewReader(input)
 		parser.Parse(reader)
 		return
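
To illustrate what the `max_octets` limit guards against, here is a stdlib-only sketch of a length check on an RFC 6587 octet-counted frame (`MSG-LEN SP SYSLOG-MSG`). It is not how go-syslog implements the check; in this commit the enforcement is delegated to the library through `sl.WithMaxMessageLength`. The function name `checkOctetCount`, its error wording, and the "`maxOctets <= 0` means no limit" rule are assumptions for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// checkOctetCount is a hypothetical helper. It reads the decimal MSG-LEN
// prefix of an octet-counted frame and rejects frames whose declared length
// exceeds maxOctets. A maxOctets value <= 0 disables the check in this sketch.
func checkOctetCount(frame []byte, maxOctets int) (int, error) {
	// The prefix ends at the first space.
	sp := bytes.IndexByte(frame, ' ')
	if sp <= 0 {
		return 0, fmt.Errorf("missing MSG-LEN prefix")
	}
	n, err := strconv.Atoi(string(frame[:sp]))
	if err != nil || n <= 0 {
		return 0, fmt.Errorf("invalid MSG-LEN %q", frame[:sp])
	}
	if maxOctets > 0 && n > maxOctets {
		return 0, fmt.Errorf("message too long: declared size %d, max length %d", n, maxOctets)
	}
	return n, nil
}

func main() {
	// Declared length 215 against a limit of 214 is rejected.
	_, err := checkOctetCount([]byte("215 <86>1 ..."), 214)
	fmt.Println(err)

	// The same frame passes when no limit is configured.
	n, err := checkOctetCount([]byte("215 <86>1 ..."), 0)
	fmt.Println(n, err)
}
```

The sketch only inspects the declared length; a real parser also has to verify that the payload actually contains that many octets.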
30 changes: 30 additions & 0 deletions pkg/stanza/operator/parser/syslog/parser_test.go
@@ -81,6 +81,36 @@ func TestSyslogParseRFC5424_SDNameTooLong(t *testing.T) {
 	}
 }
 
+func TestSyslogParseRFC5424_Octet_Counting_MessageTooLong(t *testing.T) {
+	cfg := basicConfig()
+	cfg.Protocol = RFC5424
+	cfg.EnableOctetCounting = true
+	cfg.MaxOctets = 214
+
+	body := `215 <86>1 2015-08-05T21:58:59.693Z 192.168.2.132 SecureAuth0 23108 ID52020 [SecureAuth@27389 UserHostAddress="192.168.2.132" Realm="SecureAuth0" UserID="Tester2" PEN="27389"] Found the user for retrieving user's profile`
+
+	set := componenttest.NewNopTelemetrySettings()
+	op, err := cfg.Build(set)
+	require.NoError(t, err)
+
+	fake := testutil.NewFakeOutput(t)
+	err = op.SetOutputs([]operator.Operator{fake})
+	require.NoError(t, err)
+
+	newEntry := entry.New()
+	newEntry.Body = body
+	err = op.Process(context.Background(), newEntry)
+	require.Error(t, err)
+	require.Contains(t, err.Error(), "message too long to parse. was size 215, max length 214")
+
+	select {
+	case e := <-fake.Received:
+		require.Equal(t, body, e.Body)
+	case <-time.After(time.Second):
+		require.FailNow(t, "Timed out waiting for entry to be processed")
+	}
+}
+
 func TestSyslogProtocolConfig(t *testing.T) {
 	for _, proto := range []string{"RFC5424", "rfc5424", "RFC3164", "rfc3164"} {
 		cfg := basicConfig()
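
The test above hardcodes the `215` length prefix in the body. When writing further cases, a small helper could derive the prefix from the message instead. The sketch below is hypothetical (`octetFrame` is not part of `pkg/stanza`) and only shows the framing layout.

```go
package main

import "fmt"

// octetFrame is a hypothetical test helper. It prefixes msg with its length
// in octets, producing the "MSG-LEN SP SYSLOG-MSG" layout used by RFC 6587
// octet counting.
func octetFrame(msg string) string {
	// len on a Go string counts bytes, which is what MSG-LEN measures.
	return fmt.Sprintf("%d %s", len(msg), msg)
}

func main() {
	fmt.Println(octetFrame("<86>1 - - - - - - hi"))
}
```

Because the prefix counts bytes rather than characters, multi-byte UTF-8 content in the message increases the prefix accordingly.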
33 changes: 17 additions & 16 deletions receiver/syslogreceiver/README.md
@@ -16,22 +16,23 @@ Parses Syslogs received over TCP or UDP.

## Configuration

| Field | Default | Description |
|-------------------------------------|--------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `tcp` | `nil` | Defined tcp_input operator. (see the TCP configuration section) |
| `udp` | `nil` | Defined udp_input operator. (see the UDP configuration section) |
| `protocol` | required | The protocol to parse the syslog messages as. Options are `rfc3164` and `rfc5424` |
| `location` | `UTC` | The geographic location (timezone) to use when parsing the timestamp (Syslog RFC 3164 only). The available locations depend on the local IANA Time Zone database. [This page](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) contains many examples, such as `America/New_York`. |
| `enable_octet_counting` | `false` | Whether or not to enable [RFC 6587](https://www.rfc-editor.org/rfc/rfc6587#section-3.4.1) Octet Counting on syslog parsing (Syslog RFC 5424 and TCP only). |
| `allow_skip_pri_header` | `false` | Allow parsing records without the PRI header. If this setting is enabled, messages without the PRI header will be successfully parsed. The `SeverityNumber` and `SeverityText` fields as well as the `priority` and `facility` attributes will not be set on the log record. If this setting is disabled (the default), messages without PRI header will throw an exception. To set this setting to `true`, the `enable_octet_counting` setting must be `false`.|
| `non_transparent_framing_trailer` | `nil` | The framing trailer, either `LF` or `NUL`, when using [RFC 6587](https://www.rfc-editor.org/rfc/rfc6587#section-3.4.2) Non-Transparent-Framing (Syslog RFC 5424 and TCP only). |
| `attributes` | {} | A map of `key: value` labels to add to the entry's attributes |
| `resource` | {} | A map of `key: value` labels to add to the entry's resource |
| `operators` | [] | An array of [operators](../../pkg/stanza/docs/operators/README.md#what-operators-are-available). See below for more details |
| `retry_on_failure.enabled` | `false` | If `true`, the receiver will pause reading a file and attempt to resend the current batch of logs if it encounters an error from downstream components. |
| `retry_on_failure.initial_interval` | `1 second` | Time to wait after the first failure before retrying. |
| `retry_on_failure.max_interval` | `30 seconds` | Upper bound on retry backoff interval. Once this value is reached the delay between consecutive retries will remain constant at the specified value. |
| `retry_on_failure.max_elapsed_time` | `5 minutes` | Maximum amount of time (including retries) spent trying to send a logs batch to a downstream consumer. Once this value is reached, the data is discarded. Retrying never stops if set to `0`. |
| Field | Default | Description |
|-------------------------------------|--------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `tcp` | `nil` | Defined tcp_input operator. (see the TCP configuration section) |
| `udp` | `nil` | Defined udp_input operator. (see the UDP configuration section) |
| `protocol` | required | The protocol to parse the syslog messages as. Options are `rfc3164` and `rfc5424` |
| `location` | `UTC` | The geographic location (timezone) to use when parsing the timestamp (Syslog RFC 3164 only). The available locations depend on the local IANA Time Zone database. [This page](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) contains many examples, such as `America/New_York`. |
| `enable_octet_counting` | `false` | Whether or not to enable [RFC 6587](https://www.rfc-editor.org/rfc/rfc6587#section-3.4.1) Octet Counting on syslog parsing (Syslog RFC 5424 and TCP only). |
| `max_octets` | `8192` | The maximum octets for messages using [RFC 6587](https://www.rfc-editor.org/rfc/rfc6587#section-3.4.1) Octet Counting on syslog parsing (Syslog RFC 5424 and TCP only). |
| `allow_skip_pri_header` | `false` | Allow parsing records without the PRI header. If this setting is enabled, messages without the PRI header will be successfully parsed. The `SeverityNumber` and `SeverityText` fields as well as the `priority` and `facility` attributes will not be set on the log record. If this setting is disabled (the default), messages without PRI header will throw an exception. To set this setting to `true`, the `enable_octet_counting` setting must be `false`. |
| `non_transparent_framing_trailer` | `nil` | The framing trailer, either `LF` or `NUL`, when using [RFC 6587](https://www.rfc-editor.org/rfc/rfc6587#section-3.4.2) Non-Transparent-Framing (Syslog RFC 5424 and TCP only). |
| `attributes` | {} | A map of `key: value` labels to add to the entry's attributes |
| `resource` | {} | A map of `key: value` labels to add to the entry's resource |
| `operators` | [] | An array of [operators](../../pkg/stanza/docs/operators/README.md#what-operators-are-available). See below for more details |
| `retry_on_failure.enabled` | `false` | If `true`, the receiver will pause reading a file and attempt to resend the current batch of logs if it encounters an error from downstream components. |
| `retry_on_failure.initial_interval` | `1 second` | Time to wait after the first failure before retrying. |
| `retry_on_failure.max_interval` | `30 seconds` | Upper bound on retry backoff interval. Once this value is reached the delay between consecutive retries will remain constant at the specified value. |
| `retry_on_failure.max_elapsed_time` | `5 minutes` | Maximum amount of time (including retries) spent trying to send a logs batch to a downstream consumer. Once this value is reached, the data is discarded. Retrying never stops if set to `0`. |

### Operators
