Create a split event processor #4089

@dlvenable

Description

Is your feature request related to a problem? Please describe.

Some users want to split an input event into multiple events by splitting a field from the input event.

Say I have the following two events:

{"query" : "open source", "some_other_field" : "abc" }
{"query" : "data prepper documentation", "some_other_field" : "xyz" }

I'd like to get the following events:

{"query" : "open", "some_other_field" : "abc" }
{"query" : "source", "some_other_field" : "abc" }
{"query" : "data", "some_other_field" : "xyz" }
{"query" : "prepper", "some_other_field" : "xyz" }
{"query" : "documentation", "some_other_field" : "xyz" }

Describe the solution you'd like

Create a split event processor.

It will require a field option, which names the field we are splitting on. The value of that field can be either a string or an array. When it is a string, the user must provide a delimiter, expressed either as a literal value or as a regex.
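The intended behavior can be sketched in Python. This is only an illustration of the semantics described above, not Data Prepper code: the function name split_event and its keyword arguments simply mirror the proposed options, and the handling of a missing field or a non-string, non-array value is an assumption, since the proposal does not specify it.

```python
import copy
import re


def split_event(event, field, delimiter=None, regex_delimiter=None):
    """Hypothetical sketch: return one copy of `event` per piece of event[field]."""
    # Assumption: a missing field raises KeyError; the proposal does not say.
    value = event[field]

    if isinstance(value, list):
        # Array values need no delimiter: each element becomes its own event.
        pieces = value
    elif isinstance(value, str):
        if regex_delimiter is not None:
            pieces = re.split(regex_delimiter, value)
        elif delimiter is not None:
            pieces = value.split(delimiter)
        else:
            raise ValueError("a string field requires delimiter or regex_delimiter")
    else:
        # Assumption: other value types are rejected.
        raise TypeError("field value must be a string or an array")

    result = []
    for piece in pieces:
        new_event = copy.deepcopy(event)  # every other field is carried over unchanged
        new_event[field] = piece
        result.append(new_event)
    return result
```

Applied to the two sample events with a whitespace delimiter, this sketch produces the five events listed earlier.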

Example 1: Split events from query based on a regex:

processor:
- split_event:
    field: query
    regex_delimiter: '\s+'

Example 2: Split events from query based on a delimiter:

processor:
- split_event:
    field: query
    delimiter: ' '

Example 3: Split events from an array query. In this example, split_string runs first to turn query into an array. That step would be unnecessary in practice, since split_event could split the string directly, but it conveys how the processor works in the case of arrays.

processor:
- split_string:
    entries:
      - source: "query"
        delimiter_regex: "\\s+"
- split_event:
    field: query
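The two stages of Example 3 can be traced with a small Python sketch. It only imitates the data flow described here and is not the processors' actual implementation; the variable names are made up for illustration.

```python
import re

event = {"query": "data prepper documentation", "some_other_field": "xyz"}

# Stage 1 (stand-in for split_string): the string value becomes an array of tokens.
after_split_string = {**event, "query": re.split(r"\s+", event["query"])}

# Stage 2 (stand-in for split_event on an array field): one event per array element,
# with every other field copied over unchanged.
after_split_event = [
    {**after_split_string, "query": item}
    for item in after_split_string["query"]
]
```

After stage 1 there is still a single event whose query is an array. Only stage 2 multiplies the event count.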

Metadata

Labels: plugin - processor (A plugin to manipulate data in the data prepper pipeline.)

Status: Done