82 changes: 80 additions & 2 deletions _data-prepper/pipelines/contains.md
@@ -18,7 +18,7 @@
For example, if you want to check if the string `"abcd"` is contained within the value of a field named `message`, you can use the `contains()` function as follows:

```
'contains(/message, "abcd")'
```
{% include copy.html %}

@@ -27,11 +27,89 @@
Alternatively, you can use a literal string as the first argument:

```
'contains("This is a test message", "test")'
```
{% include copy.html %}

In this case, the function returns `true` because the substring `test` is present within the string `This is a test message`.

The `contains()` function performs a case-sensitive search.
{: .note}
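
Because the search is case sensitive, `contains(/message, "ERROR")` does not match a message that contains only `error`. The following fragment is a minimal sketch of one way to cover two known casings by combining two `contains()` calls with the `or` operator. The key name `has_error` is hypothetical, and the fragment assumes that both casings are worth flagging; it does not handle mixed-case variants such as `Error`:

```yaml
processor:
  - add_entries:
      entries:
        # Hypothetical key name; true if either casing appears in /message
        - key: "has_error"
          value_expression: 'contains(/message, "ERROR") or contains(/message, "error")'
```
{% include copy.html %}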

## Example

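The following pipeline uses `contains()` in two processors. The `add_entries` processor adds a Boolean field, `has_test`, that indicates whether `/message` contains the substring `test`. The `drop_events` processor drops every event whose `/message` does not contain `ERROR`, so only error messages are forwarded to OpenSearch: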

```yaml
contains-demo-pipeline:
  source:
    http:
      ssl: false

  processor:
    - add_entries:
        entries:
          - key: "has_test"
            value_expression: 'contains(/message, "test")'
    - drop_events:
        drop_when: 'not contains(/message, "ERROR")'

  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        insecure: true
        username: admin
        password: "admin_pass"
        index_type: custom
        index: "demo-index-%{yyyy.MM.dd}"
```
{% include copy.html %}

You can test the pipeline using the following command:

```bash
curl -sS -X POST "http://localhost:2021/events" \
  -H "Content-Type: application/json" \
  -d '[
        {"message":"ok hello"},
        {"message":"this has test but ok"},
        {"message":"ERROR: something bad"},
        {"message":"ERROR: unit test failed"}
      ]'
```
{% include copy.html %}

Only the two events whose `message` contains `ERROR` are stored in OpenSearch. The `has_test` field is `true` only for the event whose message also contains `test`:

```json
{
  ...
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "demo-index-2025.10.21",
        "_id": "5YACB5oBqZitdAAb4n3r",
        "_score": 1,
        "_source": {
          "message": "ERROR: something bad",
          "has_test": false
        }
      },
      {
        "_index": "demo-index-2025.10.21",
        "_id": "5oACB5oBqZitdAAb4n3r",
        "_score": 1,
        "_source": {
          "message": "ERROR: unit test failed",
          "has_test": true
        }
      }
    ]
  }
}
```