Grok Processor does not support non-(a-zA-Z_) field characters for field names #21745

Closed
Description

@talevy

Example ingest pipeline that fails when the Grok capture uses `@` as its field name:

POST _ingest/pipeline/_simulate
{
  "pipeline" :
  {
    "description": "_description",
    "processors": [
      {
        "grok" : {
          "field" : "foo",
          "patterns" : [
            "%{WORD:@}"
          ]
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_type": "type",
      "_id": "id",
      "_source": {
        "foo": "bar"
      }
    }
  ]
}

Resulting exception:

{
  "docs": [
    {
      "error": {
        "root_cause": [
          {
            "type": "exception",
            "reason": "java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [bar]",
            "header": {
              "processor_type": "grok"
            }
          }
        ],
        "type": "exception",
        "reason": "java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [bar]",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [bar]",
          "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "Provided Grok expressions do not match field value: [bar]"
          }
        },
        "header": {
          "processor_type": "grok"
        }
      }
    }
  ]
}

The Grok parser in Ingest requires that capture field names consist only of the characters a-zA-Z_. This should be expanded to support all Unicode characters.

The regex here must be updated to do so: https://github.com/talevy/elasticsearch/blob/82f7bfad98253e94305136df481cd1c7dc4e8ca8/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/Grok.java#L47-L47
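
To illustrate the kind of change being asked for, here is a minimal sketch using `java.util.regex` rather than Joni. The two patterns below are assumptions written for this example, not the actual expression in `Grok.java`: `STRICT` limits the field name to a-zA-Z_, and `RELAXED` accepts any character other than `}`. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: compares a restrictive and a relaxed way of parsing
// the field-name part of a %{PATTERN:field} reference.
class GrokFieldNameSketch {

    // Assumed restrictive form: the field name may only contain a-zA-Z_.
    static final Pattern STRICT = Pattern.compile(
        "%\\{(?<pattern>[A-Za-z0-9_]+)(?::(?<subname>[a-zA-Z_]+))?\\}");

    // Assumed relaxed form: the field name may contain anything except '}'.
    static final Pattern RELAXED = Pattern.compile(
        "%\\{(?<pattern>[A-Za-z0-9_]+)(?::(?<subname>[^}]+))?\\}");

    // Returns the captured field name, or null when the whole expression
    // is not recognized as a grok reference by the given pattern.
    static String fieldName(Pattern grokReference, String expression) {
        Matcher m = grokReference.matcher(expression);
        return m.matches() ? m.group("subname") : null;
    }

    public static void main(String[] args) {
        String expression = "%{WORD:@}";
        System.out.println("strict:  " + fieldName(STRICT, expression));
        System.out.println("relaxed: " + fieldName(RELAXED, expression));
    }
}
```

With the restrictive form, `%{WORD:@}` is not recognized as a grok reference at all, which may be why the processor reports a failed match rather than an invalid pattern. Whether "anything except `}`" is the right relaxation for the real parser is a separate design question.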

A possibly relevant issue in Joni: jruby/joni#13
