Closed
Example ingest pipeline that fails because the Grok pattern uses @ as a field name:
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "_description",
    "processors": [
      {
        "grok": {
          "field": "foo",
          "patterns": [
            "%{WORD:@}"
          ]
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_type": "type",
      "_id": "id",
      "_source": {
        "foo": "bar"
      }
    }
  ]
}
Exception:
{
  "docs": [
    {
      "error": {
        "root_cause": [
          {
            "type": "exception",
            "reason": "java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [bar]",
            "header": {
              "processor_type": "grok"
            }
          }
        ],
        "type": "exception",
        "reason": "java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [bar]",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [bar]",
          "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "Provided Grok expressions do not match field value: [bar]"
          }
        },
        "header": {
          "processor_type": "grok"
        }
      }
    }
  ]
}
The Grok parser in Ingest requires field names to match a-zA-Z_. This should be expanded to support all Unicode characters.
The regex here must be updated to do so: https://github.com/talevy/elasticsearch/blob/82f7bfad98253e94305136df481cd1c7dc4e8ca8/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/Grok.java#L47-L47
This might be a relevant issue in Joni: jruby/joni#13
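
To illustrate the restriction, here is a minimal sketch of how a field-name character class affects parsing of a %{PATTERN:field} reference. It is an assumption-laden simplification: the two patterns (RESTRICTED and RELAXED) and the helper fieldName are hypothetical, they use java.util.regex rather than Joni, and they are not the regex from Grok.java. The relaxed variant simply accepts anything up to the closing brace, which is only one possible way to widen the allowed names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GrokFieldNameSketch {
    // Hypothetical stand-in for a reference syntax that only allows
    // a-zA-Z_ in the field name (NOT the actual regex in Grok.java).
    static final Pattern RESTRICTED = Pattern.compile(
        "%\\{(?<pattern>[A-Za-z0-9_]+)(?::(?<field>[a-zA-Z_]+))?\\}");

    // Hypothetical relaxed variant: any characters except the closing brace,
    // so names such as "@" or non-ASCII names are accepted.
    static final Pattern RELAXED = Pattern.compile(
        "%\\{(?<pattern>[A-Za-z0-9_]+)(?::(?<field>[^}]+))?\\}");

    // Returns the captured field name, or null if the expression is not
    // recognized as a Grok reference under the given syntax.
    static String fieldName(Pattern syntax, String expression) {
        Matcher m = syntax.matcher(expression);
        return m.matches() ? m.group("field") : null;
    }

    public static void main(String[] args) {
        String expression = "%{WORD:@}";
        // Under the restricted syntax the whole reference is not recognized.
        System.out.println("restricted: " + fieldName(RESTRICTED, expression));
        // Under the relaxed syntax the field name "@" is captured.
        System.out.println("relaxed:    " + fieldName(RELAXED, expression));
    }
}
```

In this sketch the restricted syntax does not recognize %{WORD:@} as a reference at all, while the relaxed one captures @ as the field name. Whether the real parser fails in the same way for the pipeline above is not verified here.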