Closed
Description
If a transform task fails in the search phase because of a mapping conflict or a scripting error, the error is handled as a temporary search problem: the search is retried (10 times) and the task is eventually put into the FAILED state with reason: "task encountered more than 10 failures; latest failure: Partial shards failure". The audit only contains "Partial shards failure".
The real issue can only be found in the logs, e.g.
Caused by: org.elasticsearch.ElasticsearchException$1: Fielddata is disabled on text fields by default. Set fielddata=true on [...] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.
or
org.elasticsearch.script.ScriptException: runtime error
...
Caused by: java.lang.IllegalArgumentException: No field found for [field_b] in mapping
Solution
We need to unwrap search failures and check for inner problems:
- do not retry if it turns out to be an irrecoverable error (as we already do for other errors of this kind)
- report the real error as the reason in `_stats` and as the audit message
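The unwrapping step could look roughly like the sketch below. It is written in Python purely for illustration and is not the actual implementation: `SearchFailure`, `IRRECOVERABLE_KINDS`, `unwrap` and `classify` are hypothetical names, the outer `SearchException` wrapper is made up, and the set of irrecoverable error types is an assumption based on the log excerpts above.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class SearchFailure:
    """Hypothetical stand-in for a (possibly nested) search failure."""
    kind: str
    message: str
    cause: Optional["SearchFailure"] = None


# Assumption: error kinds that will not go away when the search is retried.
IRRECOVERABLE_KINDS = {"ScriptException", "IllegalArgumentException"}


def unwrap(failure: SearchFailure) -> Iterator[SearchFailure]:
    """Yield the failure and each nested cause, outermost first."""
    node: Optional[SearchFailure] = failure
    while node is not None:
        yield node
        node = node.cause


def classify(failure: SearchFailure) -> Tuple[bool, str]:
    """Return (should_retry, reason), where reason names the innermost cause."""
    chain = list(unwrap(failure))
    should_retry = not any(f.kind in IRRECOVERABLE_KINDS for f in chain)
    innermost = chain[-1]
    return should_retry, f"{innermost.kind}: {innermost.message}"


# The scripting failure from the log excerpt, wrapped in a generic outer error.
failure = SearchFailure(
    "SearchException",  # hypothetical wrapper name
    "Partial shards failure",
    SearchFailure(
        "ScriptException",
        "runtime error",
        SearchFailure(
            "IllegalArgumentException",
            "No field found for [field_b] in mapping",
        ),
    ),
)
should_retry, reason = classify(failure)
print(should_retry, reason)
```

With a classification like this, the task could fail immediately and use `reason` for both `_stats` and the audit message, instead of surfacing only the outer "Partial shards failure".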
Repro
Case 1
- create 2 indexes with 2 fields each, using keyword fields:
  - index 1: `field_a`, `field_b`
  - index 2: `field_a`, `field_c`
- create a transform that groups by `field_a`, with a scripted metric agg that accesses `field_b` without a guard:
"scripted_metric": {
"init_script": "state.b = new String()",
"map_script": "state.b = doc['field_b']",
"combine_script": "return state.b",
"reduce_script": "return states"
}
The transform should fail with a ScriptException
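For convenience, the request bodies for Case 1 can be assembled as in the following sketch. The index names `repro_1`, `repro_2` and `repro_dest` and the aggregation name `b` are made up. The typeless `mappings` layout and the `source` / `dest` / `pivot` body layout are assumptions that may need adjusting to the version under test. The script only builds and prints the JSON, it does not talk to a cluster.

```python
import json


def keyword_mapping(*fields):
    """Build an index-creation body where every field is a keyword."""
    return {"mappings": {"properties": {f: {"type": "keyword"} for f in fields}}}


# Two source indexes (hypothetical names); only the first one has field_b.
index_bodies = {
    "repro_1": keyword_mapping("field_a", "field_b"),
    "repro_2": keyword_mapping("field_a", "field_c"),
}

transform_body = {
    "source": {"index": list(index_bodies)},
    "dest": {"index": "repro_dest"},  # hypothetical destination index
    "pivot": {
        "group_by": {"field_a": {"terms": {"field": "field_a"}}},
        "aggregations": {
            "b": {  # hypothetical aggregation name
                "scripted_metric": {
                    "init_script": "state.b = new String()",
                    # Unguarded access: field_b is not mapped in repro_2.
                    "map_script": "state.b = doc['field_b']",
                    "combine_script": "return state.b",
                    "reduce_script": "return states",
                }
            }
        },
    },
}

for name, body in index_bodies.items():
    print(name, json.dumps(body))
print(json.dumps(transform_body, indent=2))
```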
Case 2
- create 2 indexes with 2 fields each, mapping `field_a` as `text` in the 2nd index:
  - index 1: `field_a`, `field_b`
  - index 2: `field_a`, `field_b`
- create a transform that groups by `field_a`
The transform should fail with an ElasticsearchException: Fielddata is disabled on text fields by default.
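A sketch of the Case 2 setup follows, with several assumptions that are not in the report: the index names are made up, `field_a` in the first index and `field_b` in both are assumed to be keyword fields (as in Case 1), and the `value_count` aggregation is a placeholder, since the report does not say which aggregation was used. Only the conflicting `field_a` mapping matters for the reproduction.

```python
import json


def mapping(**field_types):
    """Build an index-creation body from field name -> field type."""
    props = {name: {"type": ftype} for name, ftype in field_types.items()}
    return {"mappings": {"properties": props}}


# Hypothetical index names; field_a is keyword in one index and text in the other.
index_bodies = {
    "repro_keyword": mapping(field_a="keyword", field_b="keyword"),
    "repro_text": mapping(field_a="text", field_b="keyword"),
}

transform_body = {
    "source": {"index": list(index_bodies)},
    "dest": {"index": "repro_dest"},  # hypothetical destination index
    "pivot": {
        # Grouping on field_a needs fielddata, which is disabled for text fields.
        "group_by": {"field_a": {"terms": {"field": "field_a"}}},
        # Placeholder aggregation (assumption, not from the report).
        "aggregations": {"b_count": {"value_count": {"field": "field_b"}}},
    },
}

print(json.dumps(index_bodies, indent=2))
print(json.dumps(transform_body, indent=2))
```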
/CC @tsg