Description
Found in 7.6.0-SNAPSHOT
It is possible to create or update a transform whose destination uses an ingest pipeline that does not exist.
For example:
POST _transform/transform-01/_update
{
  "dest": {
    "index" : "transform-01",
    "pipeline" : "this-does-not-exist"
  }
}
This returns acknowledged: true.
If you then start this transform, audit messages like the following are logged:
Transform encountered an exception: org.elasticsearch.xpack.transform.transforms.ClientTransformIndexer$BulkIndexingException: Bulk index experienced failures. See the logs of the node running the transform for details. Will attempt again at next scheduled trigger.
The logs on disk show:
[2019-12-12T11:49:24,655][DEBUG][o.e.a.b.T.BulkRequestModifier] [node3] failed to execute pipeline [_none] for document [transform-01/_doc/MTWq3Ia7HoVvarOSxUzsmioAAAAAAAAA] java.lang.IllegalArgumentException: pipeline with id [this-does-not-exist] does not exist
This continuous transform failed after several minutes with the message "task encountered more than 10 failures; latest failure:".
Secondly, if the pipeline exists but refers to a trained model that does not exist, we get similar audit messages pointing to the log files, which contain the error:
Caused by: org.elasticsearch.ResourceNotFoundException: Could not find trained model [this-model-does-not-exist]
This continuous transform also failed after several minutes with the message "task encountered more than 10 failures; latest failure:".
Should we:
- Validate that a pipeline exists when creating or updating a transform?
- Validate that a model exists when creating an inference processor pipeline?
  (Reindex does not do this, though.)
Regardless of whether we validate that the pipeline and/or model exists, we should surface these errors as audit messages so that the user does not have to look at log files on disk.
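To make the two proposed checks concrete, here is a minimal sketch in Python. It is not the transform plugin's code: the function name `find_missing_resources` and the in-memory `pipelines` and `trained_models` lookups are hypothetical stand-ins for whatever the cluster would consult, and the error strings simply mirror the messages quoted in this issue.

```python
from typing import Dict, List, Set


def find_missing_resources(
    transform_config: Dict,
    pipelines: Dict[str, Dict],
    trained_models: Set[str],
) -> List[str]:
    """Return a list of problems; an empty list means nothing is missing.

    `pipelines` maps pipeline id -> pipeline body and `trained_models` is the
    set of known model ids. Both are hypothetical stand-ins for cluster state.
    """
    problems: List[str] = []

    pipeline_id = transform_config.get("dest", {}).get("pipeline")
    if pipeline_id is None:
        return problems  # no pipeline configured, nothing to validate

    pipeline = pipelines.get(pipeline_id)
    if pipeline is None:
        problems.append(f"pipeline with id [{pipeline_id}] does not exist")
        return problems

    # Each processor is a single-key object such as {"inference": {...}}.
    for processor in pipeline.get("processors", []):
        inference = processor.get("inference")
        if inference is None:
            continue
        model_id = inference.get("model_id")
        if model_id not in trained_models:
            problems.append(f"Could not find trained model [{model_id}]")

    return problems


update = {"dest": {"index": "transform-01", "pipeline": "this-does-not-exist"}}
print(find_missing_resources(update, pipelines={}, trained_models=set()))
```

With a check along these lines, the update request above could be rejected immediately instead of returning acknowledged: true and failing later at bulk-index time.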
Note: if you use reindex with a pipeline that does not exist, it returns all docs as 400 failures:
...
"failures" : [
  {
    "index" : "kibana_sample_data_flights_copy",
    "type" : "_doc",
    "id" : "3TDI6m4BXY3wzdkLGbWl",
    "cause" : {
      "type" : "illegal_argument_exception",
      "reason" : "pipeline with id [pipeline-does-not-exist] does not exist"
    },
    "status" : 400
  },
...
If you use reindex with a pipeline that exists but refers to a model that does not exist, it returns all docs as 404 failures:
...
"failures" : [
  {
    "index" : "kibana_sample_data_flights_copy2",
    "type" : "_doc",
    "id" : "3jDI6m4BXY3wzdkLGbWl",
    "cause" : {
      "type" : "resource_not_found_exception",
      "reason" : "Could not find trained model [model-does-not-exist]"
    },
    "status" : 404
  },
...
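As a rough illustration of surfacing the root cause directly, the per-document failures shown in the reindex responses above could be condensed into one line per distinct cause. This is only a sketch in Python: `summarize_failures` is a hypothetical helper, not an Elasticsearch API, and it assumes each failure entry has the `status` and `cause` fields seen in the fragments above.

```python
from collections import Counter
from typing import Dict, List


def summarize_failures(failures: List[Dict]) -> List[str]:
    """Group per-document failures by (status, type, reason) and count them."""
    counts = Counter(
        (
            failure.get("status"),
            failure.get("cause", {}).get("type"),
            failure.get("cause", {}).get("reason"),
        )
        for failure in failures
    )
    return [
        f"{count} document(s) failed with status {status}: {cause_type}: {reason}"
        for (status, cause_type, reason), count in counts.most_common()
    ]


# Sample entry copied from the 400 response fragment above.
sample = [
    {
        "index": "kibana_sample_data_flights_copy",
        "type": "_doc",
        "id": "3TDI6m4BXY3wzdkLGbWl",
        "cause": {
            "type": "illegal_argument_exception",
            "reason": "pipeline with id [pipeline-does-not-exist] does not exist",
        },
        "status": 400,
    },
]
for line in summarize_failures(sample):
    print(line)
```

A summary of this shape would let an audit message carry the actual reason (missing pipeline or missing model) rather than only pointing at the node logs.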