Description
When an index has RED status, the search response does not report any "failed" shards; it only shows fewer successful shards than the total, e.g. the JSON snippet below:
"_shards":{"total":20,"successful":7,"failed":0}
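To detect this condition, a caller can compare the `successful` and `total` counters directly instead of looking at `failed`. The following is a minimal sketch; `shards_incomplete` is a hypothetical helper name and not part of elasticsearch-py:

```python
def shards_incomplete(resp):
    """Return True when fewer shards succeeded than were targeted.

    `resp` is a raw search/scroll response dict containing a "_shards" key.
    Hypothetical helper for illustration only.
    """
    shards = resp.get("_shards", {})
    return shards.get("successful", 0) < shards.get("total", 0)


# The response from a RED index shown above has failed == 0,
# yet only 7 of 20 shards answered.
red_resp = {"_shards": {"total": 20, "successful": 7, "failed": 0}}
is_incomplete = shards_incomplete(red_resp)
```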
The scan() method accepts a raise_on_error parameter, but it has the following issues:
- It only raises if a shard is in the "failed" state; it cannot be used to raise when not all shards were successful.
- It is only checked after yielding all successful docs, so you may scan through a huge number of docs and only get the exception at the end. The server reports shard status immediately in the HTTP response, so in principle the exception could be raised immediately.
- The response data is not accessible to the caller, so it is not possible to check the shard status without re-implementing the entire method.
It would be great if the functionality could include the following:
- raise_on_error can cause an exception when not all shards are successful
- The exception is raised immediately, before any docs from the affected response are yielded
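The requested behaviour might look roughly like the sketch below. This is not the library's implementation: `IncompleteShardsError` and `iter_hits_checked` are hypothetical names, and the function takes an iterable of raw response dicts (`pages`) rather than a client, so that how the pages are fetched is left out of scope.

```python
class IncompleteShardsError(Exception):
    """Hypothetical exception; not part of elasticsearch-py."""

    def __init__(self, shards):
        self.shards = shards  # expose the shard info to the caller
        super().__init__(
            "only %d of %d shards succeeded"
            % (shards.get("successful", 0), shards.get("total", 0))
        )


def iter_hits_checked(pages, raise_on_error=True):
    """Yield hits from raw search/scroll responses, checking shards first.

    `pages` is assumed to be an iterable of response dicts, each with
    "_shards" and "hits" keys.
    """
    for resp in pages:
        shards = resp.get("_shards", {})
        # Check before yielding anything from this response.
        if raise_on_error and shards.get("successful", 0) < shards.get("total", 0):
            raise IncompleteShardsError(shards)
        for hit in resp["hits"]["hits"]:
            yield hit
```

Because the check runs before the inner loop, the caller sees the exception on the first incomplete response instead of after consuming every doc, and the shard counters are available on the exception's `shards` attribute.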
Relevant code is linked below:
https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/helpers/__init__.py#L374-L388