scan() cannot raise exception in case of not all shards being successful #660

Closed
@juke1

Description

When an index has RED status, the response does not report any "failed" shards; it only shows that not all shards were successful, as in the JSON snippet below:

"_shards":{"total":20,"successful":7,"failed":0}
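To make the distinction concrete, here is a minimal sketch that inspects the `_shards` block from the snippet above. The helper name `shard_summary` is hypothetical and not part of elasticsearch-py; it only assumes the response is a plain dict with the three keys shown.

```python
def shard_summary(resp):
    """Summarize the `_shards` section of a search response dict.

    Hypothetical helper for illustration; not part of elasticsearch-py.
    """
    shards = resp["_shards"]
    return {
        # Shards the server explicitly reported as failed.
        "failed": shards["failed"],
        # Shards that did not respond successfully, whether or not
        # they were counted as failed.
        "unsuccessful": shards["total"] - shards["successful"],
    }


resp = {"_shards": {"total": 20, "successful": 7, "failed": 0}}
summary = shard_summary(resp)
print(summary)
```

With the values from the snippet, `failed` is 0 while 13 shards are unaccounted for, so a check that looks only at `failed` would not notice the problem.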

The scan() method accepts a raise_on_error parameter, but it has the following issues:

  1. It only raises if a shard is in the "failed" state; it cannot be used to raise when not all shards were successful.

  2. It is only checked after all successful docs have been yielded, so you may scan through a huge number of docs and only get the exception at the end. The server reports shard status immediately in the HTTP response, so in principle the exception could be raised immediately.

  3. The response data is not accessible to the caller so it is not possible to check the shard status without re-implementing the entire method.

It would be great if the functionality could include the following:

  1. raise_on_error can cause an exception when not all shards are successful
  2. Exception is raised immediately before yielding docs
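The two requests above could be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: `IncompleteShardsError` and `iter_hits_checked` are hypothetical names, and the function takes an iterable of already-fetched search/scroll response dicts instead of calling an Elasticsearch client, so that it stays self-contained.

```python
class IncompleteShardsError(Exception):
    """Hypothetical exception; not part of elasticsearch-py."""

    def __init__(self, message, shards):
        super().__init__(message)
        # Expose the shard info so the caller can inspect it.
        self.shards = shards


def iter_hits_checked(responses, raise_on_error=True):
    """Yield hits from an iterable of search/scroll response dicts.

    The shard status of each response is checked BEFORE any of its
    hits are yielded, so the caller learns about the problem at once.
    """
    for resp in responses:
        shards = resp["_shards"]
        # Compare successful against total instead of relying on "failed".
        if raise_on_error and shards["successful"] < shards["total"]:
            raise IncompleteShardsError(
                "only %d of %d shards were successful"
                % (shards["successful"], shards["total"]),
                shards,
            )
        for hit in resp["hits"]["hits"]:
            yield hit


# Usage with a response shaped like the one from a RED index.
pages = [
    {
        "_shards": {"total": 20, "successful": 7, "failed": 0},
        "hits": {"hits": [{"_id": "1"}]},
    }
]
try:
    for hit in iter_hits_checked(pages):
        print(hit)
except IncompleteShardsError as exc:
    print("scan aborted:", exc, exc.shards)
```

Attaching the `_shards` dict to the exception is one way to address the third issue as well, since the caller can see the shard status without re-implementing the loop.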

Relevant code is linked below:

https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/helpers/__init__.py#L374-L388
