Reranker retriever query fails if window size > top N in inference endpoint #111202

Closed
@demjened

Description

Elasticsearch Version

8.15

Installed Plugins

No response

Java Version

bundled

OS Version

N/A

Problem Description

The text_similarity_reranker retriever query fails if rank_window_size is greater than top_n in the rerank inference endpoint's task settings.

When creating the inference endpoint, we have the option to specify top_n so that only N documents are returned. By default this is omitted and the issue doesn't occur. However, if e.g. top_n = 10 is specified in the endpoint and the reranker query defines a rank_window_size greater than 10, the reranking process fails with an array index out of bounds error.
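
The failing combination can be detected on the client side before a query is sent. The following is a minimal sketch, not part of Elasticsearch: `window_exceeds_top_n` is a hypothetical helper that takes the two request bodies shown under Steps to Reproduce as plain dicts and only checks values that are set explicitly.

```python
from typing import Any, Dict


def window_exceeds_top_n(endpoint_config: Dict[str, Any], search_body: Dict[str, Any]) -> bool:
    """Hypothetical helper: True if the request pair matches the failing case in this issue."""
    # top_n lives in the endpoint's task settings; when it is omitted the issue does not occur.
    top_n = endpoint_config.get("task_settings", {}).get("top_n")
    if top_n is None:
        return False

    # rank_window_size is read from the text_similarity_reranker retriever.
    reranker = search_body.get("retriever", {}).get("text_similarity_reranker", {})
    window = reranker.get("rank_window_size")
    if window is None:
        # Only explicit values are checked; no default is assumed here.
        return False

    return window > top_n
```

With the requests from this report (top_n = 10, rank_window_size = 20) the helper returns True, which is the case that currently ends in a 500.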

Steps to Reproduce

  1. Create a deployment (serverless or 8.15+).
  2. Index some documents, e.g. short passages with a text field.
  3. Create a rerank inference endpoint with top_n set in the task settings:
PUT _inference/rerank/cohere-rerank-inference-top-10
{
  "service": "cohere",
  "service_settings": {
    "model_id": "rerank-english-v3.0",
    "api_key": "<COHERE_API_KEY>"
  },
  "task_settings": {
    "top_n": 10
  }
}
  4. Run a rerank retriever query with a window size larger than the top N value from above:
POST rerank/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "text": "Most famous landmark in Paris"
            }
          }
        }
      },
      "rank_window_size": 20,
      "field": "text",
      "inference_id": "cohere-rerank-inference-top-10",
      "inference_text": "Most famous landmark in Paris"
    }
  },
  "size": 20
}

Expected: the query succeeds and returns the top 10 documents.

Observed: the query fails with an error similar to this:

{
  "error": {
    "root_cause": [],
    "type": "search_phase_execution_exception",
    "reason": "Computing updated ranks for results failed",
    "phase": "rank-feature",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "array_index_out_of_bounds_exception",
      "reason": "Index 16 out of bounds for length 10"
    }
  },
  "status": 500
}
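
The message "Index 16 out of bounds for length 10" is consistent with a score buffer sized by the number of results the endpoint returned (10) but indexed by each document's position in the submitted window (0 to 19). The sketch below illustrates that failure mode in Python. It is an assumption based only on the error message, not the actual Elasticsearch code; `RankedDoc`, `extract_scores_buggy` and `extract_scores_by_index` are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class RankedDoc(NamedTuple):
    index: int    # position of the document in the input sent for reranking
    score: float  # relevance score assigned by the reranker


def extract_scores_buggy(ranked: List[RankedDoc]) -> List[float]:
    # Buffer sized by the number of *returned* results (top_n) ...
    scores = [0.0] * len(ranked)
    for doc in ranked:
        # ... but written at the position in the *submitted* window,
        # which can be as large as rank_window_size - 1.
        scores[doc.index] = doc.score
    return scores


def extract_scores_by_index(ranked: List[RankedDoc]) -> Dict[int, float]:
    # One possible alternative: key scores by input position, so a response
    # shorter than the window is handled without any fixed-size buffer.
    return {doc.index: doc.score for doc in ranked}
```

Feeding the first function 10 results whose indices range up to 19 raises an IndexError, the Python analogue of the exception above. The second function returns a mapping with 10 entries, and documents missing from it are the ones cut off by top_n.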

Logs (if relevant)

No response
