[ML] Improve logging for reindex with semantic_text fields #134219

@prwhelan

Description

Use Case: Users can run into issues when reindexing data from an original index into a new index whose mapping includes embedding / semantic_text fields: the reindex starts, but no documents ever appear in the destination index.

In this example a user wants to reindex a source index of ~5k documents. The mappings of the two indices differ; in particular, the source index has an embedding field (this is an ESS deployment, so we can see it):

        "embedding_field": {
          "type": "semantic_text",
          "inference_id": ".elser-2-elasticsearch"
        }

The reindex is started with:

POST _reindex?wait_for_completion=false
{
  "source": { "index": "my-source-index" },
  "dest": { "index": "my-dest-index" }
}

The reindex task keeps running without any errors, but also without any progress; GET _tasks/<task id> will show the task.
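
As a sketch of what that check looks like (the task ID placeholder is whatever the reindex call above returned), the task can be found either by ID or by listing running reindex tasks:

GET _tasks?detailed=true&actions=*reindex

GET _tasks/<task id>

In the stuck case the created and updated counters in the task's status typically stay at 0 while total matches the source document count, so the task looks alive but makes no progress.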

ES logs will show a WARN:

[<time>][WARN ][org.elasticsearch.xpack.ml.inference.adaptiveallocations.AdaptiveAllocationsScalerService] [<instance>] adaptive allocations scaler: scaling [.elser-2-elasticsearch] to [4] allocations failed.
org.elasticsearch.ElasticsearchStatusException: Could not update deployment because there are not enough resources to provide all requested allocations
	at org.elasticsearch.xpack.ml.inference.assignment.TrainedModelAssignmentClusterService.increaseNumberOfAllocations(TrainedModelAssignmentClusterService.java:994) ~[?:?]
	at org.elasticsearch.xpack.ml.inference.assignment.TrainedModelAssignmentClusterService.lambda$updateAssignment$18(TrainedModelAssignmentClusterService.java:956) ~[?:?]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:956) ~[elasticsearch-8.17.3.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
	at java.lang.Thread.run(Thread.java:1575) ~[?:?]
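
As an additional diagnostic sketch (not part of the original report), the "not enough resources" error can be cross-checked against the ML memory stats API, which reports per node how much memory ML can use and how much is already taken; this can then be compared with the model's required_native_memory_bytes shown below:

GET _ml/memory/_stats?human=true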

The ML node has 4GB of RAM:

GET _ml/trained_models/_stats
{
  "count": 3,
  "trained_model_stats": [
    {
      "model_id": ".elser_model_2",
      "model_size_stats": {
        "model_size_bytes": 438123914,
        "required_native_memory_bytes": 2101346304
      },
      "pipeline_count": 0,
      "inference_stats": {
        "failure_count": 0,
        "inference_count": 0,
        "cache_miss_count": 0,
        "missing_all_fields_count": 0,
        "timestamp": 1756456905350
      },
      "deployment_stats": {
        "deployment_id": ".elser-2-elasticsearch",
        "model_id": ".elser_model_2",
        "threads_per_allocation": 1,
        "number_of_allocations": 0,
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        },
        "queue_capacity": 10000,
        "state": "started",
        "allocation_status": {
          "allocation_count": 0,
          "target_allocation_count": 0,
          "state": "fully_allocated"
        },
        "cache_size": "417.8mb",
        "priority": "normal",
        "start_time": 1751635728892,
        "peak_throughput_per_minute": 0,
        "nodes": []
      }
    },
    {
      "model_id": ".elser_model_2_linux-x86_64",
      "model_size_stats": {
        "model_size_bytes": 274756282,
        "required_native_memory_bytes": 2101346304
      },
      "pipeline_count": 1,
      "ingest": {
        "total": {
          "count": 0,
          "time_in_millis": 0,
          "current": 0,
          "failed": 0
        },
        "pipelines": {
          ".kibana-observability-ai-assistant-kb-ingest-pipeline": {
            "count": 0,
            "time_in_millis": 0,
            "current": 0,
            "failed": 0,
            "ingested_as_first_pipeline_in_bytes": 0,
            "produced_as_first_pipeline_in_bytes": 0,
            "processors": [
              {
                "inference": {
                  "type": "inference",
                  "stats": {
                    "count": 0,
                    "time_in_millis": 0,
                    "current": 0,
                    "failed": 0
                  }
                }
              }
            ]
          }
        }
      },
      "inference_stats": {
        "failure_count": 0,
        "inference_count": 0,
        "cache_miss_count": 0,
        "missing_all_fields_count": 0,
        "timestamp": 1756456905350
      },
      "deployment_stats": {
        "deployment_id": "my-elser-endpoint",
        "model_id": ".elser_model_2_linux-x86_64",
        "threads_per_allocation": 1,
        "number_of_allocations": 1,
        "queue_capacity": 10000,
        "state": "started",
        "allocation_status": {
          "allocation_count": 1,
          "target_allocation_count": 1,
          "state": "fully_allocated"
        },
        "cache_size": "262mb",
        "priority": "normal",
        "start_time": 1750851473245,
        "peak_throughput_per_minute": 0,
        "nodes": [
        ]
      }
    },
    {
      "model_id": "lang_ident_model_1",
      "model_size_stats": {
        "model_size_bytes": 1053992,
        "required_native_memory_bytes": 0
      },
      "pipeline_count": 0
    }
  ]
}

So .elser-2-elasticsearch is not allocated (number_of_allocations is 0 and the nodes list is empty). It is not obvious to the user that ML node autoscaling must be enabled for the deployment to scale up and handle the reindex.
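
As a hedged illustration of a possible mitigation (not something this issue proposes), once there is enough ML capacity the deployment's adaptive allocations can be adjusted via the update trained model deployment API so the endpoint keeps at least one allocation warm. The deployment ID and max value below are taken from the stats above; this assumes the default .elser-2-elasticsearch deployment accepts updates through this API and that the cluster version supports the adaptive_allocations object in it:

POST _ml/trained_models/.elser-2-elasticsearch/deployment/_update
{
  "adaptive_allocations": {
    "enabled": true,
    "min_number_of_allocations": 1,
    "max_number_of_allocations": 32
  }
}

On ESS the more fundamental fix is enabling ML autoscaling on the deployment itself, which is a Cloud deployment setting rather than an Elasticsearch API call; without it (or without a bigger ML node) the scaler will keep hitting the "not enough resources" error shown above.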

Metadata

Assignees: No one assigned

Labels: :ml (Machine learning), >bug, Team:ML (Meta label for the ML team)
