@davidkyle commented Oct 21, 2024

Reinstates the ability to dynamically report the number of allocations for ml node models in the Inference API.

Reading the number of allocations in use dynamically on GET is useful when adaptive allocations are enabled, and when the model deployment is managed through the ml trained models APIs. The num_allocations field in service_settings is updated with the current value.

GET _inference/elser_on_ml

{
  "endpoints": [
    {
      "inference_id": "elser_on_ml",
      "task_type": "sparse_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "num_allocations": 3,            <-- this field is dynamic
        "num_threads": 1,
        "model_id": ".elser_model_2",
        "deployment_id": ".elser_model_2_for_me"
      },
      "chunking_settings": {
        "strategy": "sentence",
        "max_chunk_size": 250,
        "sentence_overlap": 1
      }
    }
  ]
}
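
For clients consuming this response, a minimal sketch of reading the dynamic value out of the body shown above. It only parses the JSON and makes no assumption about how the request is sent. The helper name `current_allocations` is hypothetical and not part of this PR.

```python
import json
from typing import Dict, Optional

# Sample body copied from the GET _inference/elser_on_ml response above
# (without the inline annotation, so that it is valid JSON).
RESPONSE_BODY = """
{
  "endpoints": [
    {
      "inference_id": "elser_on_ml",
      "task_type": "sparse_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "num_allocations": 3,
        "num_threads": 1,
        "model_id": ".elser_model_2",
        "deployment_id": ".elser_model_2_for_me"
      }
    }
  ]
}
"""


def current_allocations(body: str) -> Dict[str, Optional[int]]:
    """Map each inference_id to the num_allocations in its service_settings.

    Endpoints whose service_settings lack the field map to None.
    """
    parsed = json.loads(body)
    return {
        endpoint["inference_id"]: endpoint.get("service_settings", {}).get("num_allocations")
        for endpoint in parsed.get("endpoints", [])
    }


if __name__ == "__main__":
    print(current_allocations(RESPONSE_BODY))  # {'elser_on_ml': 3}
```

Because the value is read at request time, a client that needs the current allocation count should re-issue the GET rather than cache an earlier result.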

The change was originally reverted in 2697f85 due to an error calling the GroupedActionListener with 0 requests. That bug is fixed in cce77b9.
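
To illustrate the class of bug (this is a Python analogue, not the Elasticsearch code): a listener that completes only after collecting a fixed number of responses cannot sensibly be created for a group of size 0, so the caller has to handle the empty case before fanning out. All names here (`GroupedListener`, `fetch_allocations`, `get_stats`) are hypothetical, and rejecting a non-positive group size is an assumption of this sketch.

```python
from typing import Callable, Dict, List


class GroupedListener:
    """Collects `group_size` responses, then calls `on_done` once with all of them."""

    def __init__(self, group_size: int, on_done: Callable[[List[Dict]], None]) -> None:
        # Assumption of this sketch: a group of size 0 is rejected outright,
        # because no response would ever arrive to complete it.
        if group_size <= 0:
            raise ValueError(f"group_size must be greater than 0 but was {group_size}")
        self._remaining = group_size
        self._responses: List[Dict] = []
        self._on_done = on_done

    def on_response(self, response: Dict) -> None:
        self._responses.append(response)
        self._remaining -= 1
        if self._remaining == 0:
            self._on_done(self._responses)


def fetch_allocations(
    deployment_ids: List[str],
    get_stats: Callable[[str], Dict],
    on_done: Callable[[List[Dict]], None],
) -> None:
    """Look up stats for each deployment and report them together."""
    # Guard: with nothing to look up, complete immediately instead of
    # building a listener for zero requests.
    if not deployment_ids:
        on_done([])
        return
    listener = GroupedListener(len(deployment_ids), on_done)
    for deployment_id in deployment_ids:
        listener.on_response(get_stats(deployment_id))
```

The guard keeps the zero-request case on the normal success path with an empty result, rather than surfacing it as an error to the caller.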

This PR also includes bug fixes forward-ported from 8.16 that were added in #115095.

@elasticsearchmachine

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine added the Team:ML label Oct 21, 2024
@elasticsearchmachine

Hi @davidkyle, I've created a changelog YAML for you.

@davidkyle added the auto-backport label Oct 22, 2024

@jan-elastic left a comment

LGTM

@davidkyle merged commit c7f53ff into elastic:main Oct 22, 2024
16 checks passed
@elasticsearchmachine

💔 Backport failed

| Status | Branch | Result |
|---|---|---|
| | 8.x | Commit could not be cherrypicked due to conflicts |

You can use sqren/backport to manually backport by running `backport --upstream elastic/elasticsearch --pr 115233`

jgowdyelastic added a commit to elastic/kibana that referenced this pull request Oct 24, 2024
Enables the previously commented out check for `num_allocations` when
listing the inference endpoints.

The adaptive allocation count can drop to 0, but it is still valid for
use. Uploading a file will cause it to be re-deployed.

Related to es PRs elastic/elasticsearch#115233
and elastic/elasticsearch#115095

Follow on from #196577
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 24, 2024
Enables the previously commented out check for `num_allocations` when
listing the inference endpoints.

The adaptive allocation count can drop to 0, but it is still valid for
use. Uploading a file will cause it to be re-deployed.

Related to es PRs elastic/elasticsearch#115233
and elastic/elasticsearch#115095

Follow on from elastic#196577

(cherry picked from commit 66b2447)
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Oct 25, 2024
…15233)

The GET inference API should dynamically update the num_allocations field
with the actual number from the deployed model, which is useful when adaptive
allocations are used.
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Nov 4, 2024
…15233)

The GET inference API should dynamically update the num_allocations field
with the actual number from the deployed model, which is useful when adaptive
allocations are used.
Labels

auto-backport, backport pending, >enhancement, :ml, Team:ML, v8.17.0, v9.0.0