[ML] Fix issues in dynamically reading the number of allocations #115095

Open
wants to merge 2 commits into
base: 8.16
Conversation

@davidkyle (Member) commented Oct 18, 2024

Relates to problems in the GET inference API, which should dynamically update the num_allocations field with the actual number of allocations from the deployed model. This is required for adaptive allocations, where the value changes at runtime.

The num_allocations field in service_settings is updated with the current value.

GET _inference/elser_on_ml

{
  "endpoints": [
    {
      "inference_id": "elser_on_ml",
      "task_type": "sparse_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "num_allocations": 3,            <-- this field is dynamic
        "num_threads": 1,
        "model_id": ".elser_model_2",
        "deployment_id": ".elser_model_2_for_me"
      },
      "chunking_settings": {
        "strategy": "sentence",
        "max_chunk_size": 250,
        "sentence_overlap": 1
      }
    }
  ]
}
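
A minimal sketch of the idea described above, not the code in this PR: the stored service settings are overlaid with the live allocation count reported for each deployment. The `ServiceSettings` class, the `overlay` method and the map of live counts keyed by deployment ID are all hypothetical names introduced for illustration; the real Elasticsearch classes and signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical, immutable stand-in for the endpoint's service_settings. */
final class ServiceSettings {
    final int numAllocations;
    final int numThreads;
    final String modelId;
    final String deploymentId;

    ServiceSettings(int numAllocations, int numThreads, String modelId, String deploymentId) {
        this.numAllocations = numAllocations;
        this.numThreads = numThreads;
        this.modelId = modelId;
        this.deploymentId = deploymentId;
    }

    /** Returns a copy with only num_allocations changed. */
    ServiceSettings withNumAllocations(int live) {
        return new ServiceSettings(live, numThreads, modelId, deploymentId);
    }
}

final class AllocationOverlaySketch {
    /**
     * Replaces the configured num_allocations with the live value for every
     * deployment that has one; settings without a live value are returned as-is.
     * The liveAllocations map (deployment ID -> current allocations) is assumed
     * to come from deployment statistics gathered elsewhere.
     */
    static List<ServiceSettings> overlay(List<ServiceSettings> configured, Map<String, Integer> liveAllocations) {
        List<ServiceSettings> updated = new ArrayList<>(configured.size());
        for (ServiceSettings settings : configured) {
            Integer live = liveAllocations.get(settings.deploymentId);
            updated.add(live == null ? settings : settings.withNumAllocations(live));
        }
        return updated;
    }
}
```

Returning copies rather than mutating the stored settings keeps the persisted configuration separate from the value reported at read time, which is one reasonable design under these assumptions.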

The first issue is that GroupedActionListener throws if called with size == 0. The code now guards against this by skipping the model update when the list is empty.
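
The guard can be sketched as follows. This is an illustration only: `GroupedCollector` is a hypothetical stand-in that mimics a grouped listener rejecting a group size of zero, and `parseModels` here works on plain strings rather than the real model types. Completing the listener with an empty list on the early-return path is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in: collects a fixed number of results, then notifies the delegate once. */
final class GroupedCollector<T> {
    private final int groupSize;
    private final Consumer<List<T>> delegate;
    private final List<T> results = new ArrayList<>();

    GroupedCollector(int groupSize, Consumer<List<T>> delegate) {
        // Mirrors the failure mode described above: an empty group is rejected.
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be greater than 0 but was " + groupSize);
        }
        this.groupSize = groupSize;
        this.delegate = delegate;
    }

    synchronized void onResponse(T result) {
        results.add(result);
        if (results.size() == groupSize) {
            delegate.accept(List.copyOf(results));
        }
    }
}

final class EmptyGuardSketch {
    static void parseModels(List<String> unparsedModels, Consumer<List<String>> listener) {
        // Guard: with nothing to update, respond immediately and never build the group.
        if (unparsedModels.isEmpty()) {
            listener.accept(List.of());
            return;
        }
        GroupedCollector<String> grouped = new GroupedCollector<>(unparsedModels.size(), listener);
        for (String model : unparsedModels) {
            grouped.onResponse("parsed:" + model); // placeholder for the per-model update
        }
    }
}
```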

The second issue is that the wrong field was being updated, so the update was not visible in the API response. Tests are added to cover both cases.
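
The description does not say which field was updated by mistake, so the following is only a hypothetical illustration of this class of bug: an update that writes to a value the response is never rendered from. Every name here (`EndpointSketch`, `cachedAllocations`, `updateWrongField`, `updateServiceSettings`) is invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EndpointSketch {
    // Hypothetical internal bookkeeping value; not part of the rendered response.
    private int cachedAllocations;
    // Hypothetical settings that the response is rendered from.
    private final Map<String, Object> serviceSettings = new LinkedHashMap<>();

    EndpointSketch(int numAllocations) {
        this.cachedAllocations = numAllocations;
        serviceSettings.put("num_allocations", numAllocations);
    }

    /** Buggy variant: changes a value that toResponse() never reads. */
    void updateWrongField(int live) {
        cachedAllocations = live;
    }

    /** Fixed variant: also updates the value the response is rendered from. */
    void updateServiceSettings(int live) {
        cachedAllocations = live;
        serviceSettings.put("num_allocations", live);
    }

    int cachedAllocations() {
        return cachedAllocations;
    }

    /** Renders an immutable snapshot of the response body. */
    Map<String, Object> toResponse() {
        return Map.of("service_settings", Map.copyOf(serviceSettings));
    }
}
```

A regression test for this kind of bug asserts on the rendered response rather than on internal state, since only the response shows whether the update reached the right field.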

Non-issue, as the code is not live.

@elasticsearchmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine added the Team:ML and v8.16.1 labels on Oct 18, 2024
@@ -126,6 +126,11 @@ private void getModelsByTaskType(TaskType taskType, ActionListener<GetInferenceM
    }

    private void parseModels(List<UnparsedModel> unparsedModels, ActionListener<GetInferenceModelAction.Response> listener) {
        if (unparsedModels.isEmpty()) {

@davidkyle (Member, Author)

Without this check, the GroupedActionListener was called with 0 requests, which throws an exception.

jgowdyelastic added a commit to elastic/kibana that referenced this pull request Oct 21, 2024
…ing inference endpoints (#196577)

When listing the inference endpoints available for the semantic text
field, we should only list `sparse_embedding` and `text_embedding`
types.

<img width="353" alt="image"
src="https://github.com/user-attachments/assets/95526f2b-e293-4e01-be79-b87e1ecb9a75">



This PR adds a check to the `data_visualizer/inference_endpoints`
endpoint to ensure only `sparse_embedding` and `text_embedding` types
are used and they have at least one allocation.
NOTE: the allocation check is currently commented out, pending an
Elasticsearch change: elastic/elasticsearch#115095

Also renames the endpoint from `data_visualizer/inference_services` to
`data_visualizer/inference_endpoints`,
and renames variables that were incorrectly named "service" rather than
"endpoint".
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 21, 2024
…ing inference endpoints (elastic#196577)

(cherry picked from commit fb412ca)
Labels: :ml, >non-issue, Team:ML, v8.16.0, v8.16.1
3 participants