
[RFC] Support One to One Inference in ML Inference Search Response Processor #2879

Closed

mingshl opened this issue Sep 3, 2024 · 2 comments
Labels: 2.17, enhancement (New feature or request)

mingshl (Collaborator) commented Sep 3, 2024

Is your feature request related to a problem?

Problem Statement

The current implementation of the ML Inference Search Response Processor in OpenSearch 2.16 supports many-to-one inference, where multiple documents are collected into a list and sent as a single prediction request to the machine learning model. However, there are scenarios where users may want to perform one-to-one inference, where each document is sent as a separate prediction request to the model.

Some use cases for one-to-one inference include:

Reranking: In reranking scenarios, such as using XGBoost for ranking, the model typically takes a single document and compares it with the search string to return a single score. Sending multiple documents in a single request may not be suitable for such use cases.

Models with Single Input: Some machine learning models, like the Bedrock embedding model, accept a single string as input. In such cases, sending multiple documents in a single request may not be compatible with the model's input requirements.

Customized Inference Logic: There may be scenarios where users need to perform customized inference logic on each document individually, which may not be possible with the many-to-one approach.

Solution Proposal

To address the need for one-to-one inference, we propose adding a new configuration option, `one_to_one`, to the ML Inference Search Response Processor. This option lets users choose between many-to-one inference (the current default behavior) and one-to-one inference.

When `one_to_one` is set to `true`, the processor will handle the search response as follows:

1. Separate the search response into individual one-hit search responses, each containing a single document.
2. For each one-hit search response, create a separate prediction request and send it to the machine learning model.
3. After receiving the prediction results for each document, combine the individual responses back into a single search response with the updated documents.

This approach ensures that each document is processed individually by the machine learning model, enabling support for use cases such as reranking and models that accept a single input.
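
For illustration, a search pipeline using this option might be defined as follows. This is a minimal sketch: the model ID, document field, and `input_map`/`output_map` mappings are placeholders, and the surrounding fields follow the existing `ml_inference` response processor configuration.

```json
PUT /_search/pipeline/my_inference_pipeline
{
  "response_processors": [
    {
      "ml_inference": {
        "model_id": "<your_model_id>",
        "input_map": [
          {
            "input": "passage_text"
          }
        ],
        "output_map": [
          {
            "passage_embedding": "embedding"
          }
        ],
        "one_to_one": true
      }
    }
  ]
}
```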

What solution would you like?

The proposed solution will involve the following changes:

1. Modify the `MLInferenceSearchResponseProcessor` class to introduce the `one_to_one` configuration option and handle the logic for separating and combining search responses.
2. Update the `processResponseAsync` method to handle the one-to-one inference flow, including creating individual prediction requests and combining the results.
3. Introduce new helper methods or classes as needed to facilitate the separation and combination of search responses.
4. Update the documentation and examples to reflect the new `one_to_one` configuration option and its usage.

By implementing this solution, users will have the flexibility to choose between many-to-one inference (the current default behavior) and one-to-one inference, depending on their specific use case and model requirements.
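
At query time, the pipeline would be applied the same way as any other search pipeline; for example (index, field, and pipeline names are placeholders):

```json
GET /my_index/_search?search_pipeline=my_inference_pipeline
{
  "query": {
    "match": {
      "passage_text": "semantic search with OpenSearch"
    }
  }
}
```

With `one_to_one` enabled, each hit returned by this query triggers its own prediction request, rather than all hits being batched into a single request.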

Do you have any additional context?
META Issue: #2839
RFC for ML Inference Processors: #2173

mingshl added the enhancement, untriaged, and 2.17 labels and removed the untriaged label on Sep 3, 2024
mingshl self-assigned this on Sep 3, 2024
mingshl (Collaborator, Author) commented Sep 3, 2024

#2801

mingshl (Collaborator, Author) commented Sep 24, 2024

Released in OpenSearch 2.17, closing.

@mingshl mingshl closed this as completed Sep 24, 2024