Skip to content

Add Ibm Granite Completion and Chat Completion support #129146

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Evgenii-Kazannik
Copy link
Contributor

  • Extend watsonx ai with completion and chat completion tasks in order to use corresponding IBM Granite models

@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.1.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Jun 9, 2025
@AI-IshanBhatt AI-IshanBhatt added the :SearchOrg/Experiences Label for the Search Experiences team label Jun 10, 2025
Copy link
Contributor

@Jan-Kazlouski-elastic Jan-Kazlouski-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few suggestions.

* @return A formatted error message.
*/
public static String buildErrorMessage(TaskType requestType, String inferenceId) {
return format("Failed to send Ibm Watsonx %s request from inference entity id [%s]", requestType.toString(), inferenceId);
Copy link
Contributor

@Jan-Kazlouski-elastic Jan-Kazlouski-elastic Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return format("Failed to send Ibm Watsonx %s request from inference entity id [%s]", requestType.toString(), inferenceId);
return format("Failed to send IBM Watsonx %s request from inference entity id [%s]", requestType.toString(), inferenceId);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thank you

protected IbmWatsonxEmbeddingsRequestManager getEmbeddingsRequestManager(
IbmWatsonxEmbeddingsModel model,
Truncator truncator,
ThreadPool threadPool
) {
return new IbmWatsonxEmbeddingsRequestManager(model, truncator, threadPool);
}

/**
* Builds an error message for Ibm Watsonx actions.
Copy link
Contributor

@Jan-Kazlouski-elastic Jan-Kazlouski-elastic Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Builds an error message for Ibm Watsonx actions.
* Builds an error message for IBM Watsonx actions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tnx. Applied as suggested

import org.elasticsearch.xpack.inference.services.ibmwatsonx.embeddings.IbmWatsonxEmbeddingsModel;
import org.elasticsearch.xpack.inference.services.ibmwatsonx.rerank.IbmWatsonxRerankModel;

import java.util.Map;

/**
* Interface for creating {@link ExecutableAction} instances for Watsonx models.
Copy link
Contributor

@Jan-Kazlouski-elastic Jan-Kazlouski-elastic Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO
From here and further down in logs and javadoc "IBM Watsonx" should be used instead of "Ibm Watsonx". It should be human readable. While class and variable names should stay camel cased.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Thank you


/**
* Accepts a visitor to create an executable action. The returned action will not return documents in the response.
* @param visitor _
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have seen that underscore was used for some other param descriptions earlier, but it doesn't provide any useful information. I think it should be replaced with proper description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, Jan. Done

/**
* Rate limits are defined at
* <a href="https://www.ibm.com/docs/en/watsonx/saas?topic=learning-watson-machine-plans">Watson Machine Learning plans</a>.
* For Lite plan, you've 120 requests per minute.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original wording seems a bit off to me. I'd change rerank one as well.

Suggested change
* For Lite plan, you've 120 requests per minute.
* For the Lite plan, the limit is 120 requests per minute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I looked through and applied the suggestions.

  • unify naming: IBM Watsonx
  • use suggested comments
  • replace a visitor param undescore ( _ ) with a definition

public class IbmWatsonxActionCreator implements IbmWatsonxActionVisitor {
private final Sender sender;
private final ServiceComponents serviceComponents;

static final String COMPLETION_REQUEST_TYPE = "IBM WatsonX completions";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion in dms it was found that platform is called Watsonx not WatsonX. Could you please unify naming. Thank you!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Unified the naming as IBM Watsonx

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit of a nitpick but the platform seems to be called IBM watsonx (from their website) but this change updates it everywhere to IBM Watsonx. Can we keep consistent with IBM's capitalization to avoid confusion?

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/search-experiences-team (Team:Search - Experiences)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/search-eng (Team:SearchOrg)

@Samiul-TheSoccerFan Samiul-TheSoccerFan added Team:ML Meta label for the ML team and removed :SearchOrg/Experiences Label for the Search Experiences team labels Jun 10, 2025
@elasticsearchmachine elasticsearchmachine removed the Team:ML Meta label for the ML team label Jun 10, 2025
@Samiul-TheSoccerFan Samiul-TheSoccerFan added :ml Machine learning Team:ML Meta label for the ML team and removed needs:triage Requires assignment of a team area label labels Jun 10, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

…hat-completion

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
Copy link
Member

@dan-rubinstein dan-rubinstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. I'd like to also manually test this change. Would you be able to provide some information in the PR description about how you manually tested such as some example API calls to help me get started with the testing?


@Override
public int rateLimitGroupingHash() {
return Objects.hash(uri);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this not need to include the rateLimitServiceSettings?


private final IbmWatsonxRateLimitServiceSettings rateLimitServiceSettings;

protected URI uri;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is URI only going to be used for completion/chat completion use cases? If yes, can it be in the completion model implementation instead?

import java.util.Map;
import java.util.Objects;

public abstract class IbmWatsonxModel extends Model {
public abstract class IbmWatsonxModel extends RateLimitGroupingModel {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify why this needs to be a RateLimitGroupingModel?

public class IbmWatsonxActionCreator implements IbmWatsonxActionVisitor {
private final Sender sender;
private final ServiceComponents serviceComponents;

static final String COMPLETION_REQUEST_TYPE = "IBM WatsonX completions";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit of a nitpick but the platform seems to be called IBM watsonx (from their website) but this change updates it everywhere to IBM Watsonx. Can we keep consistent with IBM's capitalization to avoid confusion?

@@ -109,8 +109,8 @@ public DefaultSecretSettings getSecretSettings() {

/**
* Accepts a visitor to create an executable action. The returned action will not return documents in the response.
* @param visitor _
* @param taskSettings _
* @param visitor Interface for creating {@link ExecutableAction} instances for IBM Voyage AI models.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should just be Voyage AI instead of IBM Voyage AI.

private static final String API_COMPLETIONS_PATH = "https://abc.com/ml/v1/text/chat?version=apiVersion";

public void testCreateRequest_WithStreaming() throws IOException, URISyntaxException {
var request = createRequest("secret", randomAlphaOfLength(15), "model", true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use randomized strings when possible? (ex. "secret", "model", etc)

private static final String AUTH_HEADER_VALUE = "foo";
private static final String API_COMPLETIONS_PATH = "https://abc.com/ml/v1/text/chat?version=apiVersion";

public void testCreateRequest_WithStreaming() throws IOException, URISyntaxException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a test for creating a request without streaming?

return new IbmWatsonxChatCompletionWithoutAuthRequest(new UnifiedChatInput(List.of(input), "user", stream), chatCompletionModel);
}

private static class IbmWatsonxChatCompletionWithoutAuthRequest extends IbmWatsonxChatCompletionRequest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify why we need to create a WithoutAuth version of the request?

import static org.elasticsearch.xpack.inference.MatchersUtils.equalToIgnoringWhitespaceInJsonString;
import static org.hamcrest.Matchers.is;

public class IbmWatsonxChatCompletionServiceSettingsTests extends AbstractWireSerializingTestCase<IbmWatsonxChatCompletionServiceSettings> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add tests for fromMap for the non-happy cases (ex. modeld missing, projectId missing, URL missing, etc.)? Same goes for cases where optional values aren't set (ex.falling back to default rate limit settings when none are provided)?

import static org.mockito.Mockito.doThrow;
import static org.mockito.Mockito.mock;

public class IbmWatsonxChatCompletionActionTests extends ESTestCase {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this test class is either identical or almost identical to some of our other ...ChatCompletionActionTests classes with the exception of the createAction function and the responsJson we define (example). To reduce duplication can we create a base class ChatCompletionActionTests extends ESTestCase with all the shared code (ex. the tests) and just have each service have a IbmWatsonxChatCompletionActionTests extends ChatCompletionActionTests which overrides a createAction, createResponseJson, etc. set of functions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement external-contributor Pull request authored by a developer outside the Elasticsearch team :ml Machine learning Team:ML Meta label for the ML team v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants