-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Add Ibm Granite Completion and Chat Completion support #129146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add Ibm Granite Completion and Chat Completion support #129146
Conversation
Evgenii-Kazannik
commented
Jun 9, 2025
- Extend watsonx ai with completion and chat completion tasks in order to use corresponding IBM Granite models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few suggestions.
* @return A formatted error message. | ||
*/ | ||
public static String buildErrorMessage(TaskType requestType, String inferenceId) { | ||
return format("Failed to send Ibm Watsonx %s request from inference entity id [%s]", requestType.toString(), inferenceId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return format("Failed to send Ibm Watsonx %s request from inference entity id [%s]", requestType.toString(), inferenceId); | |
return format("Failed to send IBM Watsonx %s request from inference entity id [%s]", requestType.toString(), inferenceId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thank you
protected IbmWatsonxEmbeddingsRequestManager getEmbeddingsRequestManager( | ||
IbmWatsonxEmbeddingsModel model, | ||
Truncator truncator, | ||
ThreadPool threadPool | ||
) { | ||
return new IbmWatsonxEmbeddingsRequestManager(model, truncator, threadPool); | ||
} | ||
|
||
/** | ||
* Builds an error message for Ibm Watsonx actions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Builds an error message for Ibm Watsonx actions. | |
* Builds an error message for IBM Watsonx actions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tnx. Applied as suggested
import org.elasticsearch.xpack.inference.services.ibmwatsonx.embeddings.IbmWatsonxEmbeddingsModel; | ||
import org.elasticsearch.xpack.inference.services.ibmwatsonx.rerank.IbmWatsonxRerankModel; | ||
|
||
import java.util.Map; | ||
|
||
/** | ||
* Interface for creating {@link ExecutableAction} instances for Watsonx models. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO
From here and further down in logs and javadoc "IBM Watsonx" should be used instead of "Ibm Watsonx". It should be human readable. While class and variable names should stay camel cased.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Thank you
|
||
/** | ||
* Accepts a visitor to create an executable action. The returned action will not return documents in the response. | ||
* @param visitor _ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have seen that underscore was used for some other param descriptions earlier, but it doesn't provide any useful information. I think it should be replaced with proper description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, Jan. Done
/** | ||
* Rate limits are defined at | ||
* <a href="https://www.ibm.com/docs/en/watsonx/saas?topic=learning-watson-machine-plans">Watson Machine Learning plans</a>. | ||
* For Lite plan, you've 120 requests per minute. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Original wording seems a bit off to me. I'd change rerank one as well.
* For Lite plan, you've 120 requests per minute. | |
* For the Lite plan, the limit is 120 requests per minute. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I looked through and applied the suggestions.
- unify naming: IBM Watsonx
- use suggested comments
- replace a visitor param undescore ( _ ) with a definition
public class IbmWatsonxActionCreator implements IbmWatsonxActionVisitor { | ||
private final Sender sender; | ||
private final ServiceComponents serviceComponents; | ||
|
||
static final String COMPLETION_REQUEST_TYPE = "IBM WatsonX completions"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussion in dms it was found that platform is called Watsonx not WatsonX. Could you please unify naming. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Unified the naming as IBM Watsonx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of a nitpick but the platform seems to be called IBM watsonx
(from their website) but this change updates it everywhere to IBM Watsonx
. Can we keep consistent with IBM's capitalization to avoid confusion?
Pinging @elastic/search-experiences-team (Team:Search - Experiences) |
Pinging @elastic/search-eng (Team:SearchOrg) |
Pinging @elastic/ml-core (Team:ML) |
…hat-completion # Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work. I'd like to also manually test this change. Would you be able to provide some information in the PR description about how you manually tested such as some example API calls to help me get started with the testing?
|
||
@Override | ||
public int rateLimitGroupingHash() { | ||
return Objects.hash(uri); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this not need to include the rateLimitServiceSettings
?
|
||
private final IbmWatsonxRateLimitServiceSettings rateLimitServiceSettings; | ||
|
||
protected URI uri; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is URI only going to be used for completion/chat completion use cases? If yes, can it be in the completion model implementation instead?
import java.util.Map; | ||
import java.util.Objects; | ||
|
||
public abstract class IbmWatsonxModel extends Model { | ||
public abstract class IbmWatsonxModel extends RateLimitGroupingModel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify why this needs to be a RateLimitGroupingModel
?
public class IbmWatsonxActionCreator implements IbmWatsonxActionVisitor { | ||
private final Sender sender; | ||
private final ServiceComponents serviceComponents; | ||
|
||
static final String COMPLETION_REQUEST_TYPE = "IBM WatsonX completions"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of a nitpick but the platform seems to be called IBM watsonx
(from their website) but this change updates it everywhere to IBM Watsonx
. Can we keep consistent with IBM's capitalization to avoid confusion?
@@ -109,8 +109,8 @@ public DefaultSecretSettings getSecretSettings() { | |||
|
|||
/** | |||
* Accepts a visitor to create an executable action. The returned action will not return documents in the response. | |||
* @param visitor _ | |||
* @param taskSettings _ | |||
* @param visitor Interface for creating {@link ExecutableAction} instances for IBM Voyage AI models. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should just be Voyage AI
instead of IBM Voyage AI
.
private static final String API_COMPLETIONS_PATH = "https://abc.com/ml/v1/text/chat?version=apiVersion"; | ||
|
||
public void testCreateRequest_WithStreaming() throws IOException, URISyntaxException { | ||
var request = createRequest("secret", randomAlphaOfLength(15), "model", true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use randomized strings when possible? (ex. "secret", "model", etc)
private static final String AUTH_HEADER_VALUE = "foo"; | ||
private static final String API_COMPLETIONS_PATH = "https://abc.com/ml/v1/text/chat?version=apiVersion"; | ||
|
||
public void testCreateRequest_WithStreaming() throws IOException, URISyntaxException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a test for creating a request without streaming?
return new IbmWatsonxChatCompletionWithoutAuthRequest(new UnifiedChatInput(List.of(input), "user", stream), chatCompletionModel); | ||
} | ||
|
||
private static class IbmWatsonxChatCompletionWithoutAuthRequest extends IbmWatsonxChatCompletionRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify why we need to create a WithoutAuth
version of the request?
import static org.elasticsearch.xpack.inference.MatchersUtils.equalToIgnoringWhitespaceInJsonString; | ||
import static org.hamcrest.Matchers.is; | ||
|
||
public class IbmWatsonxChatCompletionServiceSettingsTests extends AbstractWireSerializingTestCase<IbmWatsonxChatCompletionServiceSettings> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add tests for fromMap
for the non-happy cases (ex. modeld missing, projectId missing, URL missing, etc.)? Same goes for cases where optional values aren't set (ex.falling back to default rate limit settings when none are provided)?
import static org.mockito.Mockito.doThrow; | ||
import static org.mockito.Mockito.mock; | ||
|
||
public class IbmWatsonxChatCompletionActionTests extends ESTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this test class is either identical or almost identical to some of our other ...ChatCompletionActionTests
classes with the exception of the createAction
function and the responsJson
we define (example). To reduce duplication can we create a base class ChatCompletionActionTests extends ESTestCase
with all the shared code (ex. the tests) and just have each service have a IbmWatsonxChatCompletionActionTests extends ChatCompletionActionTests
which overrides a createAction
, createResponseJson
, etc. set of functions?