
feat: added ibm watsonx chat model and streaming to base llm model #3577

Open · wants to merge 5 commits into main

Conversation

@faileon (Contributor) commented Dec 7, 2023

Hi everyone,
As per the great PR #3399, based on #2378, I added my implementation of the ChatModel class and streaming capabilities to both the base LLM model and the chat model.

I am not very familiar with turborepo, so I hope this structure is fine. Please let me know if anything should be changed.

@dosubot (bot) added the size:XL label (This PR changes 500-999 lines, ignoring generated files.) Dec 7, 2023
vercel bot commented Dec 7, 2023

The latest updates on your projects:

Name | Status | Updated (UTC)
langchainjs-api-refs | ✅ Ready | Dec 13, 2023 1:22am
langchainjs-docs | ✅ Ready | Dec 13, 2023 1:22am

@dosubot (bot) added the auto:enhancement label (A large net-new component, integration, or chain. Use sparingly. The largest features) Dec 7, 2023
@@ -568,6 +568,9 @@
"chat_models/fake.cjs",

Hey there! I've noticed that this PR introduces new dependencies related to chat models. This comment is just to flag the change for maintainers to review the impact on peer/dev/hard dependencies. Great work on the PR!

@@ -0,0 +1,154 @@
import { getEnvironmentVariable } from "@langchain/core/utils/env";

Hey team, I've reviewed the code changes and noticed that environment variables are being accessed and read using getEnvironmentVariable and process.env. I've flagged this for your attention to ensure the handling of environment variables aligns with best practices. Great work overall!

@@ -1,49 +1,13 @@
import { BaseLLMCallOptions, BaseLLMParams, LLM } from "./base.js";

Hey team, I've flagged this PR for your review as it appears to add, access, or require environment variables for configuring the Watsonx AI interaction. Please take a look to ensure that the handling of environment variables is appropriate and secure.

@@ -0,0 +1,165 @@
import {

Hey team, this PR introduces a new external HTTP request using the fetch API in the WatsonApiClient class to obtain an IAM token from the IBM IAM token endpoint. This comment is flagging the change for maintainers to review the addition of this new request. Great work on the implementation!

@jacoblee93 (Collaborator) commented:

Thanks for your patience! @faileon @chasemcdo, do you think you could have a last look? Moved things around to account for recent refactoring.

});
}

protected _formatMessagesAsPrompt(messages: BaseMessage[]): string {
@jacoblee93 (Collaborator) · Dec 13, 2023:

We should make a Llama adapter or something at some point...

@faileon (PR author):

Yes, definitely, I was thinking about it too. Right now this is blatantly copy-pasted from the Ollama model and assumes the user only uses a Llama-based chat model. However, Watson offers the ability to run many models, including spinning up any model from HF...
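For context, a minimal sketch of the Llama-2-style formatting under discussion; the function name is illustrative and the tag handling loosely follows the Ollama-derived convention, not necessarily this PR's exact implementation:

import { BaseMessage } from "@langchain/core/messages";

// Illustrative only: maps LangChain message roles onto Llama-2 chat tags.
function formatMessagesAsLlamaPrompt(messages: BaseMessage[]): string {
  return messages
    .map((message) => {
      // content can be non-string in newer core versions; coerce for the sketch.
      const text = typeof message.content === "string" ? message.content : "";
      switch (message._getType()) {
        case "system":
          return `<<SYS>> ${text} <</SYS>>`;
        case "human":
          return `[INST] ${text} [/INST]`;
        default:
          // AI (and other) messages pass through as plain text.
          return text;
      }
    })
    .join("\n");
}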


constructor(fields: WatsonxAIParams) {
super(fields);

this.region = fields?.region ?? this.region;
Collaborator:

Isn't removing all of these a breaking change?

@faileon (PR author):

I moved a few attributes to the WatsonClient, as it's something I also needed in the ChatModel. I did remove lc_serializable = true; by accident and will add it back in.

I tried running the provided example at examples/src/llms/watsonx_ai.ts and it works like before.

Contributor:

It is breaking in the sense that anyone dependent on the old schema will need to update it, but most of the functionality is still there, now in a more reusable format.

One thing I notice is missing: there was originally an endpoint parameter that allowed the end user to override the entire URL if they so desired.

I had included this since the production endpoint is still labeled as "beta", so this would allow someone to migrate to a different (potentially more stable) endpoint as soon as it is released without the need to submit a PR to LangChain.

Thoughts on reintroducing that @faileon ?

@jacoblee93 (Collaborator) · Dec 13, 2023:

Not sure what current usage is, but would love a shim for principle's sake. I can try to have a look later if you don't have time.

@faileon (PR author):

Ah, I get it now; yes, this would be breaking for someone using the old schema. I can revert it and make it backwards compatible.

Regarding the endpoint parameter, that's a good catch. I agree it should be configurable, as it will most likely change once IBM releases a non-beta version.

I have some free time during this weekend again to work on it.

@jacoblee93 (Collaborator) · Dec 14, 2023:

Thank you! This is a very recent integration, but we really try to not have any breaking changes in patches unless they are completely unavoidable.

@jacoblee93 added the close label (PRs that need one or two touch-ups to be ready) and removed the lgtm label (PRs that are ready to be merged as-is) Dec 13, 2023
@chasemcdo (Contributor) left a comment:

Looks awesome! Thanks for reverse-engineering that.

urlTokenParams.append(
"grant_type",
"urn:ibm:params:oauth:grant-type:apikey"
async *_streamResponseChunks(
Contributor:

I may be missing something, but it seems that while we have the stream function for the LLM, it isn't actually usable in its current state?

I was expecting something closer to the sagemaker implementation which would allow easy toggling of streaming.

Though the present _call function appears to directly call generate_text.

@faileon (PR author):

I was originally inspired by the way the Ollama connector handles it, but let me take another look at this. I only tested the chat model I implemented for streaming and invoking.
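For reference, a rough sketch of the kind of toggle being asked for, loosely modeled on how other integrations gate streaming; this.streaming and this.generateText are assumed names here, not the PR's actual code:

// import { CallbackManagerForLLMRun } from "@langchain/core/callbacks/manager";
// Fragment of the LLM class; assumes a `streaming` flag and a hypothetical
// `generateText` helper wrapping the generate_text endpoint.
async _call(
  prompt: string,
  options: this["ParsedCallOptions"],
  runManager?: CallbackManagerForLLMRun
): Promise<string> {
  if (this.streaming) {
    // Streaming path: drain the chunk generator and aggregate the text.
    const chunks: string[] = [];
    for await (const chunk of this._streamResponseChunks(
      prompt,
      options,
      runManager
    )) {
      chunks.push(chunk.text);
    }
    return chunks.join("");
  }
  // Non-streaming path: a single generate_text request.
  return this.generateText(prompt, options);
}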

top_k?: number;
top_p?: number;
repetition_penalty?: number;
}
Contributor:

I had originally left this undefined to avoid missing parameters, but it looks like you've done a great job of adding them!

One thing to add might be the random seed; I'm not sure what the expected argument is off the top of my head.

@faileon (PR author):

Good catch, I didn't see the random seed. I will update the model.


Comment on lines +3 to +12
export interface WatsonModelParameters {
decoding_method?: "sample" | "greedy";
max_new_tokens?: number;
min_new_tokens?: number;
stop_sequences?: string[];
temperature?: number;
top_k?: number;
top_p?: number;
repetition_penalty?: number;
}
Contributor:

Did a bit more digging, and it looks like the random seed could be added like this:

Suggested change:

  export interface WatsonModelParameters {
    decoding_method?: "sample" | "greedy";
    max_new_tokens?: number;
    min_new_tokens?: number;
    stop_sequences?: string[];
    temperature?: number;
    top_k?: number;
    top_p?: number;
    repetition_penalty?: number;
+   random_seed?: number;
  }

@faileon (PR author):

Awesome, thanks! I will update it.

@bracesproul (Collaborator) left a comment:

Thanks for this, the code is looking pretty good! I left a few comments with some requests; once you've implemented them, please re-request my review.

@@ -217,6 +217,7 @@ const entrypoints = {
"chat_models/llama_cpp": "chat_models/llama_cpp",
"chat_models/yandex": "chat_models/yandex",
"chat_models/fake": "chat_models/fake",
"chat_models/watsonx_ai": "chat_models/watsonx_ai",
Collaborator:

We are no longer adding integrations to the langchain package. Instead, export only inside langchain-community, from its create-entrypoints file, and remove the re-exports from the main langchain package.
Thank you!
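Concretely, that would mean an entry like the following in langchain-community's create-entrypoints file (the path and the surrounding map shape are assumed to mirror the langchain map shown in the diff above):

// libs/langchain-community/scripts/create-entrypoints.js (assumed path)
const entrypoints = {
  // ...existing community entrypoints...
  "chat_models/watsonx_ai": "chat_models/watsonx_ai",
};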

@@ -0,0 +1 @@
export * from "@langchain/community/chat_models/watsonx_ai";
Collaborator:

Drop this file.

export class WatsonxAIChat extends SimpleChatModel {
private readonly watsonApiClient: WatsonApiClient;

readonly modelId!: string;
Collaborator:

Don't use definite assignment assertions; instead, make it a non-nullable type and add checks in the constructor that verify the value is defined.

These can come back to bite you in production (I've done this before and it was not fun to debug 😅)
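A minimal sketch of the requested pattern, using the modelId field from the snippet above (the class name, error message, and required-ness are illustrative):

// Sketch: a non-nullable field that the constructor must assign or throw on.
export class WatsonxAIChatSketch {
  readonly modelId: string;

  constructor(fields?: { modelId?: string }) {
    if (fields?.modelId === undefined) {
      throw new Error("modelId is required for WatsonxAIChat.");
    }
    // TypeScript now knows modelId is assigned; no `!` assertion needed.
    this.modelId = fields.modelId;
  }
}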


readonly projectId!: string;

constructor(fields: WatsonxAIParams & BaseChatModelParams) {
Collaborator:

Let's redefine this as an interface above the class:

interface WatsonxAIChatParams extends WatsonxAIParams, BaseChatModelParams {};

This way it's easy to:

  1. export the class params type for use outside this file
  2. add extra properties in the future.

Also, I believe all the properties on those two interfaces are optional, so we should be able to do:

Suggested change:
- constructor(fields: WatsonxAIParams & BaseChatModelParams) {
+ constructor(fields?: WatsonxAIChatParams) {

return chunks.join("");
}

override async *_streamResponseChunks(
Collaborator:

I don't think you need the override here.

Comment on lines +106 to +108
_messages: BaseMessage[],
_options: this["ParsedCallOptions"],
_runManager?: CallbackManagerForLLMRun
Collaborator:

Drop the underscore if they're being used. Typically, variables prefixed with an underscore are unused, and the underscore is used to bypass a lint rule for no-unused-variables.

* The WatsonxAIParams interface defines the input parameters for
* the WatsonxAI class.
*/
export interface WatsonxAIParams extends BaseLLMParams {
Collaborator:

If you move this interface declaration, you'll need to re-export it from this file to keep backwards compatibility.
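For instance, a one-line shim kept in this file (the source path below is hypothetical and depends on where the interface actually moves):

// Hypothetical path; point this at the interface's new home.
export type { WatsonxAIParams } from "./watsonx_client.js";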

export class WatsonApiClient {
private iamToken?: IAMTokenResponse;

private readonly apiKey!: string;
Collaborator:

Same as the comment above: no definite assignment assertions.

Comment on lines +69 to +75
const formBody = Object.entries(payload).reduce((acc, [key, value]) => {
const encodedKey = encodeURIComponent(key as string);
const encodedValue = encodeURIComponent(value as string);
acc.push(`${encodedKey}=${encodedValue}`);
return acc;
}, [] as string[]);
const body = formBody.join("&");
Collaborator:

Could this be better handled with new URLSearchParams instead?
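For illustration, the URLSearchParams equivalent, assuming payload is a flat record of strings as in the quoted snippet (the apikey value is a placeholder, and grant_type is taken from the quoted code above):

// Equivalent form-encoding via URLSearchParams instead of a manual reduce.
const payload: Record<string, string> = {
  grant_type: "urn:ibm:params:oauth:grant-type:apikey",
  apikey: "YOUR_API_KEY", // placeholder
};
const body = new URLSearchParams(payload).toString();
// e.g. "grant_type=urn%3Aibm%3Aparams%3Aoauth%3Agrant-type%3Aapikey&apikey=YOUR_API_KEY"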

Labels

auto:enhancement (A large net-new component, integration, or chain. Use sparingly. The largest features)
close (PRs that need one or two touch-ups to be ready)
question (Further information is requested)
size:XL (This PR changes 500-999 lines, ignoring generated files.)
4 participants