Skip to content

Conversation

@anadi45
Copy link
Contributor

@anadi45 anadi45 commented May 26, 2025

Fixes # (#8221)

  1. This would return base64 string and not decode it into number[]. (openai-node sdk behaviour)

@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label May 26, 2025
@vercel
Copy link

vercel bot commented May 26, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchainjs-docs ❌ Failed (Inspect) Jul 5, 2025 9:43am
1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchainjs-api-refs ⬜️ Ignored (Inspect) Jul 5, 2025 9:43am

@vercel vercel bot temporarily deployed to Preview – langchainjs-docs May 26, 2025 08:09 Inactive
@anadi45
Copy link
Contributor Author

anadi45 commented Jun 12, 2025

@hntrl please check

@vercel vercel bot temporarily deployed to Preview – langchainjs-docs June 12, 2025 17:18 Inactive
Copy link
Member

@hntrl hntrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only thing I'll note is that base64 outputs aren't going to be consistent with the types of the base embeddings class which probably isn't what we want (it indicates that embeddings are represented in num arrays, but oAI allows base64 outputs). CI will probably pass just because their latest types are reflective of the encoding format and we haven't bumped the sdk version yet. So a couple of things would need some attention before we could merge this:

(1) Bump the openai sdk version to latest to properly surface param types
(2) Take another look at the base class to see if we can better manage output types (i'd be amenable to introducting a generic w/ a default on the base class that determines output type)

@hntrl
Copy link
Member

hntrl commented Jun 12, 2025

will rerun when gcp reverts 🙈

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Jun 24, 2025
@hntrl
Copy link
Member

hntrl commented Jun 24, 2025

@anadi45 Happy to merge this if we give some thought to what I described above. Out of curiosity what's the use case you have for base64 embeddings?

@anadi45
Copy link
Contributor Author

anadi45 commented Jun 24, 2025

Hi @hntrl,

I've updated the OpenAI SDK to the latest version, but it appears that the type definitions have not been updated accordingly within the package.

Regarding the alternative approach of updating the base embedding class to use generic types — I did explore that direction. However, I encountered a number of dependencies, particularly around the vector store classes, which made me think if they break existing features..

@vercel vercel bot temporarily deployed to Preview – langchainjs-docs June 24, 2025 23:43 Inactive
@anadi45
Copy link
Contributor Author

anadi45 commented Jun 24, 2025

@hntrl I saw this as an open issue, and thought of solving it. I personally don't have a use case for this.

@hntrl
Copy link
Member

hntrl commented Jun 25, 2025

@anadi45 gotcha, I missed the linked issue! Thanks for taking the time to contribute!

I'll call out that the sdk version is up to date on main (we had some other things we needed from the latest openAI version). I imagine if the types haven't updated in the oAI sdk but it's reflected in the docs then it's just a change that got missed/ is waiting to land (there's more of those than you think). Maybe it's possible to have an infill type in the class signature that accepts the proper encoding format, then we can just nix it whenever openAI gets around to adding it?

If we make it a generic with a default then I imagine that won't be a breaking change. Could you give more insight into why you think it would be breaking?

- export interface EmbeddingsInterface {
+ export interface EmbeddingsInterface<TOutput = number[]> { 
  ...
-  embedDocuments(documents: string[]): Promise<number[][]>;
+  embedDocuments(documents: string[]): Promise<TOutput[]>;
  ...
-  embedQuery(document: string): Promise<number[]>;
+  embedQuery(document: string): Promise<TOutput>;
}

@vercel vercel bot temporarily deployed to Preview – langchainjs-docs June 30, 2025 15:36 Inactive
@vercel vercel bot temporarily deployed to Preview – langchainjs-docs June 30, 2025 15:37 Inactive
@anadi45
Copy link
Contributor Author

anadi45 commented Jun 30, 2025

@hntrl have added generic type to base class

@vercel vercel bot temporarily deployed to Preview – langchainjs-docs July 3, 2025 03:26 Inactive
@hntrl
Copy link
Member

hntrl commented Sep 10, 2025

apologies, this got buried! I've rebased and continued this on in #8916

@hntrl hntrl closed this Sep 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants