-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Refine Type Handling in OpenAI Embedding Operator to Match OpenAI Typings #35547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refine Type Handling in OpenAI Embedding Operator to Match OpenAI Typings #35547
Conversation
…ation that it matches the type we need, and aligned docstring for the openai operator accordingly
|
@potiuk hi! can you please review my changes in your free time? :) |
Generally - unless you have good reasons to know someone SHOULD review something we do not have a custom to call people in comments. ESPECIALLY if you opened your PR few hours before. People here contribute and review when they have free time and see where they can help. By calling me (or anyone else) directly - you also limit your options of getting review from other people, because they might think I am already involved and engaged with a PR, where I am not. So please avoid doing that unless you have very good reason to believe called person is absolutely necessary and SHOULD review the PR for whatever reason. |
I apologize for any inconvenience my direct request may have caused. It was not my intention to limit the review opportunities by others or to impose unnecessarily. I appreciate the collaborative efforts here and will ensure to keep pr's open for the community to address collectively. Thank you for bringing this to my attention and sorry again! |
|
LGTM but I'd love @utkarsharma2 to take a look as this is so fresh contribution :) |
…o log the length of input_text instead of the whole text
|
@ylnsnv Overall changes look good to me, just a few minor things. Also, can you mention the docs where you found the specific types accepted by OpenAI API? just curious :) |
First of all, thank you for your feedback! I didn't find the specific type information in the OpenAI API documentation directly. Instead, I referred to the actual implementation of the OpenAI API in their codebase. There, I found a line of code that clearly indicated the expected type. This discovery led me to open the PR to align Airflow's type indications with the actual implementation in the OpenAI API. I'm glad to contribute and ensure our codebase is as accurate and up-to-date as possible. Thank you again for your thorough review and for pointing out areas that needed clarification. Your input is greatly valued! :) |
…f input text to the constructor of the operator, changed value of invalid value to ValueError instead of AirflowException
utkarsharma2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good to me.
cc0682c to
4f0982d
Compare
|
Hi @potiuk , I hope I'm not intruding or overstepping again, but I wanted to quickly check if there's anything else needed on my PR for it to be merge-ready. Thanks for your time! |
|
LGTM :) |
Summary
This PR refines the
OpenAIEmbeddingOperatorto accurately reflect and support the extended range of input types it can handle for generating embeddings (by OpenAI official API). The operator, which previously accepted a string or a list of any type, now has improved type annotations and validation to handle strings, lists of strings, lists of integers, and lists of lists of integers more explicitly.Changes
str | list[Any]to bestr | list[str] | list[int] | list[list[int]]input_textis non-empty and matches one of the expected types.Impact
The operator's enhanced input type support provides users with clearer guidance on what data can be passed for embedding generation. This update facilitates the use of the operator in a broader range of scenarios, such as processing numerical data or sequences of data, which is common in machine learning and NLP tasks.
Tests
With these changes, the
OpenAIEmbeddingOperatorbecomes more intuitive and versatile, allowing for seamless integration into various data processing pipelines within Airflow environments.