feat: refactor udf/udaf/udwf ReturnType#8183
Conversation
alamb
left a comment
There was a problem hiding this comment.
Thank you @JasonLi-cn - this is a great start. THe complex_udf is especially nice as an example
CC @2010YOUY01 as I think you have been thinking about this area too
| } | ||
|
|
||
| /// Factory that returns the functions's return type given the input argument types | ||
| pub type ReturnTypeFunction = |
There was a problem hiding this comment.
I wonder if you considered changing the ReturnTypeFunction to something more like
Arc<dyn Fn(&[Expr], &dyn ExprSchemable) -> Result<Arc<DataType>> + Send + Sync>;
Now that we have made the fields of ScalarUDF non pub (in #8079) I think we have much greater leeway to improve the API
There was a problem hiding this comment.
Perhaps we could introduce the FunctionImplementation trait as proposed here: https://github.com/apache/arrow-datafusion/pull/8046/files#diff-8a327db2db945bcf6ca2b4229885532feae127e94a450600d3fac6ecdc0eeb3fR141
with the more general return types. Something like this perhaps:
/// Convenience trait for implementing ScalarUDF. See [`ScalarUDF::new_from_impl()`]
pub trait FunctionImplementation {
/// Returns this function's name
fn name(&self) -> &str;
/// Returns this function's signature
fn signature(&self) -> &Signature;
/// return the return type of this function given the types of the arguments
fn return_type(&self, arg_types: &[DataType]) -> Result<DataType>;
/// return the return type of this function given the actual arguments. This is
/// used to implement functions where the return type is a function of the actual
/// arguments
fn return_type_from_args(&self, args: &[Expr], schemabe: &dyn ExprSchemable) -> Result<DataType> {
// default impl would call `Self::return_type`
todo!()
}
/// Invoke the function on `args`, returning the appropriate result
fn invoke(&self, args: &[ColumnarValue]) -> Result<ColumnarValue>;
}|
Another idea @JasonLi-cn might be to pursue this idea #8051 to "specialize" the function based on its arguments (and in this case the specialization could potentially also include the constants 🤔 ) |
|
marking as draft as I think @JasonLi-cn is working on feedback now |
Ok, I will try to complete this task |
|
FOr anyone following along, this was implemented in a different way, see #8624 |
Which issue does this PR close?
Closes #8182
Rationale for this change
In some cases, I need to determine the output type of the function based on the input arguments of the function, but I cannot do this at present. This is because the ReturnTypeFunction provides only the data types of the input parameters.
What changes are included in this PR?
ReturnTypeFactorytraitReturnTypeFunctionimplReturnTypeFactoryreturn_typefromReturnTypeFunctionintoReturnTypeFactoryAre these changes tested?
Are there any user-facing changes?