Description
Is your feature request related to a problem or challenge?
Motivation
There is ongoing work migrating BuitlinScalarFunction
s -> ScalarUDF
#8045, and we noticed one Expr
related issue during prototyping:
We can use Expr
API to create builtin scalar functions directly:
let expr1 = abs(lit(-1));
let expr2 = call_fn("abs", vec![lit(-1)]);
We want to still use this API after functions are migrated to ScalarUDF
based implementation, it's not possible now because those Expr
creations are stateless, and ScalarUDF
s are registered inside SessionState
. It's only possible if we do
let ctx = create_context()?;
ctx.register_udf(abs_impl);
let expr2 = ctx.call_fn("abs", vec![lit(-1)]);
Describe the solution you'd like
To continue supporting the original stateless Expr
creation API, we can create Expr
with only the string function name, and resolve the function during logical optimization.
Then call_fn("abs", vec![lit(-1)])
can be supported (now it only support BuitlinScalarFunction
and don't support UDFs)
Another macro-based API abs(lit(-1))
can be supported if we hard code all possible function names within the core (should we do that?)
Potential implementation is:
- After
let expr2 = call_fn("abs", vec![lit(-1)]);
, create aScalarUDF
expression with dummy implementation. - Add an
AnalyzerRule
(a mandatory logical optimizer rule) to resolve this UDF name using external functions registered inSessionState
Issue:
Now function implementation is inside SessionState
but outside SessionConfig
, and the analyzer can only access SessionConfig
.
We have to move the function registry into SessionConfig
first (or is there any better way?)
pub struct SessionState {
// ...
/// Scalar functions that are registered with the context
scalar_functions: HashMap<String, Arc<ScalarUDF>>,
/// Session configuration
config: SessionConfig,
// ...
// `self.options()` is `ConfigOption`
let analyzed_plan =
self.analyzer
.execute_and_check(plan, self.options(), |_, _| {})?;
cc @alamb I wonder if you have any thoughts on this approach 🤔
Describe alternatives you've considered
No response
Additional context
No response