-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Description
Is your feature request related to a problem or challenge?
When implementing optimizer rules, it's common to check if a expression is certain function. Now it's commonly done through function name match check. A potential risk is: datafusion dependents might override the function implementation in the function registry, in such cases those optimizer rules should not be enabled, it's better to figure out a way to accurately check the function type.
I believe now they're done through name match because all function implementations live in a different crate than optimizer, and we don't want to add a crate dependency between them, so we can't do the trait object downcasting for type check.
One way I can think of is add a source() API in UDFs, so we can differentiate if this is builtin or added by the 3rd party, and in the optimizer we can check it safer like fun.source() == BuiltIn && fun.name() == "min"
Current optimizer examples for checking function type by name matching:
datafusion/datafusion/optimizer/src/single_distinct_to_groupby.rs
Lines 90 to 92 in 0d52a1e
| } else if func.name() != "sum" | |
| && func.name().to_lowercase() != "min" | |
| && func.name().to_lowercase() != "max" |
datafusion/datafusion/optimizer/src/analyzer/resolve_grouping_function.rs
Lines 151 to 153 in 0d52a1e
| // TODO: Do something better than name here should grouping be a built | |
| // in expression? | |
| matches!(expr, Expr::AggregateFunction(AggregateFunction { ref func, .. }) if func.name() == "grouping") |
| if func.name() == "count" { |
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response