Skip to content

Avoid check function type by matching names in the optimizer #18643

@2010YOUY01

Description

@2010YOUY01

Is your feature request related to a problem or challenge?

When implementing optimizer rules, it's common to check if a expression is certain function. Now it's commonly done through function name match check. A potential risk is: datafusion dependents might override the function implementation in the function registry, in such cases those optimizer rules should not be enabled, it's better to figure out a way to accurately check the function type.

I believe now they're done through name match because all function implementations live in a different crate than optimizer, and we don't want to add a crate dependency between them, so we can't do the trait object downcasting for type check.

One way I can think of is add a source() API in UDFs, so we can differentiate if this is builtin or added by the 3rd party, and in the optimizer we can check it safer like fun.source() == BuiltIn && fun.name() == "min"

Current optimizer examples for checking function type by name matching:

} else if func.name() != "sum"
&& func.name().to_lowercase() != "min"
&& func.name().to_lowercase() != "max"

// TODO: Do something better than name here should grouping be a built
// in expression?
matches!(expr, Expr::AggregateFunction(AggregateFunction { ref func, .. }) if func.name() == "grouping")

if func.name() == "count" {

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions