-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port ArrayHas family to functions-array
#9496
Conversation
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Contains #9477 |
} | ||
Ok(Transformed::no(expr)) | ||
} | ||
} | ||
|
||
#[cfg(feature = "array_expressions")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@guojidan I'm thinking of importing udfs with feature flag like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if user disable array_expressions
featrue, and implement array function by themselves, this analyzer will not be called 😕
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This rule is disabled, but other analyzer rules are not. As long as array_expressions
is disabled, related function should be disabled too. Even they implement their own array functions, those functions are not equal to our udf based array functions, so it makes sense to me that the rewrite rule is disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it seems like ideally that users who want to supply their own implementations for array_expressions
should also supply their own Analyzer passes.
Following that logic, it seems like the datafusion_array_functions
crate should supply its own analyzer pass that had array function specific rewrites.
Does that make sense @jayzhan211 and @guojidan?
If so, I will file a ticket to track the request / work.
Note that since I don't think users can actually add their own Analyzer passes today, we can't implement this idea yet. Thus I think we can still merge this PR.
I also think it would help to explain this rationale in comments. For example
#[cfg(feature = "array_expressions")] | |
// Note This rewrite is only done if the built in DataFusion `array_expressions` feature is enabled. | |
// Even if users implement their own array functions, those functions are not equal to the DataFusion | |
// udf based array functions, so this rewrite is not corrrect | |
#[cfg(feature = "array_expressions")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it seems like ideally that users who want to supply their own implementations for
array_expressions
should also supply their own Analyzer passes.Following that logic, it seems like the
datafusion_array_functions
crate should supply its own analyzer pass that had array function specific rewrites.Does that make sense @jayzhan211 and @guojidan?
If so, I will file a ticket to track the request / work.
Note that since I don't think users can actually add their own Analyzer passes today, we can't implement this idea yet. Thus I think we can still merge this PR.
I also think it would help to explain this rationale in comments. For example
I'm not sure if there are any concrete cases, but if they can have an analyzer that not only deals with array_expressions
but also other expressions that would be more flexible.
However, the first step is to build an user-defined analyzer rule and find a way to register them to existing rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed #9519
(it turns out that you can already define AnalyzerRules). I will make a PR to improve the documentation around this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR to improve documentation: #9520
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jayzhan211 -- I think we could merge this PR in and then file tickets for moving the array_functions specific analyzer rules into the datafusion_functions_array
crate as well (see comments)
If you agree I will file the tickets
Thanks again
} | ||
Ok(Transformed::no(expr)) | ||
} | ||
} | ||
|
||
#[cfg(feature = "array_expressions")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it seems like ideally that users who want to supply their own implementations for array_expressions
should also supply their own Analyzer passes.
Following that logic, it seems like the datafusion_array_functions
crate should supply its own analyzer pass that had array function specific rewrites.
Does that make sense @jayzhan211 and @guojidan?
If so, I will file a ticket to track the request / work.
Note that since I don't think users can actually add their own Analyzer passes today, we can't implement this idea yet. Thus I think we can still merge this PR.
I also think it would help to explain this rationale in comments. For example
#[cfg(feature = "array_expressions")] | |
// Note This rewrite is only done if the built in DataFusion `array_expressions` feature is enabled. | |
// Even if users implement their own array functions, those functions are not equal to the DataFusion | |
// udf based array functions, so this rewrite is not corrrect | |
#[cfg(feature = "array_expressions")] |
Which issue does this PR close?
Part of #9285
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?