Skip to content

Convert Option<Vec<sort expression>> to Vec<sort expression> #16615

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

ViggoC
Copy link

@ViggoC ViggoC commented Jun 29, 2025

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate substrait Changes to the substrait crate proto Related to proto crate functions Changes to functions implementation labels Jun 29, 2025
@ViggoC
Copy link
Author

ViggoC commented Jun 29, 2025

@findepi Please take a look.

from_substrait_sorts(consumer, &f.sorts, input.schema())
.await?,
)
from_substrait_sorts(consumer, &f.sorts, input.schema()).await?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make from_substrait_sorts work with an empty slice?

Some(exprs) => create_physical_sort_exprs(
exprs,
let order_bys = if !order_by.is_empty() {
create_physical_sort_exprs(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make create_physical_sort_exprs work with empty slices?

@alamb alamb requested a review from findepi June 30, 2025 22:40
/// List of expressions to feed to the functions as arguments
pub args: Vec<Expr>,
/// Optional filter
pub filter: Option<Box<Expr>>,
/// Optional ORDER BY applied prior to aggregating
pub order_by: Option<Vec<Expr>>,
pub order_by: Vec<Sort>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why the Optional was removed (that's the purpose of this PR). I don't yet understand why Expr was replaced with Sort thought.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a mistake during the batch replacement process, but since this struct has never been used in the project, it did not cause any exceptions. I'll change it back.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we can open another issue to discuss whether it is more reasonable to change it to the Sort type, because other order_by fields are defined as the Sort type. Or we can directly delete this unused struct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it's unused, let's delete it, prefferrably in a separate PR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Should it be before or after this PR?

pub fn first_value(expression: Expr, order_by: Option<Vec<SortExpr>>) -> Expr {
if let Some(order_by) = order_by {
pub fn first_value(expression: Expr, order_by: Vec<SortExpr>) -> Expr {
if !order_by.is_empty() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition seems redundant now.

pub fn last_value(expression: Expr, order_by: Option<Vec<SortExpr>>) -> Expr {
if let Some(order_by) = order_by {
pub fn last_value(expression: Expr, order_by: Vec<SortExpr>) -> Expr {
if !order_by.is_empty() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition seems redundant now.

&& a.nulls_first == b.nulls_first
&& a.expr.normalize_eq(&b.expr)
})
&& self_order_by.len() == other_order_by.len()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The length check was missing, thanks for adding it.
Coincidentally, recently i was reflecting about precisely this -- how Iter::zip is error-prone to use.

@@ -821,7 +821,7 @@ impl ExprFuncBuilder {

let fun_expr = match fun {
ExprFuncKind::Aggregate(mut udaf) => {
udaf.params.order_by = order_by;
udaf.params.order_by = order_by.unwrap_or_default();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as a follow-up, the builder order_by field can be changed to non-optional as well

@findepi
Copy link
Member

findepi commented Jul 2, 2025

@alamb this is a breaking change, right?
is there a label / process to be used for it?

@alamb alamb added the api change Changes the API exposed to users of the crate label Jul 2, 2025
@alamb
Copy link
Contributor

alamb commented Jul 2, 2025

@alamb this is a breaking change, right? is there a label / process to be used for it?

Probably -- we normally add the api change label, which I just did

This is the stated policy: https://datafusion.apache.org/contributor-guide/api-health.html

@alamb
Copy link
Contributor

alamb commented Jul 2, 2025

We have a long history of releasing breaking API changes so i think it is best just to use your judgement here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate core Core DataFusion crate functions Changes to functions implementation logical-expr Logical plan and expressions optimizer Optimizer rules proto Related to proto crate sql SQL Planner substrait Changes to the substrait crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert Option<Vec<sort expression>> to Vec<sort expression>
4 participants