Skip to content

ORDER BY expressions containing aggregation functions are not handled appropriately by the analyzer. #4485

@reltuk

Description

@reltuk

The following works because the SUM gets mapped directly to the projected SUM expression (by resolve_columns):

SELECT category, SUM(price) FROM products GROUP BY category ORDER BY SUM(price) ASC

The following does not work, because the SUM(price) (sub)expression currently sticks around in the SortFields of the plan.Sort node, but it cannot evaluate correctly in this context:

SELECT category, SUM(price) FROM products GROUP BY category ORDER BY SUM(price) + 1 ASC

Other things that should generally work but do not:

SELECT category, SUM(price) FROM products GROUP BY category ORDER BY AVG(price) ASC

SELECT category, SUM(price) FROM products GROUP BY category ORDER BY COUNT(*) ASC

SELECT category, SUM(price) FROM products GROUP BY category ORDER BY SUM(price) % 2, SUM(price), AVG(price) ASC

In general, a correct way to handle an aggregation in a sort expression is to push the expression down to the group by node, replace the expression with an appropriately indexed GetField in the sort node itself, and project the expression away in a projection above the Sort node. The logic to do that does not currently exist in the analyzer.

This issue also might apply to window functions, but I have not investigated there yet.

For now, we are going to add a validation step that looks for aggregations outside of GroupBy expressions. If they exist, the query is unsupported and we will return an error.

Metadata

Metadata

Assignees

No one assigned

    Labels

    analyzerbugSomething isn't workingsqlIssue with SQL

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions