Closed
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Setup
$ export DATAFUSION_OPTIMIZER_SKIP_FAILED_RULES=false
$ echo "1,2" > test.csv
$ cd datafusion-cli
$ cargo run
❯ create external table test (a int, b int) stored as csv location 'test.csv';
Test
This query works:
❯ select * from test where a = (select max(a) from test t2);
+---+---+
| a | b |
+---+---+
| 1 | 2 |
+---+---+
These non-aggregate subqueries that return a single row do not work:
❯ select * from test where a = (select distinct a from test);
Internal("Optimizer rule 'scalar_subquery_to_join' failed due to unexpected error: scalar subqueries must have a projection at /home/andy/git/apache/arrow-datafusion/datafusion/optimizer/src/scalar_subquery_to_join.rs:185\ncaused by\nError during planning: Could not coerce into Projection! at /home/andy/git/apache/arrow-datafusion/datafusion/expr/src/logical_plan/plan.rs:1121")
❯ select * from test where a = (select a from test limit 1);
Internal("Optimizer rule 'scalar_subquery_to_join' failed due to unexpected error: scalar subqueries must have a projection at /home/andy/git/apache/arrow-datafusion/datafusion/optimizer/src/scalar_subquery_to_join.rs:185\ncaused by\nError during planning: Could not coerce into Projection! at /home/andy/git/apache/arrow-datafusion/datafusion/expr/src/logical_plan/plan.rs:1121")
In both cases, the subqueries do have a projection, but they are wrapped with a Distinct
or Limit
node.
This query fails with a different error. In this case there is a Filter
wrapping the Projection
.
❯ select * from test where a = (select a from test where a is not null);
Internal("Optimizer rule 'scalar_subquery_to_join' failed due to unexpected error: scalar subqueries must aggregate a value at /home/andy/git/apache/arrow-datafusion/datafusion/optimizer/src/scalar_subquery_to_join.rs:193\ncaused by\nError during planning: Could not coerce into Aggregate! at /home/andy/git/apache/arrow-datafusion/datafusion/expr/src/logical_plan/plan.rs:1368")
Describe the solution you'd like
Support these subqueries
Describe alternatives you've considered
None
Additional context
None