Add support for DISTINCT projections in decorrelate_where_exists #3724

Closed
Tracked by #822
andygrove opened this issue Oct 5, 2022 · 0 comments · Fixed by #3732
Labels: enhancement (New feature or request), optimizer (Optimizer rules)

andygrove commented Oct 5, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Setup

$ export DATAFUSION_OPTIMIZER_SKIP_FAILED_RULES=false
$ echo "1,2" > test.csv
$ cd datafusion-cli
$ cargo run
❯ create external table test (a int, b int) stored as csv location 'test.csv';

Test

This query works:

❯ select * from test where exists (select a from test t2 where test.a = t2.a);
+---+---+
| a | b |
+---+---+
| 1 | 2 |
+---+---+

If I add the DISTINCT keyword to the subquery, the optimizer fails:

❯ select * from test where exists (select distinct a from test t2 where test.a = t2.a);
Internal("Optimizer rule 'decorrelate_where_exists' failed due to unexpected error: cannot optimize non-correlated subquery at /home/andy/git/apache/arrow-datafusion/datafusion/optimizer/src/decorrelate_where_exists.rs:141\ncaused by\nError during planning: Could not coerce into Filter! at /home/andy/git/apache/arrow-datafusion/datafusion/expr/src/logical_plan/plan.rs:1157")

Describe the solution you'd like
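Support DISTINCT projections in the correlated subqueries handled by decorrelate_where_exists, so that the failing query above no longer produces an optimizer error.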
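
For context on why this should be safe to support: EXISTS only checks whether the subquery returns at least one row, so DISTINCT in its projection should not change the outcome. The sketch below illustrates that with the same table and queries, but it runs them through Python's built-in sqlite3 module rather than DataFusion. That is an assumption made only to keep the example self-contained; it says nothing about how the decorrelate_where_exists rule is or should be implemented.

```python
import sqlite3


def run_exists_queries():
    """Run the plain and DISTINCT variants of the EXISTS query from this issue."""
    # In-memory SQLite database standing in for the CSV-backed table (assumption:
    # SQLite is used here only as a neutral SQL engine, not DataFusion).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (a INT, b INT)")
    conn.execute("INSERT INTO test VALUES (1, 2)")  # same row as test.csv

    plain = conn.execute(
        "SELECT * FROM test WHERE EXISTS "
        "(SELECT a FROM test t2 WHERE test.a = t2.a)"
    ).fetchall()

    distinct = conn.execute(
        "SELECT * FROM test WHERE EXISTS "
        "(SELECT DISTINCT a FROM test t2 WHERE test.a = t2.a)"
    ).fetchall()

    conn.close()
    return plain, distinct


plain_rows, distinct_rows = run_exists_queries()
# Both variants are expected to select the same outer rows.
print(plain_rows == distinct_rows)
```

If the two result sets match, as they should for any engine that follows standard EXISTS semantics, then the DISTINCT variant can in principle be decorrelated the same way as the plain one.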

Describe alternatives you've considered
None

Additional context
None

andygrove added the enhancement (New feature or request) label Oct 5, 2022
andygrove changed the title from "Add support for DISTINCT projections in scalar_subquery_to_join" to "Add support for DISTINCT projections in subqueries" Oct 5, 2022
andygrove changed the title from "Add support for DISTINCT projections in subqueries" to "Add support for DISTINCT projections in decorrelate_where_exists" Oct 5, 2022
andygrove added the optimizer (Optimizer rules) label Oct 5, 2022