DataFusion tries to avoid doing work whenever possible to improve query performance.
Part of this is determining when a filter can never be true, so that the work behind it can be skipped entirely.
For example:
```
DataFusion CLI v34.0.0
❯ create table t(x int) as values (1), (2), (3);
0 rows in set. Query took 0.003 seconds.
```
When DataFusion sees a filter that can't be true, it skips even scanning the data:
```
❯ explain select x from t where false;
+---------------+---------------+
| plan_type     | plan          |
+---------------+---------------+
| logical_plan  | EmptyRelation |
| physical_plan | EmptyExec     |
|               |               |
+---------------+---------------+
2 rows in set. Query took 0.001 seconds.
```
However, it currently does not skip scanning if the filter is NULL (which also can't be true). Note the `MemoryExec` is still present:
```
❯ explain select x from t where null::bool;
+---------------+---------------------------------------------------+
| plan_type     | plan                                              |
+---------------+---------------------------------------------------+
| logical_plan  | Filter: Boolean(NULL)                             |
|               |   TableScan: t projection=[x]                     |
| physical_plan | CoalesceBatchesExec: target_batch_size=8192       |
|               |   FilterExec: NULL                                |
|               |     MemoryExec: partitions=1, partition_sizes=[1] |
|               |                                                   |
+---------------+---------------------------------------------------+
```
I would like DataFusion to avoid scanning when the filter evaluates to NULL, just as it does for `false`. The second example above should not have a `MemoryExec` in it.
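To make the requested behavior concrete, here is a minimal sketch of the rewrite on a toy plan model. The `Plan` and `Predicate` types and the `eliminate_constant_filter` function are hypothetical names invented for illustration; they are not DataFusion's actual API, and the real optimizer rule may be structured quite differently. The only point is that a literal NULL predicate is handled the same way as a literal `false`.

```rust
// Toy model of a logical plan. These types are hypothetical and are
// NOT DataFusion's API; they exist only to illustrate the rewrite.
#[derive(Debug, Clone, PartialEq)]
enum Predicate {
    /// A constant boolean; `None` stands for SQL NULL.
    Literal(Option<bool>),
    /// Any predicate that is not known to be constant.
    Other(String),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    EmptyRelation,
    TableScan(String),
    Filter { predicate: Predicate, input: Box<Plan> },
}

/// Rewrites filters whose predicate is a constant.
/// A WHERE clause keeps a row only when the predicate is true, so a
/// constant `false` and a constant NULL both produce no rows.
fn eliminate_constant_filter(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { predicate, input } => {
            // Simplify the child first so nested filters are handled too.
            let input = eliminate_constant_filter(*input);
            match predicate {
                // Always true: the filter is a no-op, keep the input.
                Predicate::Literal(Some(true)) => input,
                // Never true: drop the input, so nothing is scanned.
                Predicate::Literal(Some(false)) | Predicate::Literal(None) => {
                    Plan::EmptyRelation
                }
                // Not constant: keep the filter as it is.
                other => Plan::Filter {
                    predicate: other,
                    input: Box::new(input),
                },
            }
        }
        other => other,
    }
}
```

In this sketch, a `Filter` with `Predicate::Literal(None)` over `TableScan("t")` collapses to `EmptyRelation`, mirroring what `where false` already produces in the first `explain` output above.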
I think this is a good first issue: it should be relatively simple to implement and would be a good introduction to DataFusion.