-
-
Notifications
You must be signed in to change notification settings - Fork 238
Replace false filter with an empty table with the schema of the filter. #1650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this direction. Good job coming up with this suggestion. It seems like a very simple change to avoid processing the filter expression and reading data from the table, and setting the schema avoids more complex (and error prone) updates to the plan nodes to remove or change projectors above the filter node. Seems like a great approach to me, but would be good to sanity check that @max-hoffman doesn't see some better approach we don't see yet.
It's probably worth adding some more test cases with some more exotic queries to make sure this approach holds up, but I know you were mostly looking for feedback on the overall direction right now.
| }, | ||
| { | ||
| // https://github.com/dolthub/dolt/issues/5522 | ||
| Query: "select * from mytable where exists (select * from othertable where 1 = 0)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- scopes: does the panic only happen for a subquery expression? Does it also happen for two or more nested subquery expressions? What about selecting from a subquery alias with a false filter? alternating subquery -> subquery expression and vice versa can be subject to different rule orderings
- different relational expression types: group by, window, CTE, recursive CTE, table functions, indexedTableAccess
- what if a HAVING expression is false
- what if we project
xfrom othertable, or1, or a synthetic set of columns likex+1 - what if there are nested empty function calls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR updated with more tests. Responses to your questions below.
@fulghum and @max-hoffman , thanks for looking at this! In the process of adding more tests, I even found an unrelated CTE bug: dolthub/dolt#5549 🥳
scopes: does the panic only happen for a subquery expression?
Looks like this happens only with subqueries. A false filter on a join or a table does not produce this issue:
-- false filter on top of a table: WORKS
select a from ab where false;
-- false filter on top of a join: WORKS
select a, u from (ab cross join uv) where false;Does it also happen for two or more nested subquery expressions?
Multiple subqueries and nested subqueries also trigger this:
-- multiple subqueries: PANIC
select count(*) from ab where
exists (select * from xy where false)
or exists (select * from uv where false);
-- nested subqueries: PANIC
select count(*) from ab where
exists (select * from xy where
exists (select * from uv where false));What about selecting from a subquery alias with a false filter?
This seems to work without issue:
-- false filter with a subquery: WORKS
select *
from (select * from uv) a1
where a1.u = 1 AND false;alternating subquery -> subquery expression and vice versa can be subject to different rule orderings
Do you mean testing various nested subqueries, and specifically testing the order of the subqueries?
different relational expression types: group by, window, CTE, recursive CTE, table functions, indexedTableAccess
Added cases as tests where these are the filter's relations: query, group by, window, CTE, recursive CTE, table function.
Can you give an example of indexedTableAccess?
what if a HAVING expression is false
Do you mean something like this?
select x from xy as t1 group by t1.x having exists (select * from ab where 1 = 0);
That query panics in current dolt and works with this PR.
what if we project x from othertable, or 1, or a synthetic set of columns like x+1
Can you give an example of this?
what if there are nested empty function calls
Can you give an example of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about selecting from a subquery alias with a false filter?
This is what I had in mind:
select *
from (select * from uv where false) a1
where a1.u = 1;alternating subquery -> subquery expression and vice versa can be subject to different rule orderings
This is more what I had in mind:
select * from
(select * from xy
where x = 1 and
x in (select * from (select 1 where false)))The false expression is accessed by SUBQUERY_ALIAS->SUBQUERY_EXPR->SUBQUERY_ALIAS. You might imagine similar patterns like SUBQUERY_ALIAS->SUBQUERY_ALIAS->SUBQUERY_EXPR, SUBQUERY_EXPR->SUBQUERY_ALIAS->SUBQUERY_ALIAS, ...etc.
Can you give an example of indexedTableAccess?
It might not be possible, but something like this might create a Proj(SubqueryAlias(Filter(false, IndexedTableAccess(xy, x=2)):
select * from (select * from xy where false) s where s.x = 2
select x from xy as t1 group by t1.x having exists (select * from ab where 1 = 0);
That's a good one, another one might just be
select * from xy where x in (select 1 having false)what if we project x from othertable, or 1, or a synthetic set of columns like x+1
Adding projections can introduce different schemas than the base table
select * from xy where exists (select a, true, a*7 from ab where false)what if there are nested empty function calls
Can't exactly think of what I meant right now. Maybe 1=(2+(1-1)) as the expression. Maybe below and variations of several scopes being empty where the layered schemas might be weird, and/or not exists and not in expressions.
select * from xy
where not exists (
select * from ab
where false and exists (
select 1 where false
)
)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added these test cases. Thanks Max!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm agree with jason, more tests to make sure that the new schema doesn't have surprising effects for a variety of cases would be nice
- added extra tests - identified and fixed a bug when we were removing filters incorrectly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
Fixes dolthub/dolt#5522