SortMerge join support for IS NOT DISTINCT FROM. #16003
The patch adds a "requiredNonNullKeyParts" field to the sortMerge processor, which has the list of key parts that must be nonnull for an equijoin condition to match. Conditions with SQL "=" are present in the list; conditions with SQL "IS NOT DISTINCT FROM" are absent from the list.
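As a rough illustration of the rule described above, the sketch below derives the list of required non-null key parts from a list of join conditions. Only the name requiredNonNullKeyParts comes from the patch description; the class, the enum, and the method signature are hypothetical and are not the patch's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: apart from "requiredNonNullKeyParts", all names are illustrative.
final class RequiredNonNullKeyPartsSketch
{
  /** How a single equijoin condition treats nulls. */
  enum ConditionType
  {
    EQUALS,               // SQL "=": a null key part never matches
    IS_NOT_DISTINCT_FROM  // SQL "IS NOT DISTINCT FROM": null matches null
  }

  /**
   * Returns the positions of key parts that must be non-null for a match.
   * Position i corresponds to the i-th join condition.
   */
  static int[] requiredNonNullKeyParts(List<ConditionType> conditions)
  {
    final List<Integer> required = new ArrayList<>();
    for (int i = 0; i < conditions.size(); i++) {
      // "=" conditions are present in the list; IS NOT DISTINCT FROM conditions are absent.
      if (conditions.get(i) == ConditionType.EQUALS) {
        required.add(i);
      }
    }
    return required.stream().mapToInt(Integer::intValue).toArray();
  }
}
```

For a join on three conditions of the form "=", "IS NOT DISTINCT FROM", "=", this sketch yields key parts 0 and 2.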
lgtm! minor comments
if (keyParts.length == 0) {
  return true;
}
nit: redundant check
It does allow us to skip calling getRowPositionInDataRegion, which I thought might be useful.
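To make the trade-off concrete, here is a minimal sketch of the early return under discussion. It assumes rows are plain Object[] arrays and uses a hypothetical lookupRow helper (with a call counter) as a stand-in for the row lookup; the real method reads from a frame and its body is not shown in this thread.

```java
// Hypothetical sketch: lookupRow stands in for the row lookup that the early return skips.
final class EmptyKeyPartsShortCircuit
{
  /** Counts row lookups so the effect of the early return is observable. */
  static int lookups = 0;

  static Object[] lookupRow(Object[][] rows, int row)
  {
    lookups++;
    return rows[row];
  }

  static boolean hasNonNullKeyParts(Object[][] rows, int row, int[] keyParts)
  {
    if (keyParts.length == 0) {
      // Logically redundant (the loop below would also return true),
      // but it avoids the row lookup entirely.
      return true;
    }

    final Object[] key = lookupRow(rows, row);
    for (int part : keyParts) {
      if (key[part] == null) {
        return false;
      }
    }
    return true;
  }
}
```

Without the first check the result is the same; the only difference in this sketch is one extra lookupRow call per row when no key parts are required.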
@@ -138,21 +139,29 @@ public RowKey readKey(int row)
   }

   @Override
-  public boolean isCompletelyNonNullKey(int row)
+  public boolean hasNonNullKeyParts(int row, int[] keyParts)
Side note: we should mention somewhere in the Javadoc that the compare implementation assumes null == null, so this method can be called to filter out those rows.
Added.
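A small sketch of the interaction described in this comment, assuming keys are String[] arrays: the comparison treats null as equal to null, and a separate non-null check then rejects matches on key parts that came from "=" conditions. The class and the joinMatches helper are hypothetical; only hasNonNullKeyParts and requiredNonNullKeyParts are names from the patch.

```java
import java.util.Comparator;

// Hypothetical sketch of "compare assumes null == null, then filter".
final class NullSafeCompareSketch
{
  // nullsFirst makes compare(null, null) return 0, i.e. null == null.
  private static final Comparator<String> PART_COMPARATOR =
      Comparator.nullsFirst(Comparator.<String>naturalOrder());

  /** Compares two keys of equal length part by part; nulls compare equal to each other. */
  static int compare(String[] left, String[] right)
  {
    for (int i = 0; i < left.length; i++) {
      final int c = PART_COMPARATOR.compare(left[i], right[i]);
      if (c != 0) {
        return c;
      }
    }
    return 0;
  }

  static boolean hasNonNullKeyParts(String[] key, int[] keyParts)
  {
    for (int part : keyParts) {
      if (key[part] == null) {
        return false;
      }
    }
    return true;
  }

  /**
   * Keys match when they compare equal AND every part listed in
   * requiredNonNullKeyParts (the "=" conditions) is non-null.
   * Checking one side suffices because the keys already compared equal.
   */
  static boolean joinMatches(String[] left, String[] right, int[] requiredNonNullKeyParts)
  {
    return compare(left, right) == 0 && hasNonNullKeyParts(left, requiredNonNullKeyParts);
  }
}
```

With this shape, a null in a part joined by IS NOT DISTINCT FROM still matches, while a null in a part joined by "=" is filtered out even though compare returns 0.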
        new Object[]{"def", "def"},
        new Object[]{"abc", "abc"}
    ),
    0
  )
);
}
I was hoping to find a unit test case that runs an IS NOT DISTINCT FROM SQL query with sort-merge enabled.
Good news, testJoinWithExplicitIsNotDistinctFromCondition is that test case 😄
This test was passing on master because the join was quietly getting switched to broadcast. Now it actually runs as sort-merge, which is why I had to add sortIfSortBased (the results are now in a different order).