Assign partition index to geometries outside of KdbTree
Currently, a geometry that lies outside the bounding box of a KdbTree is dropped: it is assigned an empty partition index array, and unnesting that array produces no rows. This can serve as an efficiency measure: if one side of the join is much smaller than the other, the bounds filter out many rows before they are sent to the join worker.

However, if the bounds cover less than the extent of both the build and probe sides of the join, then rows that would have matched in a non-partitioned join are dropped once the join is partitioned. The correctness of the partitioned join then depends on the partitioning chosen, which can lead to surprising output changes that could reasonably be viewed as data loss.

This commit changes the bounding box of the KdbTree to extend from -Infinity to +Infinity, so that every non-empty geometry is assigned at least one partition.
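
The effect described above can be illustrated with a small, self-contained sketch. This is not the KdbTree implementation from this commit: the `Rect` type, the single-split "tree" built by `split_leaves`, and the `unnest` helper are all hypothetical stand-ins, assumed only for illustration. The sketch shows that with finite bounds a geometry outside the box gets an empty partition list and vanishes on unnest, while with bounds from -Infinity to +Infinity every geometry lands in at least one leaf.

```python
import math
from typing import List, NamedTuple, Tuple


class Rect(NamedTuple):
    """Axis-aligned rectangle; a stand-in for a geometry envelope (hypothetical)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def intersects(self, other: "Rect") -> bool:
        return (self.x_min <= other.x_max and other.x_min <= self.x_max
                and self.y_min <= other.y_max and other.y_min <= self.y_max)


def split_leaves(bounds: Rect, split_x: float) -> List[Rect]:
    """Toy two-leaf 'tree': split the bounding box once at split_x (assumption)."""
    return [
        Rect(bounds.x_min, bounds.y_min, split_x, bounds.y_max),
        Rect(split_x, bounds.y_min, bounds.x_max, bounds.y_max),
    ]


def partitions_for(envelope: Rect, leaves: List[Rect]) -> List[int]:
    """Indexes of all leaves whose rectangle intersects the envelope."""
    return [i for i, leaf in enumerate(leaves) if leaf.intersects(envelope)]


def unnest(rows: List[Tuple[str, Rect]], leaves: List[Rect]) -> List[Tuple[str, int]]:
    """One (row_id, partition) pair per index; an empty index list emits nothing."""
    return [(row_id, p) for row_id, env in rows for p in partitions_for(env, leaves)]


INF = math.inf
rows = [("inside", Rect(1, 1, 2, 2)), ("outside", Rect(20, 20, 21, 21))]

# Finite bounds: the "outside" row intersects no leaf and is silently dropped.
finite_leaves = split_leaves(Rect(0, 0, 10, 10), split_x=5)
finite_result = unnest(rows, finite_leaves)

# Infinite bounds: the same split, but the outer edges extend to +/-Infinity.
infinite_leaves = split_leaves(Rect(-INF, -INF, INF, INF), split_x=5)
infinite_result = unnest(rows, infinite_leaves)

print(finite_result)
print(infinite_result)
```

In this sketch the split position is unchanged between the two runs; only the outer edges of the leaves differ, so rows that were already inside the finite bounds keep the same partition assignment.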
Showing 9 changed files with 133 additions and 127 deletions.