Implement `geom_equals` and binary predicates that depend only on it. #926
Conversation
… instead of __init__.
Co-authored-by: Michael Wang <isVoid@users.noreply.github.com>
Co-authored-by: Mark Harris <mharris@nvidia.com>
…spatial into feature/all_equals_operations
I haven't looked into the implementation yet, but I have some questions regarding test coverage. Geometry equivalence can be vertex-order invariant. I just want to make sure these cases are tested too.
python/cuspatial/cuspatial/tests/binpreds/test_equals_only_binops.py
I'm worried that unlike other features in cuSpatial, this one has very complex logic that is implemented only in Python, and not in our base C++. So supporting it in other languages, like Java, is not possible without rewriting a LOT of code.
It's also hard to reason about what and where computation is happening in CUDA.
# Group them by the original index, and sum the results. If the
# sum of points in the rhs feature is equal to the number of
# points found in the polygon, then the polygon contains the
# feature.
df_result = (
    result.groupby("idx").sum().sort_index()
    == result.groupby("idx").count().sort_index()
Why do you need to count the number of points in a feature? Isn't it just the size of the feature, which is stored? (Constant time vs. linear time)
In this case code complexity is reduced by using the `point_indices` object that is an argument to `postprocess`. Instead of having to handle all types and combinations in each postprocess, the relevant indices for a given row of the `GeoSeries`, which vary across geometry types, are computed in the `preprocess` step and then carried through the `BinaryPredicate` operations.
> Isn't it just the size of the feature, which is stored?

The size of each feature in the column isn't stored, but it can probably be computed easily with `offset[1:] - offset[:-1]`.
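As a rough illustration of the offsets-based size computation suggested here, the following is a minimal pure-Python sketch. The `offsets` values are made up, and `feature_sizes` is a hypothetical helper, not a cuSpatial function; it assumes a single GeoArrow-style offsets buffer whose last entry is the total point count.

```python
def feature_sizes(offsets):
    """Return the number of points in each feature from an offsets buffer.

    offsets[i] is the index of the first point of feature i, and the
    final entry is the total number of points.
    """
    return [end - start for start, end in zip(offsets, offsets[1:])]


# Hypothetical offsets for three features holding 3, 2 and 4 points.
offsets = [0, 3, 5, 9]
sizes = feature_sizes(offsets)
```

Nested geometry types carry more than one offsets buffer, so a real implementation would likely need to apply this per nesting level.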
However, you meant to compute "are all points of the feature contained in the polygon". Isn't it just a `logical_or` applied to the PiP results of the same feature, aka a groupby `max`?
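To make the reduction under discussion concrete, here is a minimal pure-Python sketch (not the cuDF code from the PR) of deciding whether every point of a feature passed point-in-polygon. The `idx` and `pip` inputs and the `group_all` helper are made up for illustration. Note that an "all points" test corresponds to a logical AND within each group (a group-wise `min` over booleans), for which `sum == count` is an equivalent formulation, whereas a logical OR (group-wise `max`) answers "is any point contained".

```python
from collections import defaultdict


def group_all(idx, flags):
    """Per-group logical AND, written as sum == count, as all(), and as min()."""
    groups = defaultdict(list)
    for i, flag in zip(idx, flags):
        groups[i].append(flag)
    by_sum = {i: sum(g) == len(g) for i, g in groups.items()}
    by_all = {i: all(g) for i, g in groups.items()}
    by_min = {i: bool(min(g)) for i, g in groups.items()}
    return by_sum, by_all, by_min


# Feature 0 has every point inside; feature 1 has one point outside.
idx = [0, 0, 0, 1, 1]
pip = [True, True, True, True, False]
by_sum, by_all, by_min = group_all(idx, pip)

# Logical OR per group, for contrast: true if any point is inside.
by_any = {i: any(f for j, f in zip(idx, pip) if j == i) for i in set(idx)}
```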
Since the result of `BOOL8` is actually `uint8`, if you do `sum` and there are more than 256 points in the multipoint, will it overflow?
In the preprocess step,
cuspatial/python/cuspatial/cuspatial/core/binpreds/binpreds.py
Lines 172 to 197 in b7d4d17
# RHS conditioning:
point_indices = None
# point in polygon
if contains_only_linestrings(rhs):
    # condition for linestrings
    geom = rhs.lines
elif contains_only_polygons(rhs) is True:
    # polygon in polygon
    geom = rhs.polygons
elif contains_only_multipoints(rhs) is True:
    # mpoint in polygon
    geom = rhs.multipoints
else:
    # no conditioning is required
    geom = rhs.points
xy_points = geom.xy

# Arrange into shape for calling point-in-polygon, intersection, or
# equals
point_indices = geom.point_indices()
from cuspatial.core.geoseries import GeoSeries

final_rhs = GeoSeries(
    GeoColumn._from_points_xy(xy_points._column)
).points
return (lhs, final_rhs, point_indices)
`GeoSeries` of points. At that time `point_indices` is also recorded. In order to not use `point_indices` here, that logic would have to be reproduced in the `postprocess` as well.
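The idea described here, flattening each row's geometry into individual points while recording which row every point came from, can be sketched in pure Python as follows. This is only an illustration under assumptions: `flatten_with_indices` is a hypothetical helper, and the exact values returned by cuSpatial's `point_indices()` are inferred from this thread rather than from its documentation.

```python
def flatten_with_indices(features):
    """Flatten a list of features (each a list of (x, y) points) into one
    point list plus, for each point, the row index of its source feature."""
    points = []
    point_indices = []
    for row, feature in enumerate(features):
        for point in feature:
            points.append(point)
            point_indices.append(row)
    return points, point_indices


# Two made-up multipoint rows: the first has 2 points, the second has 3.
features = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(2.0, 2.0), (3.0, 3.0), (4.0, 4.0)],
]
points, point_indices = flatten_with_indices(features)
```

With the indices carried alongside the points, a per-point result can later be grouped by `point_indices` to recover one answer per original row, regardless of geometry type.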
    {"idx": indices[: len(result)], "equals": result}
)
gb_idx = result_df.groupby("idx")
result = (gb_idx.sum().sort_index() == gb_idx.count().sort_index())[
What is this doing?
len(a)
Nitpicking: I'd like more corner cases for `geom_equal` and more tests for `overlaps` and `crossings`. But it's OK to punt to the future if needed to meet timing requirements.
def _sort_linestrings(self, lhs, rhs, initial):
    """Swap first and last values of each linestring to ensure that
    the first point is the lowest value. This is necessary to ensure
How do you define the `lowest value` of a point?
I can't seem to reply to all of the comments so I'll address them here.
- It is possible with a pile of conditional logic to compute the group sizes via their offsets buffers, but using `object.point_indices()` and `groupby` allows me to avoid implementing those code paths. It is a "time of development vs. time of execution" problem, and the overall cost to a Python user is not significant, I think.
- Not sure what you mean.
- Summing a bool column with 255 values in `cudf` returns an `int64`.
- It is lowest in the context of the endpoints: the first point needs to be lexicographically before the last point in order for comparison of all the points via sorting to work. I updated the docs.
Can you move replies to: #926 (comment) ?
And to your reply on this comment:

> It is lowest in the context of the endpoints, the first point needs to be lexicographically before the last point in order for comparison of all the points via sorting to work. I updated the docs.

Can you elaborate? How does the endpoints' lexicographic order affect the equality of two linestrings?
It doesn't! Thanks for calling out another mistake. I had built some test cases around linestrings of length 2 and ended up believing the data had to do with the start and end of the linestrings. What really happens is that a linestring is equal to its reversed linestring, which works with sort, but sorting doesn't work with linestrings constructed of the same points in different orders.
The updated code now reverses each linestring within its "bucket" along the GeoArrow x/y buffer, then does a forward and a reverse comparison and `or`s them.
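The forward-or-reverse comparison described above can be illustrated with a small per-pair Python sketch. This is not the columnar implementation from the PR, which operates on GeoArrow buffers; `linestrings_equal` and `sorted_points_equal` are hypothetical helpers, and the sketch only compares vertex sequences exactly.

```python
def linestrings_equal(a, b):
    """True if b has the same vertices as a in forward or reversed order."""
    a, b = list(a), list(b)
    return a == b or a == b[::-1]


def sorted_points_equal(a, b):
    """Naive check that only compares the sorted vertex lists."""
    return sorted(a) == sorted(b)


line = [(0, 0), (1, 1), (2, 0)]
reversed_line = [(2, 0), (1, 1), (0, 0)]
shuffled_line = [(0, 0), (2, 0), (1, 1)]  # same points, different path
```

Here `linestrings_equal(line, reversed_line)` holds while `linestrings_equal(line, shuffled_line)` does not, whereas the sort-based check also accepts `shuffled_line`, which is the false positive the comment describes.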
I have a few more questions. It's definitely getting closer.
python/cuspatial/cuspatial/tests/binpreds/test_equals_only_binpreds.py
Last couple questions.
python/cuspatial/cuspatial/tests/binpreds/test_equals_only_binpreds.py
…ires comparing against the linestring and its reverse.
Hope this is coming along to your liking @isVoid. As for more |
Thanks. There's a typo in the test and 1 question for doc. Approving.
python/cuspatial/cuspatial/tests/binpreds/test_equals_only_binpreds.py
/merge
* Implement `geom_equals` and binary predicates that depend only on it. (#926) This PR implements binary predicates that depend only on equality, which is implemented here using columnar comparison in python. I'm playing with benchmarks of this feature now. On only Point geometries, we begin to outperform geopandas at 50k points, with 60x performance at 10m points. Authors: - H. Thomson Comer (https://github.com/thomcom) Approvers: - Michael Wang (https://github.com/isVoid) URL: #926 * Add python API `pairwise_point_polygon_distance` (#988) This PR closes #756 , add `pairwise_point_polygon_distance` for python. Depend on #984 #976 Authors: - Michael Wang (https://github.com/isVoid) Approvers: - H. Thomson Comer (https://github.com/thomcom) URL: #988 * Add `dependency-file-generator` as `pre-commit` hook (#1008) Similarly to these [cudf](rapidsai/cudf#12819) and [cuml](rapidsai/cuml#5246) PRs, this PR adds an entry to `.pre-commit-config.yaml` to run the [dependency-file-generator](https://github.com/rapidsai/dependency-file-generator). It also adds an argument to the `rapidsai/shared-action-workflows/.github/workflows/checks.yaml` shared workflow to disable the `dependency-file-generator` from running in that shared workflow. This avoids having the `dependency-file-generator` run in two places since pre-commit is run in CI [here](https://github.com/rapidsai/cuspatial/blob/branch-23.04/ci/check_style.sh#L23). Authors: - AJ Schmidt (https://github.com/ajschmidt8) Approvers: - Ray Douglass (https://github.com/raydouglass) URL: #1008 * Add ZipCode Counting Notebook (#919) This PR adds a notebook that demonstrate the use of quadtree PiP with a custom `QuadTree` structure and joins dataframe. We can use this PR to discuss the possibility of adding the custom structure to the codebase. This also updates the dependency list to include notebook environment in "all" targeted conda envs. Authors: - Michael Wang (https://github.com/isVoid) Approvers: - H. 
Thomson Comer (https://github.com/thomcom) - Mark Harris (https://github.com/harrism) - AJ Schmidt (https://github.com/ajschmidt8) URL: #919 * Header-only `quadtree_point_in_polygon` (#979) Closes #985 Also contains cleanup of docs for other spatial join functions, correct ordering of stream and MR parameters, and adds missing C++17 property from tests cmake configuration. Authors: - Mark Harris (https://github.com/harrism) Approvers: - Michael Wang (https://github.com/isVoid) - Paul Taylor (https://github.com/trxcllnt) URL: #979 * Reduce gtest times (#1018) Fixes #1017. Reduces C++ gtest total time (on my PC) from 47.9 seconds to 20.08 seconds. Several tests were running large datasets and combinations of size parameters that would be better to run as benchmarks rather than gtests. Reducing these by a factor of 10-100 saves a lot of development time and still exercises the code. In the case of `HausdorffTest/1.10000Spaces10Points (4850 ms)`, reducing it to 1000 spaces, 10 points reduced the time by nearly 100x, likely because it's $O(N^2)$. I modified any test that used close to 1s or more total time, since most column-API tests use under that, and most header-only tests use under 0.2s. | Test | Time Before (s) | Time After (s) | Speedup | |---|---|---|---| | DERIVE_TRAJECTORIES_TEST_EXP | 14.49 | 0.27 | 53.7x | | HAUSDORFF_TEST_EXP | 9.21 | 0.26 | 35.4x | | UTILITY_TEST | 1.86 | 0.30 | 6.2x | | POINT_BOUNDING_BOXES_TEST_EXP | 1.35 | 0.15 | 9x | | TRAJECTORY_DISTANCES_AND_SPEEDS_TEST_EXP | 0.80 | 0.13 | 6.2x | | TOTAL | 47.9 | 20.08 | 2.4x | Before: ``` (rapids) coder ➜ ~/cuspatial/cpp/build/release $ ninja test [0/1] Running tests... Test project /home/coder/cuspatial/cpp/build/release Start 1: SINUSOIDAL_PROJECTION_TEST 1/45 Test #1: SINUSOIDAL_PROJECTION_TEST ................. Passed 0.81 sec Start 2: HAVERSINE_TEST 2/45 Test #2: HAVERSINE_TEST ............................. 
Passed 0.77 sec Start 3: HAUSDORFF_TEST 3/45 Test #3: HAUSDORFF_TEST ............................. Passed 0.75 sec Start 4: JOIN_POINT_TO_LINESTRING_SMALL_TEST 4/45 Test #4: JOIN_POINT_TO_LINESTRING_SMALL_TEST ........ Passed 0.73 sec Start 5: JOIN_POINT_IN_POLYGON_TEST 5/45 Test #5: JOIN_POINT_IN_POLYGON_TEST ................. Passed 0.79 sec Start 6: POINT_IN_POLYGON_TEST 6/45 Test #6: POINT_IN_POLYGON_TEST ...................... Passed 0.80 sec Start 7: PAIRWISE_POINT_IN_POLYGON_TEST 7/45 Test #7: PAIRWISE_POINT_IN_POLYGON_TEST ............. Passed 0.76 sec Start 8: POINT_QUADTREE_TEST 8/45 Test #8: POINT_QUADTREE_TEST ........................ Passed 0.76 sec Start 9: LINESTRING_BOUNDING_BOXES_TEST 9/45 Test #9: LINESTRING_BOUNDING_BOXES_TEST ............. Passed 0.76 sec Start 10: POLYGON_BOUNDING_BOXES_TEST 10/45 Test #10: POLYGON_BOUNDING_BOXES_TEST ................ Passed 0.80 sec Start 11: POINT_DISTANCE_TEST 11/45 Test #11: POINT_DISTANCE_TEST ........................ Passed 0.79 sec Start 12: POINT_LINESTRING_DISTANCE_TEST 12/45 Test #12: POINT_LINESTRING_DISTANCE_TEST ............. Passed 0.78 sec Start 13: LINESTRING_DISTANCE_TEST 13/45 Test #13: LINESTRING_DISTANCE_TEST ................... Passed 0.78 sec Start 14: POINT_POLYGON_DISTANCE_TEST 14/45 Test #14: POINT_POLYGON_DISTANCE_TEST ................ Passed 0.76 sec Start 15: LINESTRING_INTERSECTION_TEST 15/45 Test #15: LINESTRING_INTERSECTION_TEST ............... Passed 0.83 sec Start 16: POINT_LINESTRING_NEAREST_POINT_TEST 16/45 Test #16: POINT_LINESTRING_NEAREST_POINT_TEST ........ Passed 0.77 sec Start 17: QUADTREE_POLYGON_FILTERING_TEST 17/45 Test #17: QUADTREE_POLYGON_FILTERING_TEST ............ Passed 0.79 sec Start 18: QUADTREE_LINESTRING_FILTERING_TEST 18/45 Test #18: QUADTREE_LINESTRING_FILTERING_TEST ......... Passed 0.76 sec Start 19: TRAJECTORY_DISTANCES_AND_SPEEDS_TEST 19/45 Test #19: TRAJECTORY_DISTANCES_AND_SPEEDS_TEST ....... 
Passed 0.79 sec Start 20: DERIVE_TRAJECTORIES_TEST 20/45 Test #20: DERIVE_TRAJECTORIES_TEST ................... Passed 0.76 sec Start 21: TRAJECTORY_BOUNDING_BOXES_TEST 21/45 Test #21: TRAJECTORY_BOUNDING_BOXES_TEST ............. Passed 0.75 sec Start 22: SPATIAL_WINDOW_POINT_TEST 22/45 Test #22: SPATIAL_WINDOW_POINT_TEST .................. Passed 0.75 sec Start 23: UTILITY_TEST 23/45 Test #23: UTILITY_TEST ............................... Passed 1.86 sec Start 24: HAVERSINE_TEST_EXP 24/45 Test #24: HAVERSINE_TEST_EXP ......................... Passed 0.14 sec Start 25: POINT_DISTANCE_TEST_EXP 25/45 Test #25: POINT_DISTANCE_TEST_EXP .................... Passed 0.11 sec Start 26: POINT_LINESTRING_DISTANCE_TEST_EXP 26/45 Test #26: POINT_LINESTRING_DISTANCE_TEST_EXP ......... Passed 0.11 sec Start 27: POINT_POLYGON_DISTANCE_TEST_EXP 27/45 Test #27: POINT_POLYGON_DISTANCE_TEST_EXP ............ Passed 0.13 sec Start 28: HAUSDORFF_TEST_EXP 28/45 Test #28: HAUSDORFF_TEST_EXP ......................... Passed 9.21 sec Start 29: LINESTRING_DISTANCE_TEST_EXP 29/45 Test #29: LINESTRING_DISTANCE_TEST_EXP ............... Passed 0.17 sec Start 30: LINESTRING_INTERSECTION_TEST_EXP 30/45 Test #30: LINESTRING_INTERSECTION_TEST_EXP ........... Passed 0.19 sec Start 31: POINT_LINESTRING_NEAREST_POINT_TEST_EXP 31/45 Test #31: POINT_LINESTRING_NEAREST_POINT_TEST_EXP .... Passed 0.12 sec Start 32: SINUSOIDAL_PROJECTION_TEST_EXP 32/45 Test #32: SINUSOIDAL_PROJECTION_TEST_EXP ............. Passed 0.12 sec Start 33: POINTS_IN_RANGE_TEST_EXP 33/45 Test #33: POINTS_IN_RANGE_TEST_EXP ................... Passed 0.11 sec Start 34: POINT_IN_POLYGON_TEST_EXP 34/45 Test #34: POINT_IN_POLYGON_TEST_EXP .................. Passed 0.12 sec Start 35: PAIRWISE_POINT_IN_POLYGON_TEST_EXP 35/45 Test #35: PAIRWISE_POINT_IN_POLYGON_TEST_EXP ......... Passed 0.11 sec Start 36: DERIVE_TRAJECTORIES_TEST_EXP 36/45 Test #36: DERIVE_TRAJECTORIES_TEST_EXP ............... 
Passed 14.49 sec Start 37: POINT_BOUNDING_BOXES_TEST_EXP 37/45 Test #37: POINT_BOUNDING_BOXES_TEST_EXP .............. Passed 1.35 sec Start 38: POLYGON_BOUNDING_BOXES_TEST_EXP 38/45 Test #38: POLYGON_BOUNDING_BOXES_TEST_EXP ............ Passed 0.11 sec Start 39: LINESTRING_BOUNDING_BOXES_TEST_EXP 39/45 Test #39: LINESTRING_BOUNDING_BOXES_TEST_EXP ......... Passed 0.11 sec Start 40: TRAJECTORY_DISTANCES_AND_SPEEDS_TEST_EXP 40/45 Test #40: TRAJECTORY_DISTANCES_AND_SPEEDS_TEST_EXP ... Passed 0.80 sec Start 41: POINT_QUADTREE_TEST_EXP 41/45 Test #41: POINT_QUADTREE_TEST_EXP .................... Passed 0.12 sec Start 42: OPERATOR_TEST_EXP 42/45 Test #42: OPERATOR_TEST_EXP .......................... Passed 0.14 sec Start 43: FIND_TEST_EXP 43/45 Test #43: FIND_TEST_EXP .............................. Passed 0.13 sec Start 44: JOIN_POINT_IN_POLYGON_SMALL_TEST_EXP 44/45 Test #44: JOIN_POINT_IN_POLYGON_SMALL_TEST_EXP ....... Passed 0.11 sec Start 45: JOIN_POINT_IN_POLYGON_LARGE_TEST_EXP 45/45 Test #45: JOIN_POINT_IN_POLYGON_LARGE_TEST_EXP ....... Passed 0.13 sec 100% tests passed, 0 tests failed out of 45 Total Test time (real) = 47.07 sec ``` After: ``` (rapids) coder ➜ ~/cuspatial/cpp/build/release $ ninja test [0/1] Running tests... Test project /home/coder/cuspatial/cpp/build/release Start 1: SINUSOIDAL_PROJECTION_TEST 1/45 Test #1: SINUSOIDAL_PROJECTION_TEST ................. Passed 0.78 sec Start 2: HAVERSINE_TEST 2/45 Test #2: HAVERSINE_TEST ............................. Passed 0.75 sec Start 3: HAUSDORFF_TEST 3/45 Test #3: HAUSDORFF_TEST ............................. Passed 0.74 sec Start 4: JOIN_POINT_TO_LINESTRING_SMALL_TEST 4/45 Test #4: JOIN_POINT_TO_LINESTRING_SMALL_TEST ........ Passed 0.77 sec Start 5: JOIN_POINT_IN_POLYGON_TEST 5/45 Test #5: JOIN_POINT_IN_POLYGON_TEST ................. Passed 0.76 sec Start 6: POINT_IN_POLYGON_TEST 6/45 Test #6: POINT_IN_POLYGON_TEST ...................... 
Passed 0.78 sec Start 7: PAIRWISE_POINT_IN_POLYGON_TEST 7/45 Test #7: PAIRWISE_POINT_IN_POLYGON_TEST ............. Passed 0.74 sec Start 8: POINT_QUADTREE_TEST 8/45 Test #8: POINT_QUADTREE_TEST ........................ Passed 0.75 sec Start 9: LINESTRING_BOUNDING_BOXES_TEST 9/45 Test #9: LINESTRING_BOUNDING_BOXES_TEST ............. Passed 0.75 sec Start 10: POLYGON_BOUNDING_BOXES_TEST 10/45 Test #10: POLYGON_BOUNDING_BOXES_TEST ................ Passed 0.73 sec Start 11: POINT_DISTANCE_TEST 11/45 Test #11: POINT_DISTANCE_TEST ........................ Passed 0.73 sec Start 12: POINT_LINESTRING_DISTANCE_TEST 12/45 Test #12: POINT_LINESTRING_DISTANCE_TEST ............. Passed 0.74 sec Start 13: LINESTRING_DISTANCE_TEST 13/45 Test #13: LINESTRING_DISTANCE_TEST ................... Passed 0.76 sec Start 14: POINT_POLYGON_DISTANCE_TEST 14/45 Test #14: POINT_POLYGON_DISTANCE_TEST ................ Passed 0.76 sec Start 15: LINESTRING_INTERSECTION_TEST 15/45 Test #15: LINESTRING_INTERSECTION_TEST ............... Passed 0.78 sec Start 16: POINT_LINESTRING_NEAREST_POINT_TEST 16/45 Test #16: POINT_LINESTRING_NEAREST_POINT_TEST ........ Passed 0.77 sec Start 17: QUADTREE_POLYGON_FILTERING_TEST 17/45 Test #17: QUADTREE_POLYGON_FILTERING_TEST ............ Passed 0.75 sec Start 18: QUADTREE_LINESTRING_FILTERING_TEST 18/45 Test #18: QUADTREE_LINESTRING_FILTERING_TEST ......... Passed 0.77 sec Start 19: TRAJECTORY_DISTANCES_AND_SPEEDS_TEST 19/45 Test #19: TRAJECTORY_DISTANCES_AND_SPEEDS_TEST ....... Passed 0.74 sec Start 20: DERIVE_TRAJECTORIES_TEST 20/45 Test #20: DERIVE_TRAJECTORIES_TEST ................... Passed 0.75 sec Start 21: TRAJECTORY_BOUNDING_BOXES_TEST 21/45 Test #21: TRAJECTORY_BOUNDING_BOXES_TEST ............. Passed 0.74 sec Start 22: SPATIAL_WINDOW_POINT_TEST 22/45 Test #22: SPATIAL_WINDOW_POINT_TEST .................. Passed 0.75 sec Start 23: UTILITY_TEST 23/45 Test #23: UTILITY_TEST ............................... 
Passed 0.30 sec Start 24: HAVERSINE_TEST_EXP 24/45 Test #24: HAVERSINE_TEST_EXP ......................... Passed 0.12 sec Start 25: POINT_DISTANCE_TEST_EXP 25/45 Test #25: POINT_DISTANCE_TEST_EXP .................... Passed 0.12 sec Start 26: POINT_LINESTRING_DISTANCE_TEST_EXP 26/45 Test #26: POINT_LINESTRING_DISTANCE_TEST_EXP ......... Passed 0.12 sec Start 27: POINT_POLYGON_DISTANCE_TEST_EXP 27/45 Test #27: POINT_POLYGON_DISTANCE_TEST_EXP ............ Passed 0.13 sec Start 28: HAUSDORFF_TEST_EXP 28/45 Test #28: HAUSDORFF_TEST_EXP ......................... Passed 0.26 sec Start 29: LINESTRING_DISTANCE_TEST_EXP 29/45 Test #29: LINESTRING_DISTANCE_TEST_EXP ............... Passed 0.14 sec Start 30: LINESTRING_INTERSECTION_TEST_EXP 30/45 Test #30: LINESTRING_INTERSECTION_TEST_EXP ........... Passed 0.19 sec Start 31: POINT_LINESTRING_NEAREST_POINT_TEST_EXP 31/45 Test #31: POINT_LINESTRING_NEAREST_POINT_TEST_EXP .... Passed 0.11 sec Start 32: SINUSOIDAL_PROJECTION_TEST_EXP 32/45 Test #32: SINUSOIDAL_PROJECTION_TEST_EXP ............. Passed 0.11 sec Start 33: POINTS_IN_RANGE_TEST_EXP 33/45 Test #33: POINTS_IN_RANGE_TEST_EXP ................... Passed 0.13 sec Start 34: POINT_IN_POLYGON_TEST_EXP 34/45 Test #34: POINT_IN_POLYGON_TEST_EXP .................. Passed 0.11 sec Start 35: PAIRWISE_POINT_IN_POLYGON_TEST_EXP 35/45 Test #35: PAIRWISE_POINT_IN_POLYGON_TEST_EXP ......... Passed 0.14 sec Start 36: DERIVE_TRAJECTORIES_TEST_EXP 36/45 Test #36: DERIVE_TRAJECTORIES_TEST_EXP ............... Passed 0.27 sec Start 37: POINT_BOUNDING_BOXES_TEST_EXP 37/45 Test #37: POINT_BOUNDING_BOXES_TEST_EXP .............. Passed 0.15 sec Start 38: POLYGON_BOUNDING_BOXES_TEST_EXP 38/45 Test #38: POLYGON_BOUNDING_BOXES_TEST_EXP ............ Passed 0.13 sec Start 39: LINESTRING_BOUNDING_BOXES_TEST_EXP 39/45 Test #39: LINESTRING_BOUNDING_BOXES_TEST_EXP ......... Passed 0.12 sec Start 40: TRAJECTORY_DISTANCES_AND_SPEEDS_TEST_EXP 40/45 Test #40: TRAJECTORY_DISTANCES_AND_SPEEDS_TEST_EXP ... 
Passed 0.13 sec Start 41: POINT_QUADTREE_TEST_EXP 41/45 Test #41: POINT_QUADTREE_TEST_EXP .................... Passed 0.14 sec Start 42: OPERATOR_TEST_EXP 42/45 Test #42: OPERATOR_TEST_EXP .......................... Passed 0.14 sec Start 43: FIND_TEST_EXP 43/45 Test #43: FIND_TEST_EXP .............................. Passed 0.15 sec Start 44: JOIN_POINT_IN_POLYGON_SMALL_TEST_EXP 44/45 Test #44: JOIN_POINT_IN_POLYGON_SMALL_TEST_EXP ....... Passed 0.12 sec Start 45: JOIN_POINT_IN_POLYGON_LARGE_TEST_EXP 45/45 Test #45: JOIN_POINT_IN_POLYGON_LARGE_TEST_EXP ....... Passed 0.13 sec 100% tests passed, 0 tests failed out of 45 Total Test time (real) = 20.08 sec ``` Authors: - Mark Harris (https://github.com/harrism) Approvers: - Michael Wang (https://github.com/isVoid) - Paul Taylor (https://github.com/trxcllnt) URL: #1018 * Re-add enabled_check_generated_files:false --------- Co-authored-by: H. Thomson Comer <thomcom@gmail.com> Co-authored-by: Michael Wang <isVoid@users.noreply.github.com> Co-authored-by: AJ Schmidt <ajschmidt8@users.noreply.github.com>
Closes #838
Closes #832
Depends on http://github.com/rapidsai/cudf/pull/12749
CI failure in this PR is caused by the above bug in CUDF.
Description
This PR implements binary predicates that depend only on equality, which is implemented here using columnar comparison in Python.
I'm playing with benchmarks of this feature now. On Point-only geometries, we begin to outperform geopandas at 50k points, with a 60x speedup at 10M points.
Checklist