Sort the mergeable segments before computing merged segments #1207

isVoid · 2023-06-29T19:25:41Z

Description

This PR is the first part of the fix for #1200. The N^2 algorithm in find_and_combine_segments assumes that its input segments are presorted by certain criteria. This PR adds that sorting.

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

cpp/include/cuspatial/detail/find/find_and_combine_segment.cuh

cpp/include/cuspatial/detail/intersection/linestring_intersection.cuh

cpp/include/cuspatial/detail/intersection/linestring_intersection_with_duplicates.cuh

cpp/tests/intersection/linestring_intersection_large_test.cu

harrism

Looks good. Just a few suggestions and a question about perf impact.

cpp/include/cuspatial/detail/find/find_and_combine_segment.cuh

cpp/include/cuspatial/geometry/segment.cuh

cpp/tests/intersection/linestring_intersection_large_test.cu

harrism · 2023-07-17T23:20:52Z

cpp/include/cuspatial/detail/find/find_and_combine_segment.cuh

 #include <thrust/uninitialized_fill.h>

 namespace cuspatial {
 namespace detail {

 /**
 * @internal
- * @brief Kernel to merge segments, naive n^2 algorithm.
+ * @brief Kernel to merge segments. Each thread works on one segment space,
+ * with a naive n^2 algorithm.


Perhaps naive is the wrong word. "brute force"? Is lower complexity a possibility?

Great question. What's the lower bound of the algorithm complexity here? Given a set of segments, to merge all mergeable segments, one needs to look at every segment at least once, so the lower bound is O(N) (N being the number of segments in the set). To achieve that, I assume you would need to design some sort of hashing function. When you look at one segment, the hashing function computes the key and looks in the map to find whether there is an existing segment that can be merged with it. This key would be similar to what we designed in the sorting comparator here: a combination of group id, slope and lower left of the segment. A map is simple in CPU code, but may be quite difficult to implement on device and could introduce an unwanted slowdown.

That said, the current algorithm isn't naive (or brute force) at all. While there is a nested loop here, the sorting has already done part of the hashing work described above. The sorting makes sure that the outer loop only runs against all other segments once in every mergeable group; all subsequent segments are marked merged_flag == 1 after the initial run. So the nested loop is actually on the order of O(N). The aggregated complexity is dominated by the sorting, which is O(N log N).
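To illustrate the argument above, here is a minimal host-side sketch of the sort-then-merge idea. It is not the cuSpatial kernel: segments are simplified to intervals `[lo, hi]` on a line, and the names `Interval`, `space`, `line` and `sort_and_merge` are hypothetical stand-ins for the group id and slope parts of the real sort key.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical, simplified segment: an interval [lo, hi] on a line.
// `space` plays the role of the group id and `line` stands in for the
// slope part of the key. Neither name comes from cuSpatial.
struct Interval {
  int space;
  int line;
  double lo;
  double hi;
};

// Sorts by (space, line, lo), then merges touching or overlapping intervals.
// On return, merged_flag[i] == 1 marks the i-th interval (in sorted order)
// as absorbed into an earlier one.
std::vector<Interval> sort_and_merge(std::vector<Interval> segs,
                                     std::vector<std::uint8_t>& merged_flag)
{
  for (auto const& s : segs) { assert(s.lo <= s.hi); }

  std::sort(segs.begin(), segs.end(), [](Interval const& a, Interval const& b) {
    return std::tie(a.space, a.line, a.lo) < std::tie(b.space, b.line, b.lo);
  });

  merged_flag.assign(segs.size(), 0);
  std::vector<Interval> out;
  for (std::size_t i = 0; i < segs.size(); ++i) {
    if (merged_flag[i]) { continue; }  // already absorbed: no inner loop runs
    Interval cur = segs[i];
    for (std::size_t j = i + 1; j < segs.size(); ++j) {
      bool const same_line = segs[j].space == cur.space && segs[j].line == cur.line;
      // Because the input is sorted, nothing after the first gap can touch cur.
      if (!same_line || segs[j].lo > cur.hi) { break; }
      cur.hi = std::max(cur.hi, segs[j].hi);
      merged_flag[j] = 1;
    }
    out.push_back(cur);
  }
  return out;
}
```

In this sketch each interval is absorbed at most once and inspected at most once more as the element that ends an inner loop, so the nested loop does linear work after the sort.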

harrism

Looks good, thanks!

isVoid · 2023-07-25T16:14:58Z

/merge

EDIT: the thrust bug is a known issue, tracked by NVIDIA/cccl#153.

This PR fixes an issue in `remove_if`. Strangely, when calling `reduce_by_key` with an iterator to integers and `thrust::plus<index_t>()`, the overflow still happens, even though the argument type of the plus operator is by definition a wider integer type and I expected the flags (`uint8_t`) to be cast to that wider type before addition. I have started a thread in the thrust channel to discuss whether this is a bug in thrust. Meanwhile, a quick workaround is to explicitly use a transform iterator to cast the `uint8_t` flags to `index_t` before adding. This should give the correct result.

Fixes #1200
Depends on #1207

Authors:
- Michael Wang (https://github.com/isVoid)
- H. Thomson Comer (https://github.com/thomcom)

Approvers:
- Mark Harris (https://github.com/harrism)
- H. Thomson Comer (https://github.com/thomcom)

URL: #1209
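The overflow described above can be reproduced in plain C++ without thrust. The sketch below is only an analogy: `reduce_with_input_type` and `cast_iterator` are hypothetical helpers written for this illustration, and the assumption that the accumulator takes the input's value type is a model of the observed behaviour, not a statement about thrust's implementation. `index_t` is assumed to be a 32-bit unsigned integer here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iterator>
#include <vector>

using index_t = std::uint32_t;  // assumed wider index type

// Hypothetical reduction whose accumulator uses the input's value type,
// regardless of the argument type of the binary operator.
template <typename InputIt, typename BinaryOp>
auto reduce_with_input_type(InputIt first, InputIt last, BinaryOp op)
{
  using acc_t = typename std::iterator_traits<InputIt>::value_type;
  acc_t acc{};
  for (; first != last; ++first) { acc = static_cast<acc_t>(op(acc, *first)); }
  return acc;
}

// Minimal casting iterator, standing in for a transform iterator.
template <typename It, typename T>
struct cast_iterator {
  using iterator_category = std::input_iterator_tag;
  using value_type        = T;
  using difference_type   = std::ptrdiff_t;
  using pointer           = T const*;
  using reference         = T;
  It it;
  T operator*() const { return static_cast<T>(*it); }
  cast_iterator& operator++() { ++it; return *this; }
  bool operator!=(cast_iterator const& other) const { return it != other.it; }
};

// Accumulates in uint8_t, so the count wraps modulo 256.
index_t count_flags_narrow(std::vector<std::uint8_t> const& flags)
{
  return reduce_with_input_type(flags.begin(), flags.end(), std::plus<index_t>{});
}

// Widens each flag to index_t before the reduction sees it.
index_t count_flags_widened(std::vector<std::uint8_t> const& flags)
{
  using it_t = cast_iterator<std::vector<std::uint8_t>::const_iterator, index_t>;
  return reduce_with_input_type(it_t{flags.begin()}, it_t{flags.end()}, std::plus<index_t>{});
}

// Usage: 300 set flags wrap to 44 in 8 bits but count correctly once widened.
inline void demo()
{
  std::vector<std::uint8_t> flags(300, 1);
  assert(count_flags_narrow(flags) == 44);
  assert(count_flags_widened(flags) == 300);
}
```

Widening at the iterator changes the value type that the reduction observes, which is why the workaround does not depend on how the operator itself is typed.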

isVoid added 5 commits June 28, 2023 17:11

add long input test case

545344b

sort the segments before merging

cbd803f

add twospaces find and combine test

04d551e

add twospaces, non-contiguous input

bbc2dce

Remove stale includes

83d19af

isVoid requested a review from a team as a code owner June 29, 2023 19:25

isVoid requested review from trxcllnt and harrism June 29, 2023 19:25

github-actions bot added the libcuspatial Relates to the cuSpatial C++ library label Jun 29, 2023

isVoid added 2 commits June 29, 2023 12:27

add documentation

4a822e5

add test for overlapping and non-contiguous inputs

19fc646

isVoid commented Jun 30, 2023

isVoid added 3 commits June 29, 2023 17:52

Remove stale debug prints

f000007

Update documentation

3ec5634

remove large intersection test

8713c49

isVoid added bug Something isn't working non-breaking Non-breaking change labels Jun 30, 2023

isVoid mentioned this pull request Jun 30, 2023

Fix overflowing in intersection_intermediates.remove_if #1209

Merged

3 tasks

harrism requested changes Jul 4, 2023

applied review comments

bac1a36

isVoid requested a review from a team as a code owner July 14, 2023 04:58

github-actions bot added the Python Related to Python code label Jul 14, 2023

isVoid added 3 commits July 14, 2023 06:47

Add streams to allocate_like call

03b8338

Add streams to allocate_like call

571318c

address review changes

76d066e

isVoid requested a review from harrism July 14, 2023 07:43

isVoid added 3 commits July 16, 2023 20:24

Merge branch 'branch-23.08' of https://github.com/rapidsai/cuspatial …

6d0ae8b

…into fix/1200

Merge branch 'fix/allocate_like' into fix/1200

a3fc039

fix collinear test bug

5478f37

thomcom mentioned this pull request Jul 17, 2023

MultiGeometry binary predicate support. #1220

Draft

3 tasks

harrism reviewed Jul 17, 2023

harrism approved these changes Jul 17, 2023

thomcom approved these changes Jul 21, 2023

isVoid added 5 commits July 25, 2023 09:30

style

37d3808

Add comments to explain algorithm complexity

5f3f9e1

Merge branch 'branch-23.08' into fix/1200

c860c57

Change test case result order due to sorting

6f13294

Merge branch 'fix/1200' of github.com:isVoid/cuspatial into fix/1200

6b82b34

rapids-bot bot merged commit 8ee773c into rapidsai:branch-23.08 Jul 25, 2023
