Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add geometry_generator factory for programmatic generation of geometry arrays #998

Merged

Conversation

isVoid
Copy link
Contributor

@isVoid isVoid commented Mar 17, 2023

Description

To allow easy benchmarking for future APIs, I created geometry_generator that programmatically generate a geoarrow compliant geometry arrays. The first feature landing is generate_multipolygon_array.

This PR also adds BaseFixtureWithParam, which allows user to parameterize the tests with value and ranges.

Contributes to #259

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added cmake Related to CMake code or build configuration libcuspatial Relates to the cuSpatial C++ library labels Mar 17, 2023
@isVoid isVoid self-assigned this Mar 17, 2023
@isVoid isVoid marked this pull request as ready for review March 20, 2023 05:28
@isVoid isVoid requested review from a team as code owners March 20, 2023 05:28
@isVoid isVoid requested review from harrism and trxcllnt March 20, 2023 05:28
@isVoid isVoid changed the title [skip-ci] Add geometry_generator factory for programmatic generation of geometry arrays Add geometry_generator factory for programmatic generation of geometry arrays Mar 20, 2023
@isVoid isVoid requested a review from thomcom March 20, 2023 05:28
@isVoid isVoid added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change feature request New feature or request and removed improvement Improvement / enhancement to an existing function labels Mar 20, 2023
@isVoid isVoid mentioned this pull request Mar 20, 2023
3 tasks
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments. Thanks for this utility!

cpp/include/cuspatial_test/base_fixture.hpp Outdated Show resolved Hide resolved
cpp/include/cuspatial_test/geometry_generator.cuh Outdated Show resolved Hide resolved
cpp/include/cuspatial_test/geometry_generator.cuh Outdated Show resolved Hide resolved
cpp/include/cuspatial_test/geometry_generator.cuh Outdated Show resolved Hide resolved
cpp/include/cuspatial_test/geometry_generator.cuh Outdated Show resolved Hide resolved
cpp/include/cuspatial_test/geometry_generator.cuh Outdated Show resolved Hide resolved
cpp/tests/utility_test/test_float_equivalent.cu Outdated Show resolved Hide resolved
@isVoid
Copy link
Contributor Author

isVoid commented Mar 22, 2023

/merge

@rapids-bot rapids-bot bot merged commit 515e0c4 into rapidsai:branch-23.04 Mar 22, 2023
rapids-bot bot pushed a commit that referenced this pull request May 23, 2023
This PR separates the `pairwise_point_polygon_distance` benchmark portion of PR #1002. While that PR is only left for nvtx3 experiments.

# Original PR description:

This PR adds pairwise point polygon distance benchmark. Depends on #998

Point-polygon distance performance can be affected by many factors, because the geometry is complex in nature. I benchmarked these questions:
1. How does the algorithm scales with simple multipolygons?
2. How does it scales with complex multipolygons?

## How does the algorithm scales with simple multipolygons?
The benchmark uses the most simple multipolygon, 3 sides per polygon, 0 hole and 1 polygon per multipolygon.

Float32
| Num   multipolygon | Throughput   (#multipolygons / s) |
| --- | --- |
| 1 | 28060.32971 |
| 100 | 2552687.469 |
| 10000 | 186044781 |
| 1000000 | 1047783101 |
| 100000000 | 929537385.2 | 

Float64
| Num   multipolygon | Throughput   (#multipolygons / s) |
| --- | --- |
| 1 | 28296.94817 |
| 100 | 2491541.218 |
| 10000 | 179379919.5 |
| 1000000 | 854678939.9 |
| 100000000 | 783364410.7 |

![image](https://user-images.githubusercontent.com/13521008/226502300-c3273d80-5f9f-4d53-b961-a24e64216e9b.png)

The chart shows that with simple polygons and simple multipoint (1 point per multipoint), the algorithm scales pretty nicely. Throughput is maxed out at near 1M pairs.

## How does the algorithm scales with complex multipolygons?

The benchmark uses a complex multipolygon, 100 edges per ring, 10 holes per polygon and 3 polygons per multipolygon.

float32
Num   multipolygon | Throughput   (#multipolygons / s)
-- | --
1000 | 158713.2377
10000 | 345694.2642
100000 | 382849.058

float64
Num   multipolygon | Throughput   (#multipolygons / s)
-- | --
1000 | 148727.1246
10000 | 353141.9758
100000 | 386007.3016

![image](https://user-images.githubusercontent.com/13521008/226502732-0d116db7-6257-4dec-b170-c42b30df9cea.png)

The algorithm reaches max throughput at near 10K pairs. About 100X lower than the simple multipolygon example.

Authors:
  - Michael Wang (https://github.com/isVoid)
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Mark Harris (https://github.com/harrism)

URL: #1131
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cmake Related to CMake code or build configuration feature request New feature or request libcuspatial Relates to the cuSpatial C++ library non-breaking Non-breaking change
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

2 participants