Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create pairwise_point_in_polygon to be used by pairwise GeoSeries #750

Merged

Conversation

thomcom
Copy link
Contributor

@thomcom thomcom commented Oct 24, 2022

Description

This PR provides two primary outcomes: it adds a pairwise_point_in_polygon API in the header-only and cudf-backed C++ functionality, and it modifies the is_point_in_polygon kernel to perform boundary exclusion.

The pairwise_point_in_polygon API has been added because GeoPandas.GeoSeries.contains only supports a pairwise operation and a many-to-one operation, but does not implement the all-pairs approach to point-in-polygon we developed.

Boundary exclusion has been added for two reasons: The main reason is because GeoPandas.GeoSeries.contains applies boundary exclusion: points that are in the boundary of a polygon are not considered to be "in" the polygon. The other reason to add boundary exclusion is because our results were previously undefined. Some boundary points were excluded in point-in-polygon, and others were not. This PR resolves that inconsistency.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@thomcom thomcom requested review from a team as code owners October 24, 2022 17:39
@github-actions github-actions bot added cmake Related to CMake code or build configuration libcuspatial Relates to the cuSpatial C++ library Python Related to Python code labels Oct 24, 2022
@thomcom thomcom self-assigned this Oct 24, 2022
@thomcom thomcom added 4 - Needs Reviewer Waiting for reviewer to review or respond feature request New feature or request non-breaking Non-breaking change labels Oct 24, 2022
@thomcom thomcom added this to the DE-9IM milestone Oct 24, 2022
Copy link
Contributor

@isVoid isVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add a long input test to have more point/polygon points than the size of a grid.

Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need an epsilon when comparing to zero...

Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the experimental epsilon code with printfs and my question about the utility of pairwise PiP for big data, looks good.

Copy link
Contributor

@isVoid isVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last few comments

@harrism
Copy link
Member

harrism commented Nov 7, 2022

rerun tests

Copy link
Contributor

@isVoid isVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments above, adopt if you see fit.

@harrism
Copy link
Member

harrism commented Nov 29, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 924b570 into rapidsai:branch-22.12 Nov 29, 2022
rapids-bot bot pushed a commit that referenced this pull request Nov 30, 2022
…749)

This PR closes the above named issues relating to creating a .contains method and, more importantly, resolving boundary case inconsistency with `point_in_polygon`.

~As it stands the colinearity test I've added to `is_point_in_polygon` doubles the runtime of brute-force `point_in_polygon` and has no visible effect on the runtime of `quadtree_point_in_polygon`.~

~- Note I need to double check the above benchmark, having set this project down for the last few weeks.~

This depends on #750, please do not review the C++ code here until that PR is merged. Please do review the python code.

## Benchmark

Benchmark results are in, looks like there's no measurable speed difference between 22.12 pre-boundary exclusion and our current implementation:

```
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ pytest api/bench_api.py::bench_point_in_polygon
================================================== test session starts ===================================================
platform linux -- Python 3.8.15, pytest-7.2.0, pluggy-1.0.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/tcomer/mnt/NVIDIA/rapids-docker/cuspatial/python/cuspatial/benchmarks, configfile: pytest.ini
plugins: cov-4.0.0, benchmark-4.0.0, cases-3.6.13, xdist-3.0.2, anyio-3.6.2, hypothesis-6.58.1
collected 1 item                                                                                                         

api/bench_api.py .                                                                                                 [100%]


---------------------------------------------- benchmark: 1 tests ---------------------------------------------
Name (time in s)              Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------
bench_point_in_polygon     1.9636  1.9749  1.9678  0.0043  1.9660  0.0045       1;0  0.5082       5           1
---------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=================================================== 1 passed in 16.28s ===================================================
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ git status
On branch feature/GeoSeries.contains
```
vs `branch-22.12`
```
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ pytest api/bench_api.py::bench_point_in_polygon
================================== test session starts ===================================
platform linux -- Python 3.8.15, pytest-7.2.0, pluggy-1.0.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/tcomer/mnt/NVIDIA/rapids-docker/cuspatial/python/cuspatial/benchmarks, configfile: pytest.ini
plugins: cov-4.0.0, benchmark-4.0.0, cases-3.6.13, xdist-3.0.2, anyio-3.6.2, hypothesis-6.58.1
collected 1 item                                                                         

api/bench_api.py .                                                                 [100%]


---------------------------------------------- benchmark: 1 tests ---------------------------------------------
Name (time in s)              Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------
bench_point_in_polygon     1.9516  1.9843  1.9730  0.0126  1.9760  0.0127       1;0  0.5068       5           1
---------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=================================== 1 passed in 16.61s ===================================
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ git status
On branch benchmark/branch-22.12
```

## Still adding:
- [x] Detailed description of xfail result.
- [x] Self-review existing `.contains` implementation in python.
- [x] Update `.contains` docs when necessary.
- [x] Benchmark again and document here.
- [x] Move binops_with_quadtree.py to next branch.
- [x] `.contains` Examples

Authors:
  - H. Thomson Comer (https://github.com/thomcom)

Approvers:
  - Michael Wang (https://github.com/isVoid)
  - Mark Harris (https://github.com/harrism)

URL: #749
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4 - Needs Reviewer Waiting for reviewer to review or respond cmake Related to CMake code or build configuration feature request New feature or request libcuspatial Relates to the cuSpatial C++ library non-breaking Non-breaking change Python Related to Python code
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Benchmark boundary exclusion [FEA] Determine if point-in-polygon is exterior inclusive or not.
3 participants