-
Notifications
You must be signed in to change notification settings - Fork 744
[GH-2504] Implement force_3d #2512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
petern48
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just some minor feedback about improving the docs and the tests. Should be good to merge afterwards.
| >>> from shapely import Polygon, LineString, Point | ||
| >>> s = geopandas.GeoSeries( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| >>> from shapely import Polygon, LineString, Point | |
| >>> s = geopandas.GeoSeries( | |
| >>> from shapely import Polygon, LineString, Point | |
| >>> from sedona.spark.geopandas import GeoSeries | |
| >>> s = GeoSeries( |
nit
| 2D geometries will get the provided Z coordinate; 3D geometries | ||
| are unchanged (unless their Z coordinate is ``np.nan``). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test for this in test_geoseries.py? I see that this is from the original geopandas docstring, so I'm not entirely sure if we follow the same behavior naturally in Sedona. If we don't let me know, and we can figure out what to do next.
We could test something like whether Point(1, 1, np.nan) becomes Point(1, 1, 3) when calling .force_3d(3)
| Note that for empty geometries, 3D is only supported since GEOS 3.9 and then | ||
| still only for simple geometries (non-collections). | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Note that for empty geometries, 3D is only supported since GEOS 3.9 and then | |
| still only for simple geometries (non-collections). |
I think we can delete this part. The GEOS 3.9 part doesn't apply to use because we're built not on GEOS (which is a C library). And I think we do support collections since the test_match_geopandas_series test passed naturally (self.geoms in that test contains collections). Could you add a single test in test_geoseries.py that tests a GEOMETRYCOLLECTION with a non-zero z-arg?
e.g. GeometryCollection(Point(1, 1), LineString([(1, 1), (0, 1), (1, 0)])) and force_3d(3)
| sgpd_result = GeoSeries(geom).force_3d() | ||
| gpd_result = gpd.GeoSeries(geom).force_3d() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| sgpd_result = GeoSeries(geom).force_3d() | |
| gpd_result = gpd.GeoSeries(geom).force_3d() | |
| sgpd_result = GeoSeries(geom).force_3d(4) | |
| gpd_result = gpd.GeoSeries(geom).force_3d(4) |
This is a minor change, and I don't think Sedona would fail it, but I think this would be a slightly better test if we test a non-zero z argument. These "match" tests in test_match_geopandas_series.py are the most comprehensive/powerful, so I'd prefer to avoid a common default case like 0. e.g. the chances that we have a bug that always set z=0 in some case is much higher than the chance that it always set z=4.
@petern48 Thanks Peter, Great insights as always. Just addressed them. |
| LineString([(1, 1, 2), (0, 1, 2), (1, 0, 2)]), | ||
| Polygon([(0, 0, 3), (0, 10, 3), (10, 10, 3), (0, 0, 3)]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| LineString([(1, 1, 2), (0, 1, 2), (1, 0, 2)]), | |
| Polygon([(0, 0, 3), (0, 10, 3), (10, 10, 3), (0, 0, 3)]), | |
| LineString([(1, 1, 3), (0, 1, 3), (1, 0, 3)]), | |
| Polygon([(0, 0, 4), (0, 10, 4), (10, 10, 4), (0, 0, 4)]), |
I noticed our tests here were slightly buggy. Here's what the expected results should look like. These geoms correspond to the 3 and 4 in z = [0, 2, 2, 3, 4, 5]. What's concerning is that the tests pass anyway. Experimenting locally, I noticed we can change the z-coordinate however we like, and the tests still pass. I think what's happening is that the check_sgpd_equals_gpd() doesn't yet check anything past the x and y dimensions...
(sigh) Unfortunately, what this means is that we'll need to modify the test code to test this properly before we can merge this PR. I filed an issue for this: #2513. You're welcome to try taking a stab at it or just move on to something else (while keeping this PR open). My time is limited, so I wouldn't be able to investigate it for at least a few days. I don't love having to put a hold on merging your great work here, but this is just how it is sometimes. Sorry about this :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, Thanks for the insights Peter. No problem, I'll move on to a new one.
9fafeb5 to
f90875a
Compare
|
@petern48 I made some adjustments to the tests in test_match_geopandas_series.py because the previous assertions were failing for M and ZM geometry cases. Please review the changes and let me know if there’s a better or cleaner approach. Feedback is welcome. |
|
Responding to the deletions in this commit: f90875a. Here are the failures from the test_match_geopandas tests (I ran locally). Next time, ideally report back when you see behavior like this, instead of relying on me to investigate myself (sometimes, it may take longer that way). Sometimes, it's reasonable to skip; sometimes it's major enough to just put a halt to it. Alright let's see. Error for LinearRing: Sedona is returning the equivalent of a LineString. Sedona doesn't support LinearRings since LineStrings are enough. Yep this is fine to skip ✅. Error for MultiPolygon: Ah interesting, it's the same geometry, but we return polygon instead of multipolygon. This is a minor bug in Sedona we found. Ideally, we want to return MultiPolygon. I dug in real quick and submitted a fix for this: #2526 Trying it on PostGIS, I see returning MultiPolygon is the desired behavior as well Error for GeometryCollection Fails for the same reason as multipolygon (since it has a multipolygon inside of it). Error for M coordinates. Hmm, yeah this one is interesting. Give me a bit more time to figure out what to do here. |
@petern48 Sure, I'll report the anomalies from next time, Thanks for a very clear explanation. Take your time. Thanks |
Did you read the Contributor Guide?
Is this PR related to a ticket?
[GH-XXX] my subject. Closes Geopandas: Implement force_3d #2504What changes were proposed in this PR?
How was this patch tested?
Did this PR include necessary documentation updates?