Skip to content

Conversation

@chay0112
Copy link
Contributor

Did you read the Contributor Guide?

Is this PR related to a ticket?

What changes were proposed in this PR?

  • Implement force_3d

How was this patch tested?

  • Included unit and parity tests

Did this PR include necessary documentation updates?

  • Yes, I have updated the documentation.

Copy link
Collaborator

@petern48 petern48 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just some minor feedback about improving the docs and the tests. Should be good to merge afterwards.

Comment on lines 972 to 973
>>> from shapely import Polygon, LineString, Point
>>> s = geopandas.GeoSeries(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
>>> from shapely import Polygon, LineString, Point
>>> s = geopandas.GeoSeries(
>>> from shapely import Polygon, LineString, Point
>>> from sedona.spark.geopandas import GeoSeries
>>> s = GeoSeries(

nit

Comment on lines +955 to +956
2D geometries will get the provided Z coordinate; 3D geometries
are unchanged (unless their Z coordinate is ``np.nan``).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for this in test_geoseries.py? I see that this is from the original geopandas docstring, so I'm not entirely sure if we follow the same behavior naturally in Sedona. If we don't let me know, and we can figure out what to do next.

We could test something like whether Point(1, 1, np.nan) becomes Point(1, 1, 3) when calling .force_3d(3)

Comment on lines 958 to 960
Note that for empty geometries, 3D is only supported since GEOS 3.9 and then
still only for simple geometries (non-collections).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Note that for empty geometries, 3D is only supported since GEOS 3.9 and then
still only for simple geometries (non-collections).

I think we can delete this part. The GEOS 3.9 part doesn't apply to use because we're built not on GEOS (which is a C library). And I think we do support collections since the test_match_geopandas_series test passed naturally (self.geoms in that test contains collections). Could you add a single test in test_geoseries.py that tests a GEOMETRYCOLLECTION with a non-zero z-arg?

e.g. GeometryCollection(Point(1, 1), LineString([(1, 1), (0, 1), (1, 0)])) and force_3d(3)

Comment on lines 886 to 887
sgpd_result = GeoSeries(geom).force_3d()
gpd_result = gpd.GeoSeries(geom).force_3d()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sgpd_result = GeoSeries(geom).force_3d()
gpd_result = gpd.GeoSeries(geom).force_3d()
sgpd_result = GeoSeries(geom).force_3d(4)
gpd_result = gpd.GeoSeries(geom).force_3d(4)

This is a minor change, and I don't think Sedona would fail it, but I think this would be a slightly better test if we test a non-zero z argument. These "match" tests in test_match_geopandas_series.py are the most comprehensive/powerful, so I'd prefer to avoid a common default case like 0. e.g. the chances that we have a bug that always set z=0 in some case is much higher than the chance that it always set z=4.

@chay0112
Copy link
Contributor Author

chay0112 commented Nov 20, 2025

Looks good. Just some minor feedback about improving the docs and the tests. Should be good to merge afterwards.

@petern48 Thanks Peter, Great insights as always. Just addressed them.

Comment on lines 1584 to 1585
LineString([(1, 1, 2), (0, 1, 2), (1, 0, 2)]),
Polygon([(0, 0, 3), (0, 10, 3), (10, 10, 3), (0, 0, 3)]),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LineString([(1, 1, 2), (0, 1, 2), (1, 0, 2)]),
Polygon([(0, 0, 3), (0, 10, 3), (10, 10, 3), (0, 0, 3)]),
LineString([(1, 1, 3), (0, 1, 3), (1, 0, 3)]),
Polygon([(0, 0, 4), (0, 10, 4), (10, 10, 4), (0, 0, 4)]),

I noticed our tests here were slightly buggy. Here's what the expected results should look like. These geoms correspond to the 3 and 4 in z = [0, 2, 2, 3, 4, 5]. What's concerning is that the tests pass anyway. Experimenting locally, I noticed we can change the z-coordinate however we like, and the tests still pass. I think what's happening is that the check_sgpd_equals_gpd() doesn't yet check anything past the x and y dimensions...

(sigh) Unfortunately, what this means is that we'll need to modify the test code to test this properly before we can merge this PR. I filed an issue for this: #2513. You're welcome to try taking a stab at it or just move on to something else (while keeping this PR open). My time is limited, so I wouldn't be able to investigate it for at least a few days. I don't love having to put a hold on merging your great work here, but this is just how it is sometimes. Sorry about this :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, Thanks for the insights Peter. No problem, I'll move on to a new one.

@petern48
Copy link
Collaborator

Alright @chay0112, the fix has been merged. Think this should be ready once that last suggestion is accepted.

@chay0112
Copy link
Contributor Author

Alright @chay0112, the fix has been merged. Think this should be ready once that last suggestion is accepted.

Thanks for your fix Peter, I believe I need to rephrase my tests in test_match_geopandas_series.py because things like Line Strings to Linear Rings and Polygons to Multipolygons would fail

@chay0112
Copy link
Contributor Author

@petern48 I made some adjustments to the tests in test_match_geopandas_series.py because the previous assertions were failing for M and ZM geometry cases. Please review the changes and let me know if there’s a better or cleaner approach. Feedback is welcome.

@petern48
Copy link
Collaborator

petern48 commented Nov 25, 2025

Responding to the deletions in this commit: f90875a. Here are the failures from the test_match_geopandas tests (I ran locally). Next time, ideally report back when you see behavior like this, instead of relying on me to investigate myself (sometimes, it may take longer that way). Sometimes, it's reasonable to skip; sometimes it's major enough to just put a halt to it. Alright let's see.

Error for LinearRing:

ValueError: Geometry equality check failed for LINESTRING Z (0 0 4, 1 0 4, 1 1 4, 0 1 4, 0 0 4) and LINEARRING Z (0 0 4, 1 0 4, 1 1 4, 0 1 4, 0 0 4)

Sedona is returning the equivalent of a LineString. Sedona doesn't support LinearRings since LineStrings are enough. Yep this is fine to skip ✅.

Error for MultiPolygon:

ValueError: Geometry equality check failed for POLYGON Z ((0 0 4, 0 1 4, 1 0 4, 0 0 4), (0.1 0.1 4, 0.1 0.2 4, 0.2 0.1 4, 0.1 0.1 4)) and MULTIPOLYGON Z (((0 0 4, 0 1 4, 1 0 4, 0 0 4), (0.1 0.1 4, 0.1 0.2 4, 0.2 0.1 4, 0.1 0.1 4))

Ah interesting, it's the same geometry, but we return polygon instead of multipolygon. This is a minor bug in Sedona we found. Ideally, we want to return MultiPolygon. I dug in real quick and submitted a fix for this: #2526

Trying it on PostGIS, I see returning MultiPolygon is the desired behavior as well

SELECT ST_AsText(ST_Force3D(ST_GeomFromText('MULTIPOLYGON(((0 0, 0 1, 1 0, 0 0), (0.1 0.1, 0.1 0.2, 0.2 0.1, 0.1 0.1)))'), 4));
-- MULTIPOLYGON Z (((0 0 4,0 1 4,1 0 4,0 0 4),(0.1 0.1 4,0.1 0.2 4,0.2 0.1 4,0.1 0.1 4)))

Error for GeometryCollection

ValueError: Geometry equality check failed for GEOMETRYCOLLECTION Z (MULTIPOINT Z ((0 0 4), (1 1 4)), MULTILINESTRING Z ((0 0 4, 1 1 4), (2 2 4, 3 3 4)), POLYGON Z ((0 0 4, 0 1 4, 1 0 4, 0 0 4), (0.1 0.1 4, 0.1 0.2 4, 0.2 0.1 4, 0.1 0.1 4))) and GEOMETRYCOLLECTION Z (MULTIPOINT Z ((0 0 4), (1 1 4)), MULTILINESTRING Z ((0 0 4, 1 1 4), (2 2 4, 3 3 4)), MULTIPOLYGON Z (((0 0 4, 0 1 4, 1 0 4, 0 0 4), (0.1 0.1 4, 0.1 0.2 4, 0.2 0.1 4, 0.1 0.1 4))))

Fails for the same reason as multipolygon (since it has a multipolygon inside of it).

Error for M coordinates. Hmm, yeah this one is interesting. Give me a bit more time to figure out what to do here.

# For POINT M (1 2 3) input
ValueError: Geometry equality check failed for POINT Z (1 2 7.5) and POINT M (1 2 3)

# For POINT ZM (1 2 3 4) input
ValueError: Geometry equality check failed for POINT Z (1 2 3) and POINT Z (1 2 7.5)

@chay0112
Copy link
Contributor Author

Responding to the deletions in this commit: f90875a. Here are the failures from the test_match_geopandas tests (I ran locally). Next time, ideally report back when you see behavior like this, instead of relying on me to investigate myself (sometimes, it may take longer that way). Sometimes, it's reasonable to skip; sometimes it's major enough to just put a halt to it. Alright let's see.

Error for LinearRing:

ValueError: Geometry equality check failed for LINESTRING Z (0 0 4, 1 0 4, 1 1 4, 0 1 4, 0 0 4) and LINEARRING Z (0 0 4, 1 0 4, 1 1 4, 0 1 4, 0 0 4)

Sedona is returning the equivalent of a LineString. Sedona doesn't support LinearRings since LineStrings are enough. Yep this is fine to skip ✅.

Error for MultiPolygon:

ValueError: Geometry equality check failed for POLYGON Z ((0 0 4, 0 1 4, 1 0 4, 0 0 4), (0.1 0.1 4, 0.1 0.2 4, 0.2 0.1 4, 0.1 0.1 4)) and MULTIPOLYGON Z (((0 0 4, 0 1 4, 1 0 4, 0 0 4), (0.1 0.1 4, 0.1 0.2 4, 0.2 0.1 4, 0.1 0.1 4))

Ah interesting, it's the same geometry, but we return polygon instead of multipolygon. This is a minor bug in Sedona we found. Ideally, we want to return MultiPolygon. I dug in real quick and submitted a fix for this: #2526

Trying it on PostGIS, I see returning MultiPolygon is the desired behavior as well

SELECT ST_AsText(ST_Force3D(ST_GeomFromText('MULTIPOLYGON(((0 0, 0 1, 1 0, 0 0), (0.1 0.1, 0.1 0.2, 0.2 0.1, 0.1 0.1)))'), 4));
-- MULTIPOLYGON Z (((0 0 4,0 1 4,1 0 4,0 0 4),(0.1 0.1 4,0.1 0.2 4,0.2 0.1 4,0.1 0.1 4)))

Error for GeometryCollection

ValueError: Geometry equality check failed for GEOMETRYCOLLECTION Z (MULTIPOINT Z ((0 0 4), (1 1 4)), MULTILINESTRING Z ((0 0 4, 1 1 4), (2 2 4, 3 3 4)), POLYGON Z ((0 0 4, 0 1 4, 1 0 4, 0 0 4), (0.1 0.1 4, 0.1 0.2 4, 0.2 0.1 4, 0.1 0.1 4))) and GEOMETRYCOLLECTION Z (MULTIPOINT Z ((0 0 4), (1 1 4)), MULTILINESTRING Z ((0 0 4, 1 1 4), (2 2 4, 3 3 4)), MULTIPOLYGON Z (((0 0 4, 0 1 4, 1 0 4, 0 0 4), (0.1 0.1 4, 0.1 0.2 4, 0.2 0.1 4, 0.1 0.1 4))))

Fails for the same reason as multipolygon (since it has a multipolygon inside of it).

Error for M coordinates. Hmm, yeah this one is interesting. Give me a bit more time to figure out what to do here.

# For POINT M (1 2 3) input
ValueError: Geometry equality check failed for POINT Z (1 2 7.5) and POINT M (1 2 3)

# For POINT ZM (1 2 3 4) input
ValueError: Geometry equality check failed for POINT Z (1 2 3) and POINT Z (1 2 7.5)

@petern48 Sure, I'll report the anomalies from next time, Thanks for a very clear explanation. Take your time. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Geopandas: Implement force_3d

2 participants