-
Notifications
You must be signed in to change notification settings - Fork 31
dev: ST_IsClosed #219
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dev: ST_IsClosed #219
Conversation
|
Not completed, Because, I'm not particularly sure how to implement the match for |
Nice work so far! I think the next thing that would be helpful is for you to figure out what PostGIS outputs for these less obvious cases (e.g point, multipoint). Two ways I can think of
docker exec -it sedona-db-postgis-1 psql -U postgres
# I'm like 90% confident the container name of 'sedona-db-postgis-1' should be the same for you |
| geo_traits::GeometryType::MultiLineString(multilinestring) => { | ||
| Ok(multilinestring.to_multi_line_string().is_closed()) | ||
| } | ||
| _ => internal_err!("Invalid geometry type"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| _ => internal_err!("Invalid geometry type"), | |
| _ => sedona_internal_err!("Invalid geometry type"), |
minor nit while i'm here (I would've probably forgot later), we should instead use sedona_internal_err! here. You can see this PR for why we introduced that. tldr, it's error msg is misleading. Here's the import path use sedona_common::sedona_internal_err;
|
Hi @petern48 @paleolimbot I'm getting this weird error on the python end, Actually, It was the wild card in the pattern matching being triggered |
petern48
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've got to be a little more careful for this one since we're implementing some manual logic, but it's looking great. Nearly there.
| @@ -1,88 +1,2 @@ | |||
| # Licensed to the Apache Software Foundation (ASF) under one | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's undo this change to .gitignore
| geo_traits::GeometryType::Point(_) | ||
| | geo_traits::GeometryType::Line(_) | ||
| | geo_traits::GeometryType::MultiPoint(_) | ||
| | geo_traits::GeometryType::Polygon(_) | ||
| | geo_traits::GeometryType::MultiPolygon(_) | ||
| | geo_traits::GeometryType::Rect(_) | ||
| | geo_traits::GeometryType::Triangle(_) => Ok(true), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| geo_traits::GeometryType::Point(_) | |
| | geo_traits::GeometryType::Line(_) | |
| | geo_traits::GeometryType::MultiPoint(_) | |
| | geo_traits::GeometryType::Polygon(_) | |
| | geo_traits::GeometryType::MultiPolygon(_) | |
| | geo_traits::GeometryType::Rect(_) | |
| | geo_traits::GeometryType::Triangle(_) => Ok(true), | |
| geo_traits::GeometryType::Point(_) | |
| | geo_traits::GeometryType::MultiPoint(_) | |
| | geo_traits::GeometryType::Polygon(_) | |
| | geo_traits::GeometryType::MultiPolygon(_) => Ok(true), | |
| _ => sedona_internal_err!("Invalid geometry type"), |
SedonaDB doesn't support Line, Rect, or Triangle geometry types, so I think it's better to omit them and error on the default case (this is what we do in other functions). If we somehow got those geom types, I'd rather not hide that bug.
| is_geometry_closed(item).map_err(|e| { | ||
| datafusion_common::error::DataFusionError::Execution(format!( | ||
| "Failed to check if geometry is closed: {e}" | ||
| )) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| is_geometry_closed(item).map_err(|e| { | |
| datafusion_common::error::DataFusionError::Execution(format!( | |
| "Failed to check if geometry is closed: {e}" | |
| )) | |
| }) | |
| is_geometry_closed(item) |
.map_err() is usually only needed to convert the Error type of the result from one error type to DataFusionError (e.g the is_geometry_empty returns Result<_, SedonaGeometryError>, so we need to call it for that below) In this case, both is_geometry_closed and invoke_scalar return the DataFusionError, so there's no need to convert
| ("LINESTRING(0 0, 0 1, 1 1, 0 0)", True), | ||
| ("MULTILINESTRING((0 0, 0 1, 1 1, 0 0),(0 0, 1 1))", False), | ||
| ("POINT(0 0)", True), | ||
| ("MULTIPOINT((0 0), (1 1))", True), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ("MULTIPOINT((0 0), (1 1))", True), | |
| ("MULTIPOINT((0 0), (1 1))", True), | |
| ("POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))", True), | |
| ("GEOMETRYCOLLECTION (LINESTRING(0 0, 0 1, 1 1, 0 0))", True), | |
| ("GEOMETRYCOLLECTION (LINESTRING(0 0, 0 1, 1 1, 0 0), LINESTRING(0 0, 1 1))", False), | |
| ("POINT EMPTY", False), | |
| ("LINESTRING EMPTY", False), | |
| ("POLYGON EMPTY", False), | |
| ("MULTIPOINT EMPTY", False), | |
| ("MULTILINESTRING EMPTY", False), | |
| ("MULTIPOLYGON EMPTY", False), | |
| ("GEOMETRYCOLLECTION EMPTY", False), | |
| ("GEOMETRYCOLLECTION (LINESTRING EMPTY)", False), |
Since we're implementing more of this logic manually, we should test more comprehensively. Here are some more cases I suggest we add.
- One non-empty geometry for each of the 7 types
- GeometryCollection with a closed linestring and an open linestring (False)
- Empty geometry for each of the 7 types
- Non-empty GeometryCollection containing only empties
| }) | ||
| } | ||
|
|
||
| fn is_geometry_closed<G: GeometryTrait<T = f64>>(item: G) -> Result<bool> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| fn is_geometry_closed<G: GeometryTrait<T = f64>>(item: G) -> Result<bool> { | |
| fn is_geometry_closed<G: GeometryTrait<T = f64>>(item: G) -> Result<bool> { | |
| if is_geometry_empty(&item).map_err(|e| { | |
| datafusion_common::error::DataFusionError::Execution(format!( | |
| "Failed to check if geometry is empty: {e}" | |
| )) | |
| })? { | |
| return Ok(false); | |
| } |
I just queried PostGIS locally, it seems like it returns false for any empty geometry, so we should use the Sedona's is_geometry_empty function. The import for it is below. Note: I originally placed this inside of the final match arm below, but it turns out geo's is_closed() methods return true for empty linestrings (which is not what we want), so we need to have this check above the match statement. Otherwise, we'll return the wrong result for empty linestring, multilinestring, and geomcol.
use sedona_geometry::is_empty::is_geometry_empty;There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Abeeujah I think there's something else that failed before you pushed that last commit.
error[E0275]: overflow evaluating the requirement `G: GeometryTrait`
|
= help: consider increasing the recursion limit by adding a `#![recursion_limit = "256"]` attribute to your crate (`sedona_functions`)
= note: required for `<... as GeometryCollectionTrait>::GeometryType<'_>` to implement `ToGeoGeometry<f64>`
= note: the full name for the type has been written to '/home/runner/work/sedona-db/sedona-db/target/release/deps/sedona_functions-440740f413ee93de.long-type-15992899749200598173.txt'
= note: consider using `--verbose` to print the full type name to the console
from this action: https://github.com/apache/sedona-db/actions/runs/18559935149/job/52919073154?pr=219
I took a look, and (though I wasn't able to reproduce it locally, I think my suggestion below will fix it. Fingers crossed 🤞
| is_geometry_closed(item) | ||
| } | ||
|
|
||
| fn is_geometry_closed<G: GeometryTrait<T = f64>>(item: G) -> Result<bool> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| fn is_geometry_closed<G: GeometryTrait<T = f64>>(item: G) -> Result<bool> { | |
| fn is_geometry_closed(item: &Wkb) -> Result<bool> { |
I think this should fix the trait recursion problem.
| .to_geometry_collection() | ||
| .iter() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| .to_geometry_collection() | |
| .iter() | |
| .geometries() |
optional nit: If you want, you can add use geo_traits::GeometryCollectionTrait to use this code instead. This change would've fixed one of the arms on its own, but I didn't find a way to avoid the to_line_string() and to_multi_line_string() calls (I didn't try for too long), so we need the fix I mentioned in the above comment.
edit: Actually, I prefer we do this because it's cheaper. to_geometry_collection creates a new vector to push everything to, while .geometries() iterates through the existing object.
petern48
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Just a minor nit.
|
|
||
| fn is_geometry_closed<G: GeometryTrait<T = f64>>(item: G) -> Result<bool> { | ||
| fn is_geometry_closed(item: &Wkb) -> Result<bool> { | ||
| use geo_traits::GeometryCollectionTrait; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional nit: We can move this import up with the rest
edit: actually it's whatever to me tbh (maybe dewey will disagree). i know it can be annoying to have to go through another CI run. Let's at least see if CI passes on this run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😅 It's fine on my end, Just felt like keeping the import there instead of atop with the other imports, but yeah, I'm fine with moving it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be at the top of the file (we do have some lingering accidental precedent where we forgot to do this in other places 😬 )
paleolimbot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! (And thank you to Peter for the review!)
If you don't mind taking care of the import that would be great but it's not strictly harmful either 🙂
|
|
||
| fn is_geometry_closed<G: GeometryTrait<T = f64>>(item: G) -> Result<bool> { | ||
| fn is_geometry_closed(item: &Wkb) -> Result<bool> { | ||
| use geo_traits::GeometryCollectionTrait; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be at the top of the file (we do have some lingering accidental precedent where we forgot to do this in other places 😬 )
|
@paleolimbot I think the other placements might best be addressed by a follow up PR, to reduce the footprint of this PR, please do let me know what you think. |
paleolimbot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Thank you!
|
Great PR! Thanks for handling all the feedback. |
closes #215