-
Notifications
You must be signed in to change notification settings - Fork 227
[extensions][io] Add read_shapefile() implementation. #554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
mloskot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have only skimmed the code, but it is neat and a good idea.
|
Thanks! I am ok with merging. Could you please explain why you need the third case in your description (range of multi-linestrings)? Is not part of case 2? |
It's because shp file contains a number of records of the same type. In case of linear geometries these records are of type polyline. A polyline can contain some number of parts. So if one wants to represent a range of linestrings then in the shp these will be polylines containing 1 part each. However they could have more parts in which case they would be multi-linestrings. So I decided to leave the decision to the user of the library. I forgot to add range of multi-points which can be expressed in shp file directly.
EDIT: The above is invalid shp polygon can contain any number of outer and inner rings so one have to create polygons based on the ordering. This is not done right now. There is also multi-patch which AFAIU is 3d mesh so this is not supported. |
|
I added link to technical description in case you wanted to check it out. |
|
It seems that shp polygon may contain any number of outer and inner rings in unspecified order. This means that my previous comment about multi-polygons is invalid. This also means that a polygon should be loaded, the orientation of all of the rings checked and inner rings assigned to outer rings. Side question: @vissarion we know that currently geographic area strategy may be innacurate for small polygons (#541) but is it only the value that is inaccurate or is the sign innacurate too? I'm asking because area is used to check the order of points. Can we rely on area in this case even for small polygons? |
|
Cool! |
|
@vissarion I already know that the result of area can be negative, e.g for this polygon: from Germany map (https://www.statsilk.com/maps/download-free-shapefile-maps) I get the following areas: What is interesting is that perimeters look close: Area state values at the end of calculation with andoyer: One is spherical term and the other ellipsoidal correction. It seems that the error is caused by the fact that they are small and close to each other. Are we sure that the spherical calculation is fully consistent with ellipsoidal correction calculation? E.g. there are 2 versions of thomas (series expansion) one using reduced latitudes and one using raw latitudes and both have different quirks. The result in edge cases may heavily depend on what's exactly calculated in a formula. So my guess is that the spherical calculation in ellipsoidal Are we sure that everything is ok with area? |
|
@vissarion @barendgehrels Because area is not reliable for checking the order of points in a ring I'm thinking about replacing the test based on area with a different one, based on sum of exterior angles. The sum of exterior angles of a ring should be I've implemented a simple geographic version and it solves the problematic case mentioned above. What do you think? EDIT: it would probably be a separate algorithm with separate strategies, cartesian based on area and non-cartesian based on azimuths/exterior angles. These strategies would be passed into |
|
@awulkiew for your example the area algorithm first computes the azimuths (included the series method of #500) then the spherical terms computed (using azimuths). Actually, spherical terms are areas of the trapezoid where These are the values for our example above (first 3 columns are terms for each one of the segments of the triangle and the last is the sum). That is, for small segments the difference in the areas are small and then an error in the 6-th decimal digit in the azimuth computation for andoyer leads to an error in the 10-th digit for the spherical term but the sum is of order |
|
@awulkiew regarding ordering and area I also think using area is an overkill and what you suggest sounds great! For cartesian both ordering and area computations rely on determinants (actually the first only needs the sign so there is also space for optimizations here) but for spherical and geographic things are totally different so what you say is very natural. |
|
I could not review this weekend, now planned for Wednesday |
|
No need to hurry. Before merging this PR I'd like to propose another PR: point-order check not based on area in non-cartesian coordinate systems. So this would be a separate algorithm (probably detail for now) with separate strategies which could be used here but also in correct, read_wkt, set operations and buffer. |
49fa115 to
b967e1c
Compare
|
I've implemented an algorithm which solves the issue above: #561 |
b967e1c to
592b774
Compare
|
Rings order is now checked and interior rings assigned to exterior rings. I added IO strategies for this. These are umbrella strategies without This PR now depends on: #561 |
|
Started review - compilation errors and warnings marked in the review (easy to fix), thanks for the example, easy to implement and runs fine. Will continue later. |

This PR adds simple
read_shapefile()IO reading*.shppassed as STL IStream and filling a mutable Range of Geometries suitable for it:It doesn't depend on external libraries, it only loads geometrical data from
*.shpfile so it's lightweight.See: shapefile technical description.
Example:
Result: