Realistic input coordinate precision #2949
Replies: 6 comments 11 replies
From my point of view, since I'm dealing with real-world data (OSM), I'm totally fine with some "rounding".
Maybe there is a place for a turf-lint? Though whether it's an advisory debugging tool only or a runtime input-data fixer, I'm not sure. The spec is probably vague enough in a few areas that it might be tough to nail down the latter. There is @placemarkio/check-geojson, though it specifically doesn't look at precision, or at ring winding, which is a pretty common issue.
The GeoJSON RFC (7946) suggests 6 digits of precision, which conveniently also fits in a float32. That doesn't help us in JavaScript runtimes, but it would be meaningful for people interoperating with another language.
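For reference, rounding to that suggested precision is nearly a one-liner. A minimal sketch (the `roundCoord` helper below is hypothetical; `@turf/truncate` offers similar behavior via its `precision` option):

```javascript
// Hypothetical helper: round a GeoJSON position to a fixed number of
// decimal places (RFC 7946 suggests 6, roughly 0.1 m of longitude at
// the equator).
function roundCoord(position, digits = 6) {
  const factor = 10 ** digits;
  return position.map((n) => Math.round(n * factor) / factor);
}

console.log(roundCoord([13.4050123456, 52.5200987654]));
// [ 13.405012, 52.520099 ]
```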
Please correct me if I'm wrong, but I think it is fair to assume that the main use cases for a library like turf are visualizing real-world data and running lightweight analytics. I could very easily be wrong about the level of analytics people run, mind you. I'm also working on the assumption that real-world data is messy, and while it's 100% reasonable to assume spec compliance, assuming anything beyond that (like no repeated points or a certain spacing between points) is not reasonable. So with those assumptions, my 2c is:
As a testing aside, there is a mixture of exact-precision tests (which definitely shouldn't exist) and approximate-precision tests. To @mfedderly's point about precision, I wonder if it would be a good idea to create a small set of helper assertion wrappers that test for approximate equality to an agreed number of significant figures, then roll them in over time.
Apologies for the delay - got busy but didn't want to drop the conversation. And apologies also for the long comment. I've been ruminating on this topic and thought that posing some opinionated takes to push the envelope and prompt discussion might help. I'm not a GIS super-expert, so this is more the perspective of an enthusiastic amateur. Some of the thoughts below are not necessarily the best for my personal turf use; they come more from where turf might fit into the ecosystem. So my opinionated stance is that turf is:
With this background, it points to a solution of:
The obvious trade-off is precision. But the big opinion here is that if you want high precision, use GEOS and/or research the use case; don't expect a JS library, albeit an awesome one, to give you that. So what might some implications of this philosophy be?

**Promise Minimum Precision**

**Apply Cleaning**

**Avoid Throws - Assume, Null, Warn**

**Conclusion**

There are definitely holes in this - something that springs to mind is truncation → coincidence → failed topology that might not have failed without the truncation. I'm sure there are other gaps in my thinking too. Also in some cases (clipping...) it might be a decent philosophical change and therefore not be viable. So there are some opinions that hopefully let folk agree or disagree.
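On the "Apply Cleaning" point, a minimal sketch of what an explicit, opt-in cleaning pass could look like (the helpers below are hypothetical; the real `@turf/truncate` and `@turf/clean-coords` packages cover similar ground):

```javascript
// Truncate each position to the agreed precision, then drop consecutive
// duplicates that truncation may have created.
function truncatePosition(position, precision = 6) {
  const f = 10 ** precision;
  return position.map((n) => Math.round(n * f) / f);
}

function cleanLine(coords, precision = 6) {
  const out = [];
  for (const position of coords) {
    const p = truncatePosition(position, precision);
    const last = out[out.length - 1];
    if (!last || last[0] !== p[0] || last[1] !== p[1]) out.push(p);
  }
  return out;
}

// Two points ~3 cm apart both truncate to [100, 0], so one is dropped:
console.log(cleanLine([[100.0000004, 0], [100.0000001, 0]]));
// [ [ 100, 0 ] ]
```

This also illustrates the hole noted above: truncation can make nearby points coincident, which is exactly the kind of degeneracy a downstream topology operation may then trip over.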
I really appreciate the detailed writeup here! For what it's worth, I don't consider myself a deep GIS expert; I originally started contributing to get the repo publishing reliably again. I largely agree with your writeup, but want to push back on a few points.

**Performance:** Although we won't aim for performance as fast as optimized native code, we still care about performance. We can still try to use the fastest algorithms and write performant code along the way. @turf/union changed its underlying implementation for correctness fixes, which introduced a large performance regression; I'm working on a fork of polyclip-ts that swaps bignumber.js back to JavaScript float64s in order to fix it. You also have access to many cores - even in a browser you can use worker threads to run more expensive computations without blocking the user.

**Precision:** We are in JavaScript, and the language limits us by only having native float64s. Luckily, the precision of this type is much higher than any physical scale we intend to deal with: even 8 digits after the decimal gets us to ~1mm precision, and several fewer digits is still more precise than you'd be able to see on a Leaflet visualization of the data at zoom 18. I do think we need to add official documentation about the precision limitations of float64s, the realistic scale of the errors, and a policy for rejecting issues that require precision beyond what we support. I'm on board with publicly stating that we officially support 6 decimal places, and perhaps we pair that with truncating test fixture output at 8 decimal places so we have some breathing room. We may also want to specifically call out packages that are sensitive to precision issues in their own READMEs (and therefore the public docs pages).

**Data cleaning:** I strongly prefer that individual methods do not clean their input data. It can be very expensive to do a clone-and-truncate operation on large inputs, and the cleaning won't even necessarily be required, depending on the input data itself. If someone does have data that requires truncation before operating on it, I think it is a reasonable tradeoff to make them do so manually in order to preserve performance for everyone else. To that end, we may want to update the TypeScript definitions to take in deeply readonly arguments and return mutable results (assuming we're generating entirely new objects).

**Throwing vs warning:** If something is going wrong, I'd much rather have that loudly declared than quietly patched over. We already try to throw errors at runtime when input arguments are not valid, which is preferable to getting a less obvious error when the operation itself fails. I think someone is very unlikely to find a

Happy to hear thoughts from others on this as well before we start taking action on the precision thing. I do think we're overall limited by how many contributors we have and how much time those contributors can spend on the project. The polyclip-ts regression is nearly a year old, and I'm just now getting around to working on it. We're currently in a relatively active point of development, but there are really only 2 people with maintainer status at the moment. I'm reluctant to force-merge my own PRs without an approval from another person, so we kind of need both of us to have time to make any changes.
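To put rough numbers on the precision scales above (my own back-of-envelope arithmetic, assuming ~40,075 km of equatorial circumference):

```javascript
// Roughly how much ground distance one unit in the Nth decimal place
// of a degree represents at the equator.
const METERS_PER_DEGREE = 40_075_000 / 360; // ≈ 111,319 m

for (const digits of [6, 7, 8]) {
  const mm = METERS_PER_DEGREE * 10 ** -digits * 1000;
  console.log(`${digits} decimal places ≈ ${mm.toFixed(2)} mm`);
}
// 6 decimal places ≈ 111.32 mm
// 7 decimal places ≈ 11.13 mm
// 8 decimal places ≈ 1.11 mm
```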
There have been a few bugs recently that seem to relate to extremely small distances being represented as GeoJSON. One recent case involved a line segment 0.6 nanometers long, which seems to leave the underlying JavaScript math functions no room to move, and degenerate elements creep in.
I'm wondering if we should start pushing back a little on error cases that go beyond six or seven decimal places of precision (approx 10 cm), at least in the first instance - start taking the "geo" in GeoJSON at face value. An expectation has possibly developed that Turf should be fast and robust and handle an arbitrary number of decimal places, and that's probably not something we can deliver on with JS.
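To put numbers on the "no room to move" point (my own arithmetic, not from the bug reports): near a longitude of 100°, adjacent float64 values are 2^-46 ≈ 1.4e-14 degrees apart, which is about 1.6 nm of ground distance at the equator, so a 0.6 nm segment can fall below the resolution of the number type itself:

```javascript
// A 0.6 nm offset expressed in degrees (~0.6e-9 m over ~111,319 m per
// degree) is smaller than half the float64 spacing near 100, so adding
// it changes nothing.
const lon = 100;
const offset = 5.4e-15; // ≈ 0.6 nm in degrees at the equator

console.log(lon + offset === lon); // true: both "endpoints" are identical
console.log(lon + 2 ** -46 > lon); // true: the smallest representable step
```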
Tagging a few people recently involved with issues along those lines. Please weigh in with your thoughts 🙏
@SimplyPancake @HarelM @bratter @mfedderly @JamesLMilner