-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Throw an exception when potentially losing precision in FDBSCAN-DenseBox #560
Conversation
It just occurred to me that it may be possible to propely fix this. As long as the cell box is expanded to contain the points inside it, this should produce correct result. The question is: how does one guarantee that the (rounded) But I wonder if this is the only place. We do some similar calculations in determining cell indices and such. Maybe it's just easier to warn. |
Currently I don't have a better idea, so lets go with the current version. |
Please review. |
As currently implemented you would always get the runtime error ArborX/src/details/ArborX_Exception.hpp Lines 33 to 38 in 6aca3b8
|
238ccfe
to
19d681a
Compare
@dalg24 Can we merge this PR in its current form? It's not robust, but improves status quo. |
19d681a
to
fb5f8f5
Compare
Please review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Must fix the style
fb5f8f5
to
95595f7
Compare
For some problems and eps, FDBSCAN-DenseBox may suffer from severe loss of precision when computing cellBox(). This may result in wrong DBSCAN results. Initially, the problem was observed on a full NGSIM dataset: ``` $ ./ArborX_DBSCAN.exe --binary \ --filename /data/data_and_logs/2021_02_20_dbscan_datasets/mustafa_2019/ngsim.arborx \ --print-dbscan-timers --core-min-size 2 --eps 1.0 --max-num-points 90000 \ --impl fdbscan-densebox --verify <...> Core point is marked as noise: 41888 [-1] Core point is marked as noise: 17027 [-1] Core point is marked as noise: 43104 [-1] ``` Tracking down the issue revealed that a point 2279 (part of a dense cell) was finding point 10369 (not part of a dense cell), but the reverse was not true (i.e., 10369 was not finding the dense cell that 2279 was in). Studying the dense boxes, I observed that the problematic box had bounds [[6452399.000000,1872182.375000,0.000000], [6452399.000000,1872183.000000,0.577350]] Note the same x coordinate for both min and max corner. The h in this run was ~0.57. So it completely got lost. Even Y coordinate lost some precision, though not as much.
95595f7
to
4d83a65
Compare
For some problems and eps, FDBSCAN-DenseBox may suffer from severe loss
of precision when computing cellBox(). This may result in wrong DBSCAN
results.
Initially, the problem was observed on a full NGSIM dataset:
Tracking down the issue revealed that a point 2279 (part of a dense
cell) was finding point 10369 (not part of a dense cell), but the
reverse was not true (i.e., 10369 was not finding the dense cell that
2279 was in). Studying the dense boxes, I observed that the problematic
box had bounds
Note the same x coordinate for both min and max corner. The h in this
run was ~0.57. So it completely got lost. Even Y coordinate lost some
precision, though not as much.
I'm not sure how robust this is. I also don't know what the best way to indicate this to user. My issue with using
ARBORX_ASSERT
is that when a user compiles ArborX in release mode, this will never get noticed. On the other hand, hardcoding aprintf
statement contradicts xSDK policy on having no hardcoded printouts.