PCL point normals not always calculated correctly in sparse clouds #2403
Hey. Here's a couple of comments/ideas.
The points inside the highlighted region appear to be distributed as lines. There's a chance that during normal estimation your neighborhood search parameters only allow it to establish a neighborhood which is approximately a line, yielding unreliable normals. Can you randomly select a point in the region you highlighted and mark the points that are being used to estimate its normal?
There's a long-standing known problem caused by precision issues. You can read more about it in #560 and #1407. I also believe we pushed a couple of PRs to master which affect covariance and eigenvalue/vector computation. You should definitely try the master branch if you can, just to see whether you observe different behavior. Give it a try and report results. |
Thanks very much for the help and the suggestions Sergio! I really appreciate it. First off, the point selection - this has been something I've worried about. So I did spend some time modifying some of my code to return the selected indices and colour them, which for this specific cloud yields: And the result is more or less the same, regardless of the input parameters. And in the picture with the normals shown (the one where the cloud is green instead of white) I used a search radius of 0.3 m which, as you can see from the image just above, should give more than enough points across multiple lines to produce an accurate normal. |
As for the second suggestion... Issue #560 talks about the issue and at the bottom, gcasey says that PR #1407 should fix it. But it looks like that PR hasn't been merged into the main branch? And appears to be failing a build check? It also looks like both you and Logan Byers proposed solutions to the problem in the comments of PR #1407, and that Logan found the two-pass solution wasn't robust/accurate enough for him. So I'll try his solution and see if I can get it to work... For the record, I did (re-)install PCL 1.8.1 from source (the main trunk) less than two months ago, so it is relatively new. |
That's our baseline then, and I agree that it should work to produce a decent normal.
You should try to compute things manually now for a single point. The normal is the eigenvector associated with the lowest eigenvalue of the covariance matrix. The most robust version of this algorithm I can think of resorts to the following steps, performed separately in double precision:
Try to reproduce these steps with the highest precision available and validate the results with other libraries whenever possible, e.g., once you have the covariance matrix (3x3) you can see what numpy outputs as eigenvalues/eigenvectors vs what the library is doing. Try to validate as many steps as possible. If you reach a stable setup for the normals then report back and we'll start digging into where things are failing in the normal estimation algorithm.
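For reference, here is a minimal sketch of what that manual check could look like, done entirely in double precision (this is an illustration rather than code from the thread, and it assumes the neighborhood has already been extracted into its own cloud):

```cpp
#include <iostream>
#include <Eigen/Eigenvalues>
#include <pcl/common/centroid.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

// Compute the covariance of a neighborhood in double precision and take the
// eigenvector of the smallest eigenvalue as the normal. The 3x3 covariance
// matrix can be cross-checked against numpy.linalg.eigh.
void
checkNormalManually (const pcl::PointCloud<pcl::PointXYZ> &neighborhood)
{
  Eigen::Matrix3d covariance;
  Eigen::Vector4d centroid;
  pcl::computeMeanAndCovarianceMatrix (neighborhood, covariance, centroid);

  Eigen::SelfAdjointEigenSolver<Eigen::Matrix3d> solver (covariance);
  // Eigen sorts eigenvalues in increasing order, so column 0 is the normal direction.
  const Eigen::Vector3d normal = solver.eigenvectors ().col (0);

  std::cout << "covariance:\n" << covariance << "\n"
            << "eigenvalues: " << solver.eigenvalues ().transpose () << "\n"
            << "normal: " << normal.transpose () << std::endl;
}
```
|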
OK, so I kinda raced ahead and went for a full-blown solution. I worked backwards down the chain of functions and figured out which ones are used during the normal calculation process, copied them into a modified version of normal_3d_omp.h/hpp and then just changed all the relevant instances of "float" to "double". The result is REALLY hacky but it works: This looks a lot better than what I was getting before, and produces normals that look exactly as I would expect. This is using a search radius of 0.3 m. |
I can go through and try to validate each stage individually like you suggested, but I think the root cause is definitely the use of floats rather than doubles. So is there value in trying to verify each stage in the process or would it be better to skip straight to a solution? |
We have functions in PCL that support both single and double precision through a template parameter, e.g.
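For instance, pcl::computeMeanAndCovarianceMatrix is templated on the scalar type, which is deduced from the matrix/vector arguments (a minimal sketch; the cloud setup is assumed):

```cpp
#include <pcl/common/centroid.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

// The same free function runs in single or double precision depending on the
// Eigen types passed in.
void
covarianceBothPrecisions (const pcl::PointCloud<pcl::PointXYZ> &cloud)
{
  Eigen::Matrix3f cov_f; Eigen::Vector4f centroid_f;
  pcl::computeMeanAndCovarianceMatrix (cloud, cov_f, centroid_f);  // single precision

  Eigen::Matrix3d cov_d; Eigen::Vector4d centroid_d;
  pcl::computeMeanAndCovarianceMatrix (cloud, cov_d, centroid_d);  // double precision
}
```
|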
I guess that'd be the most straightforward way to do it; then it's mostly automatic. That solution would essentially involve adding a new declaration of the important functions, at the very least the following:
Which I assume would also require creating new definitions of the class variables. In addition, what options would the user have to force the use of the
How does the underlying code know to use floats or doubles? The effect this bug/issue can have is pretty dramatic (as I'll show in a post below) so I feel like it'd be important to make the user aware of the difference and give them the option to force one case or the other. |
So I've been testing matching sequences of 35 clouds, and just incrementally matching each of the 35 back to the first cloud. This bug/issue has a pretty major impact on how well the clouds are matched in some environments. So in all the pictures below, the left hand image is the match BEFORE I fixed this issue, and the right hand image is AFTER fixing it with my hacked code. In some areas it doesn't make much difference, either because the match was already pretty good or because there isn't much that can improve it: (NOTE: The pictures are actually pretty big, so click on them for a larger version, rather than squinting at what you see below) But in other areas it is the difference between a good match and a useless one: So for my application, using doubles rather than floats makes a world of difference. |
Tentatively, I'd propose the following:
|
We should go straight for the solution. Your time is precious and we should use it efficiently.
The user will need to explicitly request double precision through template parameters if they desire it, as in planeWithPlaneIntersection<double>(...). @taketwo we should consider actually defaulting to double, since we often have reports of bad normals. The user who wants speed can then explicitly take the risk of using single fp precision. |
I'd like to clarify that I propose template parameter-based selection of precision class only for free functions. For the
Do we understand in which cases single is okay, and in which double is needed? Does it have to do with point cloud extents? Maybe we can have a heuristic that selects precision automatically if it was not set explicitly. |
That's the current intuition. Whenever points stray away from 0 too much, we start having problems.
Hmm. I think I'd rather keep it simple for now. But to provide an informed opinion I need to have an intuition of the time penalty we're incurring by jumping to double precision: something that reports a time metric per normal point and per local neighborhood size. |
Well, if this is the only reason, then we should just de-mean point neighborhoods before computing covariance matrix. Otherwise, what if points stray even further away from 0? Provide quadruple precision implementation? |
@msy22 can you do us a favor and run the PCD from the linked issue through your modified code? If you get good results then we can close that issue in favor of this one. |
This brings in the second parallel discussion that happened in #1407, regarding a single vs double pass to compute the mean and the covariance matrix, specifically the precision errors incurred by the former. @msy22 you might want to try things first with the double-pass mean and covariance estimation. Please take into account this proposed approach. |
Ok, first things first - I've downloaded that .pcd file from issue #2372 and put it through the unmodified and modified versions of my normal estimation code (as per @taketwo's request). I've included some quick results below. For this test I used a point search radius of 2.0 meters since the point distribution is much sparser than my clouds, and has an odd banded structure (I'm guessing this was taken using aerial LiDAR). For an indication of scale, here's a point with a 2 m radius highlighted in red: Now as for the accuracy of point normals, here's a comparison. On the left is unmodified (i.e. floats) and on the right is modified (i.e. doubles). As before, click on the pictures for a bigger version. However, it can still be a little hard to tell how much of an effect using doubles has had, so here's the same comparison as above, but zoomed in on the roof of the building so you can see the difference in normals a bit more clearly. There are a couple of things that stand out to me about this:
|
So in short, we have three possible causes of poor normal calculations (assuming no other problems come to light):
My results suggest that 1 is a significant contributing factor, but the results from issue #2372 suggest that resolving problem 1 may not be enough, and that problem 2 also needs to be resolved. And as @SergioRAgostinho has already pointed out, the method of doing a double pass may also improve things. Without knowing exactly which of the above problems (in any combination) we need to solve in order to get the best normal calculation, we can't confidently provide a proper solution. So I'll implement a de-meaning function and double-pass calculation (following @SergioRAgostinho's suggestion) then put my clouds and the xyz.pcd cloud through everything to see which combination of fixes gives the best result. Now at this stage I don't have any method of obtaining a "ground truth" for the normals... so aside from eye-balling the images I don't have any way of precisely (in degrees or some other metric) comparing results. If either of you have any suggestions in this area, I'd be glad to hear them. Otherwise, depending on how some meetings go over the next two weeks, I may be able to convince my supervisors to let me spend more time on this problem, and I may be able to do something about obtaining a ground truth... |
@msy22 thanks for giving it a try and also for the detailed analysis.
What would be interesting to see is whether resolving 2 alone is enough. In other words, is single precision sufficient given that the point cloud is at the origin? If that is the case, I have an idea how we can get away even with a single-pass single-precision algorithm. We don't need to de-mean the point cloud exactly, we only need to bring it close enough to the origin. In the context of normal estimation, we are typically dealing with compact groups of neighboring points, as discovered by a spatial search backend. Thus we can approximate the centroid of such a neighborhood with the query point used for search, or, equivalently, the point at which the normal is being estimated. (I would even speculate that any point in the neighborhood will be good enough.) This is trivial to implement and amounts to one additional SSE instruction per point (to subtract the "central" point), and then one more to adjust the computed centroid. What do you think? Any reasons why this would not work?
I'd simply go with synthetic planar point clouds translated away from the origin. If time permits, we can add a unit test based on the real data. For example, crop the ground in one of your LIDAR point clouds, fit a plane, and test whether the resulting normals are close enough to the plane normal. By the way, #1407 includes an additional unit test which can be reused.
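A rough sketch of what such a synthetic test could look like (the grid size, offset and threshold here are arbitrary illustrations, not the actual unit test): sample a z = const plane far from the origin and check that every estimated normal is close to (0, 0, 1).

```cpp
#include <cmath>
#include <pcl/features/normal_3d.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/search/kdtree.h>

bool
planarNormalsLookSane (float offset = 1000.0f)
{
  // 50 x 50 grid with 5 cm spacing, lying on the plane z = offset.
  pcl::PointCloud<pcl::PointXYZ>::Ptr plane (new pcl::PointCloud<pcl::PointXYZ>);
  for (int i = 0; i < 50; ++i)
    for (int j = 0; j < 50; ++j)
      plane->push_back (pcl::PointXYZ (offset + 0.05f * i, offset + 0.05f * j, offset));

  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree (new pcl::search::KdTree<pcl::PointXYZ>);
  pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
  ne.setInputCloud (plane);
  ne.setSearchMethod (tree);
  ne.setRadiusSearch (0.3);
  pcl::PointCloud<pcl::Normal> normals;
  ne.compute (normals);

  // Every normal should be (anti-)parallel to the z axis; allow a few degrees of error.
  for (const auto &n : normals)
    if (std::abs (n.normal_z) < 0.99f)
      return false;
  return true;
}
```
|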
Thanks @taketwo! I'll address that shortly, although I don't have much experience with the PCL code base and virtually none with SSE, so I may be less help than you think haha. But before I get too behind, here are the results of my testing. Method: To de-mean my point clouds, I calculated the centroid and applied a transformation like so:
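(The original snippet isn't reproduced here; the following is a minimal sketch of that kind of de-meaning step using PCL's free functions, assuming a pcl::PointXYZ cloud.)

```cpp
#include <pcl/common/centroid.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

// Compute the centroid and shift the whole cloud so that the centroid sits at the origin.
void
demeanWholeCloud (const pcl::PointCloud<pcl::PointXYZ> &in, pcl::PointCloud<pcl::PointXYZ> &out)
{
  Eigen::Vector4f centroid;
  pcl::compute3DCentroid (in, centroid);
  pcl::demeanPointCloud (in, centroid, out);
}
```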
Although I've since realized there's a ... To implement the double-pass covariance calculation method, I used a slightly modified version of the code snippet @SergioRAgostinho posted:
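(Again a sketch rather than the exact snippet from the thread: a two-pass computation can be pieced together from PCL's existing free functions, where the first pass finds the centroid and the second accumulates the covariance around it.)

```cpp
#include <pcl/common/centroid.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

// Pass 1: centroid. Pass 2: covariance accumulated around that centroid, which
// avoids the catastrophic cancellation of the single-pass E[x x^T] - mu mu^T form.
void
twoPassCovariance (const pcl::PointCloud<pcl::PointXYZ> &cloud,
                   Eigen::Matrix3f &covariance, Eigen::Vector4f &centroid)
{
  pcl::compute3DCentroid (cloud, centroid);
  pcl::computeCovarianceMatrixNormalized (cloud, centroid, covariance);
}
```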
To implement the double precision calculations I used the method I've already described (just a local copy-paste of a few functions redeclared with doubles instead of floats). Results: Here's the visual result for the whole cloud: And again zoomed in on the roof, which seems to be the clearest place in the cloud for inspecting normals. Conclusions:
|
Just to add to the above, de-meaning my point cloud solves the problem as well; the image below shows the unmodified cloud on the left and the de-meaned (but still calculated using single/float precision) cloud on the right. For reference, the centroid I subtract from this cloud is XYZ = 583.75, 433.07, 16.7171. Also note that I've been de-meaning my point clouds before putting them into the visualizer for a while now, since the viewer is bloody impossible to control well if the cloud is far from the origin. But until now I'd never de-meaned the cloud before calculating the normals. So to answer your question @taketwo:
Yes, it looks like it might be. I've run my code with both de-meaning and double precision, and visually I can't tell the difference between the result and just de-meaning. So there's probably no way to tell the difference between the two (if there is any) without a mathematical proof or a metric for measuring normal accuracy. As for whether it'd be enough to de-mean the whole cloud or each cluster individually, that's another matter. And the synthetic planar clouds are a good suggestion. It'd be easy enough to create a basic environment and the associated points using the method shown in the PCL interactive ICP tutorial. It'd certainly be a nice way to more thoroughly validate everything we've discussed, and a good way to test other parameters that might affect normal calculation. |
If you feel like undergoing a more challenging synthetic example, computing normals from points densely sampled from a spherical surface should also do the trick.
@taketwo might have abstracted you from SSE details with #2247. Nevertheless, if you hit a problem, ask for opinions/options and we'll always chime in.
I would demean on a per cluster basis. (Assuming the demeaning operation to be practically inexpensive). On a side note: you have my praise @msy22 . Great analysis and presentation. We don't get that often so... take my kudos. |
What you have done looks correct. But the result is surprising. For the record, #1407 includes a different take on the two-pass algorithm. If you have time and interest, you can give it a try.
Exact de-meaning requires two passes and we don't want that. Approximate de-meaning (as per my suggestion) yields better and better approximations the smaller the cloud is. So it's best to perform it at the neighborhood cluster level.
I did not mean that we need to code SSE explicitly; hopefully the compiler will do the job. I just wanted to stress that it should map to a single operation, thus won't be expensive. Specifically, in this code: pcl/common/include/pcl/common/impl/centroid.hpp, lines 564 to 577 (commit 9075066)
Just compute "approximately" de-meaned point: const Eigen::Vector4f point = cloud[*iIt].getVector4fMap() - cloud[indices[0]].getVector4fMap(); And use it further to update |
That's a good suggestion. If I'm going to do this properly I should start with simple primitives like a cube, plane or sphere and work my way up through more complex objects to real data. But that will all have to wait until after this issue is resolved.
Thanks! I appreciate that!
Great, thanks for the link! And I'll do my best to get it working.
That's why I was concerned that maybe I hadn't implemented it properly. I'll have a quick crack at the other method before moving on. Because the two-pass method was central to that PR, it would be good to be able to definitively rule that method in or out; otherwise it's a loose thread.
As for the subject of de-meaning the whole cloud vs individual clusters, would de-meaning just the whole cloud be faster and still provide the same benefit? I may end up editing this later today after I learn more, but my naive assumption is that if we de-mean each cluster that will result in more operations, depending on how it's implemented. Say for the sake of the argument that we de-mean each cluster around the point we're calculating the normal for, and that we're doing a K-nearest point search where K=10. We de-mean that point and its 10 nearest neighbors. Then we calculate the normal for the next point in the cloud, which is likely just the nearest neighbor of the first point. We repeat the same process and end up de-meaning basically the same cluster of points all over again. So we essentially end up de-meaning the same point multiple times in slightly different ways. My worry is that an implementation like this could result in orders of magnitude more calculations than simply de-meaning the whole cloud (i.e. each point once). In the meantime, I'll look at the alternative double-pass calculation. Then I'll fork the PCL repo, start familiarizing myself with the necessary code, and write up a plan/to-do list which I'll post here so you can both sanity check it. |
Potentially not. If your entire point cloud is very uniform, so that the centroid is at the origin but there are still points at far-away positions, you'll still end up with the same original problem. This is why "demeaning" needs to be done on a per-cluster basis.
You're correct. But the underlying intuition is that this operation is inexpensive. Of course we now need to benchmark the processing time to confirm this hypothesis. |
There are many factors involved and only benchmarking can show what is faster. But here is a speculative argument why "simply de-meaning the whole cloud" approach may be slower. For the sake of the argument, let's assume that we even know the centroid in advance. Now, note that we can not de-mean the point cloud in-place, which means that a new cloud of the same size has to be allocated. Then we need to make a pass through all points (read access), perform subtraction, and write back the results (write access). And then move on and compute covariance matrices for each cluster. So the overhead of "simple de-meaning" is an allocation and a read/write access to all points. Compared to that, per-cluster de-meaning needs a single pass and no additional memory. And we all know that memory operations are waaaay slower than arithmetics, even if we are lucky with cache hits. |
Aaaah, that's really valuable to know. Ok, it is less straightforward then. (For reference, only one third of my background is in software, and operation costs/code optimization is a gap in my knowledge I haven't had time to fill yet.)
OK, judging by PR #2247 it looks like you already have processes in place for testing the computation time of different bits of code. So then what I need to do is implement both solutions quickly and roughly so you two can then test it and prove with data which of the two options is faster. It looks like the best place to de-mean the whole point cloud would be somewhere in ... So what would be best for you two? Would it be better for me to modify a fork of the PCL code base so you can clone, compile and test that? Or would it be easier and faster if I do something like what I've done to get this far - copying the relevant functions into a custom version of |
Well, we have that one instance when we did extensive benchmarking. However, this is not an established process. I have created a new project for benchmarking and testing mean/covariance computation. It's largely based on the repository from #2247. It's just a skeleton, but it should get you started. In future there are a bunch of things that would need to be verified, e.g. whether the number of samples/iterations is adequate, whether we are indeed benchmarking what we want to benchmark (check assembly), making the size of neighborhoods parametrizable, etc.
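(For what it's worth, here is a bare-bones example of the kind of per-neighborhood timing such a project could produce, written against Google Benchmark; this is not the skeleton from the linked repository, and the synthetic neighborhood is a placeholder.)

```cpp
#include <benchmark/benchmark.h>
#include <pcl/common/centroid.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

// Time the mean/covariance computation on a synthetic neighborhood whose size
// is given by the benchmark argument.
static void BM_MeanAndCovariance (benchmark::State &state)
{
  pcl::PointCloud<pcl::PointXYZ> neighborhood;
  for (int i = 0; i < state.range (0); ++i)
    neighborhood.push_back (pcl::PointXYZ (0.01f * i, 0.02f * i, 0.0f));

  Eigen::Matrix3f covariance;
  Eigen::Vector4f centroid;
  for (auto _ : state)
    benchmark::DoNotOptimize (pcl::computeMeanAndCovarianceMatrix (neighborhood, covariance, centroid));
}
BENCHMARK (BM_MeanAndCovariance)->Arg (10)->Arg (50)->Arg (200);
BENCHMARK_MAIN ();
```
|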
Argh yeah, that is definitely not supposed to happen. That term specifically is being generated by this line in ... EDIT: Yeah, that's exactly what seems to be happening; when I compute that line with floats I get:
But when I calculate it with doubles I get:
Whether |
Here's a text file with the results from a dozen or so test points: covariance_testing. And the source code I used, just in case: Source
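// NOTE: this snippet relies on helpers defined elsewhere in the test program
// (LoadPointCloudFromFile, MoveCloudToCoordinate, print3x3Matrix, FILEPATH,
// and the PointT/PointCloudT typedefs) and assumes the usual std/pcl using-declarations.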
void
testCovariance(string cloud_filename, int idx, int k)
{
// Load one test point cloud
//cout << "Loading point clouds..." << endl;
PointCloudT::Ptr cloud_in (new PointCloudT);
cloud_in = LoadPointCloudFromFile(FILEPATH + cloud_filename);
Eigen::Vector4f n;
float curvature;
Eigen::Vector4f eVals;
// Find its k-nearest neighbours
std::vector<int> nn_indices(k);
std::vector<float> nn_dists(k);
pcl::search::KdTree<PointT>::Ptr kdtree (new pcl::search::KdTree<PointT>);
kdtree->setInputCloud(cloud_in);
kdtree->nearestKSearch(idx, k, nn_indices, nn_dists);
// Create a sub cloud containing just point i and its nearest neighbours
PointCloudT::Ptr tmp_cloud (new PointCloudT);
tmp_cloud->width = k + 1;
tmp_cloud->height = 1;
tmp_cloud->points.resize(k+1);
for (int j=0; j < nn_indices.size(); j++)
{
tmp_cloud->points[j].x = cloud_in->points[nn_indices[j]].x;
tmp_cloud->points[j].y = cloud_in->points[nn_indices[j]].y;
tmp_cloud->points[j].z = cloud_in->points[nn_indices[j]].z;
}
tmp_cloud->points[k].x = cloud_in->points[idx].x; // Also copy in the point we want the normal for at the end
tmp_cloud->points[k].y = cloud_in->points[idx].y;
tmp_cloud->points[k].z = cloud_in->points[idx].z;
// Calculate the covariance matrix and print it
cout << "\n### Testing covariance calculations of point " << idx
<< " at origin ### \n" << endl;
Eigen::Matrix3f covariance_float; Eigen::Vector4f centroid_float;
computeMeanAndCovarianceMatrix(*tmp_cloud, covariance_float, centroid_float);
cout << "Covariance matrix: " << endl;
print3x3Matrix(covariance_float);
cout << "Eigen values: " << endl;
pcl::eigen33(covariance_float, eVals);
Eigen::SelfAdjointEigenSolver<Eigen::Matrix3f> es(covariance_float);
cout << " eigen33 = [ " << eVals[0] << ", " << eVals[1] << ", " << eVals[2] << " ]" << endl;
cout << " EigenSolver = [ " << es.eigenvalues()[0] << ", "
<< es.eigenvalues()[1] << ", "
<< es.eigenvalues()[2] << " ]" << endl;
cout << "Point Normal: " << endl;
pcl::computePointNormal(*tmp_cloud, n, curvature);
cout << " n = [ " << n[0] << ", " << n[1] << ", " << n[2] << " ]" << endl;
// Shift cloud to be far from origin, re-calculate covariance matrix
cout << "\n### Testing covariance calculations of point " << idx
<< " at xyz=1000 (float) ### \n" << endl;
MoveCloudToCoordinate(tmp_cloud, 1000.0, 1000.0, 1000.0);
computeMeanAndCovarianceMatrix(*tmp_cloud, covariance_float, centroid_float);
cout << "Covariance matrix: " << endl;
print3x3Matrix(covariance_float);
cout << "Eigen values: " << endl;
pcl::eigen33(covariance_float, eVals);
es.compute(covariance_float);
cout << " eigen33 = [ " << eVals[0] << ", " << eVals[1] << ", " << eVals[2] << " ]" << endl;
cout << " EigenSolver = [ " << es.eigenvalues()[0] << ", "
<< es.eigenvalues()[1] << ", "
<< es.eigenvalues()[2] << " ]" << endl;
cout << "Point Normal: " << endl;
pcl::computePointNormal(*tmp_cloud, n, curvature);
cout << " n = [ " << n[0] << ", " << n[1] << ", " << n[2] << " ]" << endl;
cout << "\n### Testing covariance calculations of point " << idx
<< " at xyz=1000 (double) ### \n" << endl;
Eigen::Matrix3d covariance_double; Eigen::Vector4d centroid_double;
MoveCloudToCoordinate(tmp_cloud, 1000.0, 1000.0, 1000.0);
computeMeanAndCovarianceMatrix(*tmp_cloud, covariance_double, centroid_double);
cout << "Covariance matrix: " << endl;
print3x3Matrix(covariance_double);
cout << "Eigen values: " << endl;
Eigen::Vector3d eVals_double; // use a double-precision vector with the double covariance matrix
pcl::eigen33(covariance_double, eVals_double);
Eigen::SelfAdjointEigenSolver<Eigen::Matrix3d> es2(covariance_double);
cout << " eigen33 = [ " << eVals_double[0] << ", " << eVals_double[1] << ", " << eVals_double[2] << " ]" << endl;
cout << " EigenSolver = [ " << es2.eigenvalues()[0] << ", " // es2 (double), not the float solver es
<< es2.eigenvalues()[1] << ", "
<< es2.eigenvalues()[2] << " ]" << endl;
cout << "Point Normal: " << endl;
pcl::computePointNormal(*tmp_cloud, n, curvature);
cout << " n = [ " << n[0] << ", " << n[1] << ", " << n[2] << " ]" << endl;
} |
👍 The LoS (loss of significance) theory crippling cov matrices is confirmed. Now it's time to dig into eigen33. Thanks for the files and code snippet. I'll try to spend some hours over it during the weekend. |
Agreed. Unfortunately this raises an ugly question - the results I have indicate that the normals calculated when LoS occurs are highly random. But if something is going wrong inside eigen33 as well, then there may be two major sources of error at play here, and normals affected only by error from the covariance calculations may be more deterministic than what my results currently show. I'm going to re-write the code snippet I've got above and try to calculate the normal via eigen33 and via the EigenSolver, then report back. |
At this point I believe you need to assume there's something off in
|
Yeah... I'm gonna have to re-calculate my plots. No way around that unfortunately. So here's the new file of results: covariance_results, the output of which looks like this:
And the file I used to generate it: test_eigenvalue_calculation A bit of quick testing shows that the normal is still wrong, even when calculated via EigenSolver. Sometimes the EigenSolver result is better... sometimes it's worse... still seems pretty random but when I recompute my plots that will hopefully make things a bit clearer. I also tested all three versions of |
I wrote an email to Matt about this already but forgot to post here. eigen33 is working fine, despite needing some serious refactoring. It just produces rubbish and diverges from Eigen's SelfAdjointSolver when the data fed to it doesn't respect the required assumption of being a symmetric, positive semi-definite matrix. If that is respected, the results are similar to Eigen's SelfAdjointSolver as far as the eigenvalues are concerned. The eigenvectors are negated but still correct. The conclusion here is that bad covariance matrices are the root cause of the normals problem. So to finally fix and close this issue, we need to implement that local neighborhood translation. |
Wow, what a journey of an issue, amazing. Thank you all for the hard work. Do I see this right, that this issue still exists in the newest PCL version and no fix is available, besides doing per-cluster demeaning ourselves and rebuilding from source? |
There's a Pull Request addressing this issue specifically for normal estimation. See #3169. It was reviewed and there are some minor changes that need to be applied to it, but as is, it is functional. This is such a critical issue with a pending PR that it will likely be merged for this next release. |
Marking this as stale due to 30 days of inactivity. It will be closed in 7 days if no further activity occurs. |
Is this issue still present? It seems kind of important. |
I'm pretty sure it has been solved by #4983 |
The problem with the wrong normals (if the cloud is far from the origin) is indeed fixed by #4983 (we went with per-cluster demeaning and kept float instead of switching to double, for details please see the PR). Why are you asking @themightyoarfish ? Did you have an issue like this? If yes, what issue exactly and did you update PCL to the latest version 1.13.0? |
No, I do not have a specific problem, I just stumbled upon this issue and I might run a similar use case in the future. |
Your Environment
Context
I'm currently trying to find a point cloud registration method that can accurately match relatively sparse 3D point clouds from one Velodyne HDL-32E. At the moment I am using PCL's implementation of Generalized ICP to match several sequential scans similar to the one shown below.
(This is one scan of a carpark outside an office building with floor-to-ceiling glass windows, hence few LiDAR returns from the wall.)
I.e. I take one scan like this and try to register it to a scan taken 1-2 cm further forward. Now, matching sequential scans does work in some areas but not in others. Since GICP depends on good normal calculations I've been inspecting some clouds in the PCL visualizer (shown in this tutorial) and I noticed that in some places the normals are not being calculated correctly at all.
Expected Behavior
The normals in most parts of the clouds should be calculated correctly. Or if they aren't, it's obviously because the data is too sparse in that region of the point cloud.
Current Behavior
The normal directions are mostly correct, but in some places where there is plenty of data, they are not calculated correctly. The most obvious place is in this example, where the normals all point more-or-less upwards in the flower bed part of the wall. This is incorrect; the normals should be orthogonal to the wall and pointing out horizontally:
This is the case in this specific area pretty much regardless of whether I use a KNearest or Radius point search, and regardless of the values for K or for the radius.
Code to Reproduce
Try the cloud in question: carpark_cloud.ply
in the PCL visualizer (shown in this tutorial)
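A minimal sketch of the kind of pipeline being described (this is an illustration, not the reporter's exact code; the 0.3 m radius follows the values mentioned in the comments above):

```cpp
#include <pcl/features/normal_3d_omp.h>
#include <pcl/io/ply_io.h>
#include <pcl/point_types.h>
#include <pcl/search/kdtree.h>
#include <pcl/visualization/pcl_visualizer.h>

int
main ()
{
  pcl::PointCloud<pcl::PointXYZ>::Ptr cloud (new pcl::PointCloud<pcl::PointXYZ>);
  if (pcl::io::loadPLYFile ("carpark_cloud.ply", *cloud) < 0)
    return -1;

  // Estimate normals with a radius search (e.g. 0.3 m).
  pcl::NormalEstimationOMP<pcl::PointXYZ, pcl::Normal> ne;
  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree (new pcl::search::KdTree<pcl::PointXYZ>);
  ne.setInputCloud (cloud);
  ne.setSearchMethod (tree);
  ne.setRadiusSearch (0.3);
  pcl::PointCloud<pcl::Normal>::Ptr normals (new pcl::PointCloud<pcl::Normal>);
  ne.compute (*normals);

  // Display one normal arrow per 10 points, 0.2 m long, on top of the cloud.
  pcl::visualization::PCLVisualizer viewer ("normals");
  viewer.addPointCloud<pcl::PointXYZ> (cloud, "cloud");
  viewer.addPointCloudNormals<pcl::PointXYZ, pcl::Normal> (cloud, normals, 10, 0.2, "normals");
  while (!viewer.wasStopped ())
    viewer.spinOnce ();
  return 0;
}
```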
Possible Solution
I have two theories on what the cause of the problem is.
So my question is: Which of these two theories is likely to be correct, and are there any suggestions on what I can do to resolve it?