-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Akaze's implementation seems to have issues with rotational invariance #63
Comments
I believe I have found some fishy code: |
So the implementation was based off an existing Rust implementation and I think another independent implementation instead of OpenCV. I'm not sure if it was compared to OpenCVs implementation - but for more complicated algorithms I personally avoid that sometimes as OpenCV applies a lot of heuristics that may not be in the paper (from experience the OpenCV HOG adds in image pyramids and other things other implementations and the paper don't use). The fishy line you spotted does seem to be a problem so a PR for that would be good. Does it improve the output at all? I'm also not super familiar with this area - only mildly familiar as a casual user - so am likely not best to comment but I can try to help out. One question would be how does the matcher impact this? I assume the visualisation is just showing the points that match between both and not the points where no match was found. Is there some worth in showing all the points identified in the rotated and non-rotated versions including unmatched ones to see if something's up. Although there are a lot of points so not sure if that's useful at all 🤔 |
Yes showing all points is likely to be a lot. I think that showing only matching lines, as now, is enough so far. The matcher is the brute force one and the selection criteria is the same as in the OpenCV example code I'm using as a comparison. I found other bugs in the translation from the original C++ implementation the previous Rust implementation is based on. You can follow my progress in this PR. I'm comparing the Rust code with that original implementation. The one line correction above doesn't improve the performance so far. But correcting two more bugs (the missing reset of |
Additional note, |
After comparing the evolutions of the same image, I can confirm that they are very different, so probably there is a bug already at that level in the Rust version. Here are the levels 0 and 1, on the first image above: |
Already |
I found that the |
I guess it's an implementation detail of the gaussian blur regarding padding and the imageproc one applies some form of padding whereas cv doesn't |
No, it is worst than that, it happens in the center of the image, not in the borders. The Gaussian blur is seriously degraded. I have the impression that there is an incorrect optimization somewhere. But there are so many other bugs that at the moment I'll live with Fyi my approach to correct Rust CV's Akaze implementation is to dump the intermediate generated images (e.g. for the different maps of the evolutions) and compare with the ones from the original C++ implementation, that I have instrumented to generate the corresponding images at the same positions. That shows that, for example, the Scharr filters are also incorrect, leading to an error of a factor 40 in the estimation of the initial contrast for example. |
Okay, I know this is used as a key part of the VSLAM toolbox for reconstructing 3D point clouds from things like KITTI. Once the PR is ready it would be good to rerun that and make sure it still works as expected (oh god I might have to get vulkan working on my linux machine). I may also prod @vadixidav to take a look at it as he's more familiar with the code/domain than I am. I'm aware that code like this can be tricky to test but it would be good to see some tests to catch certain issues discovered and try and prevent any regressions. I'll try and allocate some time later this week or next week to look into the gaussian blur some more |
Ok, good! I fully agree that it is worth checking that there is no regression. Also, there is quite some re-implementation of features present in imageproc (e.g. Gaussian filter, Schaar filters, etc.). Personally I would prefer using the versions there, or contributing better implementations there if needed (but maybe the current implementations there might be faster now). Also, in general, I would try to have similar functions performing similarly as their OpenCV counterparts. Of course, simplifications and differing details such as border management might be acceptable. And I'm also all in for having tests ;-) |
When this crate was started imageproc/image didn't support having floats as a pixel type and iirc there had been an open issue for quite a while and it needed a big refactor to be possible. It may be that the reimplementations are no longer necessary |
The diffusion function Another possibility is that the Rust code was based on an earlier version of the reference implementation. In any case, adding a missing 0.5 and changing a multiplication to an addition solves that problem! Now the non-linear space is created correctly compared to the reference implementation! |
@xd009642 one question: to debug I have instrumented the source codes of both the C++ and the Rust versions to dump quite some EXR images of the internal space on the way. Do we want to keep that in the future? It is quite intrusive, but if we want to ensure no regressions, we might want to keep that and compare with reference images. Is there an established best practice for such things in Rust CV? |
There's not currently an established best practice, and I know one of the goals was working on things like embedded linux boards which makes the intrusiveness even more painful. I think for the current PRs don't submit it. My thoughts are that we may want to add a ability to compile with something like |
We could also gate some introspection features behind |
hmm - my initial thought is that such introspection is also useful for engineers working on computer vision applications so I'd rather it was configurable by users as well. But I may be biased as someone who's had to patch opencv with that kind of debug dumping before. Either way it's a question for a future PR imo |
Sure. I have them all around, but I'll refactor that into two commits before submitting the PR for inclusion. |
Hi, I'm following rerun.io and I think they are building things in the right direction for inspection of CV pipelines. I haven't had the time to test it past the tutorial yet though. But I think it could help in your debugging. |
That sounds like a good solution. Personally I also think exposing enough APIs to dig into all the intermediaries so that we can dump debug data from an external tool would be acceptable as well. For instance, running a step of diffusion manually should be possible using a lower level API. We can add this at a later time. I would be okay with having a cfg for enabling debug information for the time being. |
So, it has been a while, but I recalled something in particular that I was working on to make the algorithm behave more like the original, but it had some issues. I believe I reverted it, and now it uses nearest neighbor algorithm, see here: rust-cv/akaze@c8de155. Here is the original commit that adds the potentially correct behavior, though it appeared to reduce the quality of the output, but that may have been because we don't have tests that include orientation invariance: rust-cv/akaze@3252ad5. That is the original commit on the archived repository. See here for the current: https://github.com/rust-cv/cv/blob/main/akaze/src/image.rs#L118. Please give those changes a try that I had written (and then promptly reverted) and see if they improve the compatibility. |
I incorporated the change I recommended above along with @stephanemagnenat 's changes in #65. |
Thanks @vadixidav, I also had the same function for resizing (albeit implemented differently) in my not-yet-updated changes branch on my local computer (I wanted to clean-up the verbose debugging code before updating the PR branch, but maybe I should keep it updated for information, so you can freely extract pieces from it). I have other changes such as correcting the diffusion step. I will test your changes, update my branch and report here. Regarding the check you added in |
One thing regarding debugging, I believe that to ensure non-regression of this algorithm, ideally we should have points to dump intermediate images of the internal space and compared to them of the original algorithm. That is how I managed to debug so far. I do not know if rerun can be used for that. |
My further changes leading to correct internal images after There are still bugs ahead, I suspect that at least the Scharr filters implementation is broken. I continue my debugging and will update #66 with further corrections, or directly #65 if we want me to push directly there. |
The general Scharr filters were inverted! I fixed that, and simplified the code using imageproc's |
I'm personally fine with deleting unused functions if they're not public. If they're public takes a bit more of a think because of semver |
They are not public! Ok, I'll fold their deletion in the respective commits, it is done for the Scharr filters. @xd009642 for the Gaussian, they are public in the module but not re-exported at the level of the library. Clippy says they are unused. I guess it is also ok to delete them without that being a major change as they are not in the public API, right? |
Ok, with the commits in #66 so far, the internal spaces are similar up to an epsilon! And the performance with rotation is much better than before, but still not on-par with the C++ version, more bugs to hunt then: |
The C++ implementation uses 8 comparison points for This is now fixed and the Rust implementation finds the same number of keypoints as the C++ one within 1%. |
Both implementations new find keypoints at similar positions. However matching is still not on par. Next step is to compare keypoints themselves. |
Please delete unused functions, git will remember them if needed. Don't worry about them. |
This is all very great investigation. Considering how central akaze is to everything we are doing, this is really excellent for the project as a whole. Thanks so much again for your investigation. I will look once more over the PR before we merge. |
Thank you for your kind words! It's really great to have Akaze in Rust, and I hope that we can make it reliable and performant (see #67), and therefore promote the use of Rust in CV, to everyone's benefit! |
So, there was a nasty bug in computing angles: @vadixidav also at least on my test image, there are no image access out of bound using the original code, which would make ce8f7ce is not necessary. The current Rust version adds 0.5 to variables However, I think that the idea of adding an offset is not wrong. I'm not fully sure but I believe that one should add 0.5 to So, two questions: |
Thank you so much for the finding. I have a very large dataset locally that I can check to see if there are any out of bounds accesses. I did get one before when I tested, which prompted me to add the check. Maybe there is a more elegant way to solve this problem. I imagine that forcing keypoints to be 0.5 further from the edges might be sufficient. I will perform another test and see. |
Great! I also wanted to do that, but got lost in exploring some optimization... |
So, using #69 I have been running this version of Akaze on the 2017 COCO validation dataset (5k images), converted to grayscale. The bound check is necessary if we have the +0.5 on I convinced myself that the original code is correct and removed the +0.5. I also looked at OpenCV source code but their implementation differs in the sense that the evolution data layers and the window used are different. I believe that the check is not necessary any more, but I think that it is also fine to keep it. The branch predictor should make it essentially free. |
I think that PR #69 is ready to be merged. @vadixidav I let you choose whether you wish to keep or remove the boundary check you introduced in ce8f7ce. |
If you have confirmed the bounds check is no longer necessary then I will just run a test against my large dataset just in case and if it works fine then I will remove the check. |
Perfect! Can you do so and merge? |
@vadixidav, @xd009642 can we merge this please? It is in the way of more changes. @vadixidav if you do not have time to run this PR on your large dataset, may I suggest to keep the check for now, we can always remove it later once you have run it. |
So I'm personally happy to merge this with the bounds check without the run on the large dataset - but vadixidav is the boss so I'll defer to his decision. |
@stephanemagnenat Checked with vadixidav on the rust-cv discord and we're happy to merge so I've just merged it in 🎉 |
I am trying to use Akaze for an application that searches correspondences between a template and an image seen by a camera.
As a start, I have written the common pipeline building on the excellent tutorial of Rust CV. I am using Akaze with threshold 0.001, looking for two nearest neighbors, and using a distance ratio criteria of 0.8 between the first best match and the second one (best_dist < 0.8 * next_best_dist), as suggested in the Akaze OpenCV tutorial. Besides that, I am using the functions
matching
andsymmetric_matching
from the tutorial as is.For comparison, I created a similar pipeline in Python using OpenCV (4.7.0.72). When only scaling is involved, both Rust CV's Akaze and Open CV Akaze perform well:
Open CV, before matching: left: 2514 keypoints, right: 820 keypoints:
Rust CV, before matching: left: 5353 keypoints, right: 1480:
However, when I rotate the small image, Rust CV completely fails while Open CV works like a charm:
Open CV, before matching: left: 2514 keypoints, right: 1207 keypoints:
Rust CV, before matching: left 5353 keypoints, right: 2490
So, it looks to me that there is something very broken with the rotational invariance of RustCV's Akaze implementation.
Was this implementation ever validated against the OpenCV one?
I am also surprised by the different numbers of matches, as I would expect more similar ones between the two implementations.
The text was updated successfully, but these errors were encountered: