Skip to content

Insanely fast CUDA 2NN 512-bit binary descriptor matcher for computer vision

License

Notifications You must be signed in to change notification settings

komrad36/CUDAK2NN

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Fastest GPU implementation of a brute-force matcher for 512-bit binary descriptors in 2NN mode, i.e., a match is returned if the best match between a query vector and a training vector is more than a certain threshold number of bits better than the second-best match.

Yes, that means the DIFFERENCE in popcounts is used for thresholding, NOT the ratio. This is the CORRECT approach for binary descriptors.

This laboriously crafted kernel is EXTREMELY fast. 63 BILLION comparisons per second on a stock GTX1080, enough to match nearly 46,000 descriptors per frame at 30 fps (!)

A key insight responsible for much of the performance of this insanely fast CUDA kernel is due to Christopher Parker (https://github.com/csp256), to whom I am extremely grateful.

CUDA CC 3.0 or higher is required.

All functionality is contained in the files CUDAK2NN.h and CUDAK2NN.cu. 'main.cpp' is simply a sample test harness with example usage and performance testing.

About

Insanely fast CUDA 2NN 512-bit binary descriptor matcher for computer vision

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published