Add searching for similar pictures #61

Closed
qarmin opened this issue Oct 11, 2020 · 2 comments · Fixed by #69
Labels
enhancement New feature or request

Comments

@qarmin
Owner

qarmin commented Oct 11, 2020

This feature is implemented in dupeGuru.

@qarmin qarmin added the enhancement New feature or request label Oct 11, 2020
@Syfaro
Contributor

Syfaro commented Oct 12, 2020

Some thoughts on doing this, from someone who runs a similar image-matching service with tens of millions of images.

  • You can use perceptual image hashing to generate hashes to compare
    • There's the img_hash crate to do this in Rust
    • Perceptual hashes are very resistant to compression and resizing, but are completely incapable of handling cropping
  • Hashes can be compared by Hamming distance
    • If there are few enough images, you could compare each hash against every other hash, but this scales very poorly
    • You can use a BK-tree to index all of the hashes and very quickly find similar images
    • For my service, indexing 23,543,939 images with 64-bit hashes uses 3.4 GB of RAM
  • In my experience, with 64-bit hashes a Hamming distance of up to 3 indicates a high probability that two images are the same
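
To make the comparison and indexing steps above concrete, here is a minimal sketch in std-only Rust. It assumes the 64-bit perceptual hashes have already been computed elsewhere (for example with the img_hash crate) and are available as plain `u64` values. The names `hamming`, `Node`, `BkTree`, `insert`, and `find` are hypothetical and chosen for illustration; they are not taken from img_hash or from any existing BK-tree crate, and the layout is not tuned for memory use.

```rust
use std::collections::HashMap;

/// Hamming distance between two 64-bit hashes: the number of differing bits.
fn hamming(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

/// One BK-tree node: a hash plus children keyed by their distance to it.
/// (Hypothetical layout, for illustration only.)
struct Node {
    hash: u64,
    children: HashMap<u32, Node>,
}

/// Minimal BK-tree over 64-bit hashes.
#[derive(Default)]
struct BkTree {
    root: Option<Node>,
}

impl BkTree {
    /// Insert a hash; exact duplicates are ignored.
    fn insert(&mut self, hash: u64) {
        let new_node = || Node { hash, children: HashMap::new() };
        let mut node = self.root.get_or_insert_with(new_node);
        loop {
            let d = hamming(node.hash, hash);
            if d == 0 {
                return; // this hash is already stored (or was just inserted)
            }
            // Descend along the edge labelled `d`, creating it if missing.
            node = node.children.entry(d).or_insert_with(new_node);
        }
    }

    /// Return every stored hash within `max_dist` bits of `query`,
    /// together with its distance.
    fn find(&self, query: u64, max_dist: u32) -> Vec<(u64, u32)> {
        let mut found = Vec::new();
        let mut stack: Vec<&Node> = self.root.iter().collect();
        while let Some(node) = stack.pop() {
            let d = hamming(node.hash, query);
            if d <= max_dist {
                found.push((node.hash, d));
            }
            // Triangle inequality: matches can only sit under edges whose
            // label lies in [d - max_dist, d + max_dist].
            let lo = d.saturating_sub(max_dist);
            let hi = d.saturating_add(max_dist);
            for (&edge, child) in &node.children {
                if edge >= lo && edge <= hi {
                    stack.push(child);
                }
            }
        }
        found
    }
}
```

A caller would build the tree once with `insert` for every image hash, then call `find(hash, 3)` for each image to list its likely duplicates, using the threshold of 3 suggested above. The pruning step in `find` is what avoids comparing every hash against every other hash.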

@qarmin
Owner Author

qarmin commented Oct 15, 2020

Implemented by #69, but for me it is still a little too slow.

@qarmin qarmin closed this as completed Oct 15, 2020