-
Notifications
You must be signed in to change notification settings - Fork 8
LanceNorskog/LSH-Hadoop
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This project implements a new Locality-Specific Hashing technique for nearest-neighbor problems. http://en.wikipedia.org/wiki/Locality_sensitive_hashing http://en.wikipedia.org/wiki/Nearest_neighbor_search This paper explains the theory of this implementation: Abstract: http://www.sciweavers.org/publications/locality-sensitive-hash-real-vectors Full paper: http://siam.org/proceedings/soda/2010/SODA10_094_neylont.pdf The code for the LSH algorithm is all courtesy of examples and explanation from Tyler. The project has three parts: lsh.ortho: the core implementation lsh.solr: Solr "reprocessor": download all docs, find neighbors, upload Map/reduce assistance for Solr doc m/r jobs Contains an M/R algorithm that adds corners to all documents. Both are tested with NY State cinema location data. lsh.hadoop: map/reduce wrapper. side utility: Hadoop file reader which parses input csv lines and rearranges them. The core insight is to use equilateral triangles instead of squares to round off locations. That's it.
About
Implementation of Tyler Neylon's Locality-Specific Hash based on simplex tesselations
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published