MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Updated Jun 4, 2024 - Python
Go metrics for calculating string similarity and other string utility functions
Compare HTML similarity using structural and style metrics
A package to compute medical segmentation metrics.
Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard similarity, Longest common subsequence, Hamming distance, and more.
A Clojure library for similarity queries over large data sets
Spark functions to run popular phonetic and string matching algorithms
SetSketch: Filling the Gap between MinHash and HyperLogLog
Calculate various string metrics efficiently in Haskell
ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity
A job recommender system that takes skills from LinkedIn and jobs from Indeed and suggests the best available jobs for your skills.
BagMinHash - Minwise Hashing Algorithm for Weighted Sets
Minhash and maxhash library in Python, combining flexibility, expressivity, and performance.
This is an implementation of the paper written by Yuhua Li, David McLean, Zuhair A. Bandar, James D. O’Shea, and Keeley Crockett
Easy-to-use Java library for similarity checking of strings or numeric-series
Text similarity computation using minhashing and Jaccard distance on the Reuters dataset
Text Matching Based on LCQMC: A Large-scale Chinese Question Matching Corpus
Insight Data Engineering fellow project
Locality Sensitive Hashing for semantic similarity (Python 3.x)