Commit

README
tdebatty committed Jan 28, 2015
1 parent 5fd115d commit d3aeb0e
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -15,7 +15,7 @@ MinHash is a hashing scheme that tends to produce similar signatures for sets th
The Jaccard similarity between two sets is the number of elements they have in common relative to the size of their union: J(A, B) = |A ∩ B| / |A ∪ B|. A MinHash signature is a sequence of numbers produced by multiple hash functions h_i. It can be shown that the Jaccard similarity between two sets is also the probability that a given hash function produces the same result for both sets: J(A, B) = Pr[h_i(A) = h_i(B)]. Therefore, MinHash signatures can be used to estimate the Jaccard similarity between two sets. Moreover, it can be shown that the expected estimation error is O(1 / sqrt(n)), where n is the size of the signature (the number of hash functions used to produce it).
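
The estimate described above can be illustrated with a minimal, self-contained sketch that uses only the JDK. It is not this library's implementation: the class name `MinHashSketch`, its methods, and the 64-bit mixing function used as a stand-in hash family are all assumptions made for illustration. Each h_i maps a set to the smallest hash value among its elements, and the fraction of positions where two signatures agree approximates J(A, B).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical illustration class; not part of info.debatty.java.lsh.
public class MinHashSketch {
    private final long[] seeds; // one seed per hash function h_i

    public MinHashSketch(int n, long seed) {
        Random random = new Random(seed);
        seeds = new long[n];
        for (int i = 0; i < n; i++) {
            seeds[i] = random.nextLong();
        }
    }

    // Assumed stand-in hash: a bijective 64-bit mixer (splitmix64 finalizer).
    private static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    // h_i(S) is the minimum hash value over the elements of S.
    public long[] signature(Set<Integer> set) {
        long[] sig = new long[seeds.length];
        Arrays.fill(sig, Long.MAX_VALUE);
        for (int x : set) {
            for (int i = 0; i < seeds.length; i++) {
                long h = mix(x ^ seeds[i]);
                if (h < sig[i]) {
                    sig[i] = h;
                }
            }
        }
        return sig;
    }

    // Fraction of positions where the two signatures agree.
    public static double similarity(long[] sig1, long[] sig2) {
        int equal = 0;
        for (int i = 0; i < sig1.length; i++) {
            if (sig1[i] == sig2[i]) {
                equal++;
            }
        }
        return (double) equal / sig1.length;
    }

    // Exact Jaccard similarity |A ∩ B| / |A ∪ B|, for comparison.
    public static double jaccard(Set<Integer> a, Set<Integer> b) {
        Set<Integer> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<Integer> union = new HashSet<>(a);
        union.addAll(b);
        // Convention chosen here: two empty sets are treated as identical.
        return union.isEmpty() ? 1.0 : (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>();
        Set<Integer> b = new HashSet<>();
        for (int i = 0; i < 100; i++) {
            a.add(i);      // A = {0, ..., 99}
            b.add(i + 50); // B = {50, ..., 149}
        }
        MinHashSketch minhash = new MinHashSketch(400, 42L);
        System.out.println("exact:    " + jaccard(a, b));
        System.out.println("estimate: "
                + similarity(minhash.signature(a), minhash.signature(b)));
    }
}
```

With n = 400 hash functions, the O(1 / sqrt(n)) bound suggests the estimate should typically land within a few hundredths of the exact value (1/3 for these two sets); a larger signature tightens the estimate at the cost of more hashing work per set.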


-```
+```java
import info.debatty.java.lsh.*;

