GitHub - robmoore/cardest: Cardinality estimation in Haskell.

Cardinality Estimation Algorithms in Haskell

Earlier this year, I watched a talk given by Avi Bryant entitled Add ALL the things which introduced me to the HyperLogLog algorithm in particular and cardinality estimation generally. It made an impression on me along with an excellent series of blog posts by a number of folks at Neustar (formerly Aggregate Knowledge) and inspired me to implement a couple of cardinality algorithms in Haskell as a project at Hacker School.

The posts from Neustar referenced KMV (k minimum values) a number of times in discussing HLL and mentioned it was a simpler algorithm so I decided to start by implementing it first. The two represent points along the history of cardinality estimation as an area of research and make an interesting contrast in terms of approach (order statistics observables -- KMV -- vs. bit pattern observables -- HLL).

Although a number of implementations of HLL exist (including one in Haskell), I wanted to implement my own to gain a deeper understanding of how these algorithms work. In both cases it has made the difference between abstractly understanding something and concretely getting it and I recommend it to anyone who would like to know more about how they tick.

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
.gitignore		.gitignore
HLL.hs		HLL.hs
KMV.hs		KMV.hs
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Cardinality Estimation Algorithms in Haskell

About

Releases

Packages

Languages

robmoore/cardest

Folders and files

Latest commit

History

Repository files navigation

Cardinality Estimation Algorithms in Haskell

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages