Skip to content

robmoore/cardest

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Cardinality Estimation Algorithms in Haskell

Earlier this year, I watched a talk given by Avi Bryant entitled Add ALL the things which introduced me to the HyperLogLog algorithm in particular and cardinality estimation generally. It made an impression on me along with an excellent series of blog posts by a number of folks at Neustar (formerly Aggregate Knowledge) and inspired me to implement a couple of cardinality algorithms in Haskell as a project at Hacker School.

The posts from Neustar referenced KMV (k minimum values) a number of times in discussing HLL and mentioned it was a simpler algorithm so I decided to start by implementing it first. The two represent points along the history of cardinality estimation as an area of research and make an interesting contrast in terms of approach (order statistics observables -- KMV -- vs. bit pattern observables -- HLL).

Although a number of implementations of HLL exist (including one in Haskell), I wanted to implement my own to gain a deeper understanding of how these algorithms work. In both cases it has made the difference between abstractly understanding something and concretely getting it and I recommend it to anyone who would like to know more about how they tick.

About

Cardinality estimation in Haskell.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published