Grandient/frequent-itemsets

frequent-itemsets

Python implementations of the Apriori algorithm and several variants (PCY, SON, and random sampling). Works with Python 3.6 and 3.7.

The Apriori algorithm uncovers hidden structure in categorical data. The classical example is a database of supermarket purchases, where every purchase is a set of items. We would like to uncover association rules such as {bread, eggs} -> {bacon} from the data: customers who buy bread and eggs also tend to buy bacon. This is the goal of association rule learning, and Apriori is arguably the most famous algorithm for this problem.
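To make the idea concrete, here is a minimal sketch of level-wise frequent itemset mining in the Apriori style. It is not the code in apriori.py; the function name `apriori`, the `min_support` parameter (an absolute count), and the toy baskets are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: count} for every itemset seen in at least
    min_support transactions. Illustrative sketch only."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items and keep the frequent ones.
    item_counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): c for i, c in item_counts.items()
               if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join frequent (k-1)-itemsets into k-itemset candidates and prune
        # any candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in current for s in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count how many transactions contain each surviving candidate.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical toy data, not taken from the retail dataset.
baskets = [
    {"bread", "eggs", "bacon"},
    {"bread", "eggs"},
    {"bread", "eggs", "bacon"},
    {"milk"},
]
frequent = apriori(baskets, min_support=2)

# Confidence of {bread, eggs} -> {bacon}:
# support of all three items divided by support of {bread, eggs}.
confidence = (frequent[frozenset({"bread", "eggs", "bacon"})]
              / frequent[frozenset({"bread", "eggs"})])
```

The pruning step relies on the fact that every subset of a frequent itemset must itself be frequent, which is what keeps the candidate set small at each level.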

This repository contains five Python scripts. It uses the retail dataset from http://fimi.ua.ac.be/data/retail.dat. The scripts depend on matplotlib and numpy. Each implementation runs its algorithm and then graphs the results.
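As a rough sketch of how the dataset could be read, the snippet below assumes that retail.dat stores one transaction per line as whitespace-separated integer item ids, and that the file has already been downloaded locally. The helper name `load_transactions` is hypothetical and is not part of the scripts in this repository.

```python
def load_transactions(path):
    """Read one transaction per line; items are assumed to be
    whitespace-separated integer ids."""
    transactions = []
    with open(path) as handle:
        for line in handle:
            items = line.split()
            if items:  # skip blank lines
                transactions.append(frozenset(int(i) for i in items))
    return transactions
```

Calling `load_transactions("retail.dat")` would then return a list of frozensets, one per basket, which is a convenient shape for subset tests during counting.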

Files

  • The first is apriori.py, an implementation of the Apriori algorithm.
  • The second is pcy.py, an implementation of the PCY algorithm.
  • The third is SON.py, an implementation of the SON algorithm.
  • The fourth is RS.py, an implementation of the random sampling version of Apriori.
  • The fifth is graph.py, which runs all the implementations and graphs them on a single chart.
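For readers unfamiliar with how PCY differs from plain Apriori, the sketch below shows the core idea for frequent pairs: the first pass hashes every pair into a bucket while counting single items, and the second pass only counts pairs whose items are both frequent and whose bucket is frequent. This is an illustration under assumptions, not the code in pcy.py; the function name `pcy_frequent_pairs`, the bucket count, and the use of Python's built-in `hash` are all choices made here.

```python
from collections import Counter
from itertools import combinations


def pcy_frequent_pairs(transactions, min_support, num_buckets=1009):
    """Return {(a, b): count} for pairs seen in at least min_support
    transactions, using PCY-style bucket filtering. Illustrative sketch."""

    def bucket(pair):
        # Assumed hash function; any deterministic pair hash would do.
        return hash(pair) % num_buckets

    # Pass 1: count single items and hash each pair into a bucket.
    item_counts = Counter()
    bucket_counts = [0] * num_buckets
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket(pair)] += 1

    # Between passes: keep frequent items and a bitmap of frequent buckets.
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}
    bitmap = [c >= min_support for c in bucket_counts]

    # Pass 2: count only pairs of frequent items that fall in a frequent bucket.
    pair_counts = Counter()
    for t in transactions:
        items = sorted(i for i in set(t) if i in frequent_items)
        for pair in combinations(items, 2):
            if bitmap[bucket(pair)]:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}
```

A bucket's count is at least the count of any pair hashed into it, so the bitmap never discards a truly frequent pair; collisions can only let extra candidates through, and those are removed by the final support check.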

About

Apriori and a few of its variants
