The purpose of pyspark-tutorial is to provide basic algorithms using PySpark. Note that the PySpark interactive shell is meant for basic testing and debugging and is not intended for production environments.
- wordcount: classic word count
- bigrams: find the frequency of bigrams
- basic-join: basic join of two relations R(K, V1) and S(K, V2)
- basic-map: basic mapping of RDD elements
- basic-add: how to add all RDD elements together
- basic-multiply: how to multiply all RDD elements together