Learn the PySpark API through pictures and simple examples.
# flatMap
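from pyspark import SparkContext
sc = SparkContext.getOrCreate()  # reuse the running context (e.g. in the pyspark shell) or start one
x = sc.parallelize([1, 2, 3])
# the function returns a tuple per element; flatMap flattens those tuples into one RDD
y = x.flatMap(lambda v: (v, 100 * v, v ** 2))
print(x.collect())
print(y.collect())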
[1, 2, 3]
[1, 100, 1, 2, 200, 4, 3, 300, 9]
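To see why the output above is one flat list and not a list of tuples, it can help to model the operation without Spark. The following is a minimal local sketch, not PySpark code: `flat_map` and `expand` are hypothetical helper names introduced here for illustration, and plain Python lists stand in for RDDs. It contrasts an ordinary map, which keeps one result per input element, with a flattening map, which concatenates the per-element results.

```python
from itertools import chain


def flat_map(func, items):
    """Apply func to each item and concatenate the iterables it returns.

    Hypothetical local stand-in for the flattening behaviour shown above;
    it runs eagerly on a plain list and involves no Spark machinery.
    """
    return list(chain.from_iterable(func(item) for item in items))


def expand(v):
    # Same expansion as the example: the value, 100 times it, and its square.
    return (v, 100 * v, v ** 2)


data = [1, 2, 3]

# A plain map keeps one tuple per input element.
mapped = [expand(v) for v in data]

# The flattening map merges those tuples into a single sequence.
flattened = flat_map(expand, data)

print(mapped)     # [(1, 100, 1), (2, 200, 4), (3, 300, 9)]
print(flattened)  # [1, 100, 1, 2, 200, 4, 3, 300, 9]
```

The second printed list matches the `y.collect()` output in the example, which suggests reading flatMap as "map, then flatten one level".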
Contributors are welcome.
The original images are here; download them as PDF and convert to SVG with: genSVD (pdf2svg)