updated
mahmoudparsian committed Feb 7, 2021
1 parent 20f2cde commit 6a2d4ab
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion tutorial/map-partitions/README.md
@@ -90,13 +90,25 @@ def minmax(iterator):
         if x < min:
             min = x
     #
-    return (min, max)
+    return [(min, max)]
#
#data = [10, 20, 3, 4, 5, 2, 2, 20, 20, 10]
#print minmax(data)
````
Then we pass the minmax function to ````mapPartitions()````:

>>> rdd = spark.sparkContext.parallelize(data, 3)
>>> mapped = rdd.mapPartitions(minmax)
>>> mapped.collect()
[(3, 20), (2, 5), (2, 20)]
>>> minmax_list = mapped.collect()
>>> minimum = min(t[0] for t in minmax_list)
>>> minimum
2
>>> maximum = max(t[1] for t in minmax_list)
>>> maximum
20
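
The per-partition pairs still have to be combined into one global minimum and maximum, and `mapPartitions()` expects the function to return an iterable. This is why the commit wraps the tuple in a list. The sketch below walks through both steps in plain Python, without Spark. `split_like_parallelize` is a hypothetical helper, not a Spark API; its contiguous slicing is an assumption chosen to reproduce the three partitions implied by the output above. The `minmax` here is a generator variant of the tutorial's function that yields nothing for an empty partition.

```python
from typing import Iterable, Iterator, List, Tuple


def split_like_parallelize(data: List[int], num_partitions: int) -> List[List[int]]:
    """Hypothetical helper: slice data into contiguous, roughly equal chunks."""
    n = len(data)
    return [
        data[i * n // num_partitions:(i + 1) * n // num_partitions]
        for i in range(num_partitions)
    ]


def minmax(iterator: Iterable[int]) -> Iterator[Tuple[int, int]]:
    """Yield one (min, max) pair for a partition, or nothing if it is empty."""
    lo = hi = None
    for x in iterator:
        if lo is None or x < lo:
            lo = x
        if hi is None or x > hi:
            hi = x
    if lo is not None:
        yield (lo, hi)


data = [10, 20, 3, 4, 5, 2, 2, 20, 20, 10]
partitions = split_like_parallelize(data, 3)

# Emulates rdd.mapPartitions(minmax).collect(): apply per partition, then flatten.
minmax_list = [pair for part in partitions for pair in minmax(part)]

# Final reduction across every partition's pair, not only the first one.
minimum = min(lo for lo, _ in minmax_list)
maximum = max(hi for _, hi in minmax_list)

print(minmax_list)
print(minimum, maximum)
```

Reading only `minmax_list[0]` would report the first partition's minimum (3) instead of the global one (2), so the final step has to look at every pair.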

````
### NOTE: data can be huge, but for understanding
### the mapPartitions() we use a very small data set
