-
Notifications
You must be signed in to change notification settings - Fork 9
Splitters and Combiners
We will study the following abstractions:
The simplified Iterator
trait is as follows:
trait Iterator[A] {
def next(): A
def hasNext: Boolean
}
def iterator: Iterator[A] // must be defined on every collection
The iterator contract:
-
next
can be called only ifhasNext
returns true. SohasNext
has to be called before usingnext
. - after
hasNext
returnsfalse
, it will always returnfalse
How would you implement foldLeft
on an iterator trait (not collection)?
def foldLeft[B](z: B)(f: (B, A) => B): B = {
var s = z
while (hasNext) s = f(s, next())
s
}
It is a counterpart of Iterator
used for parallel programming. The simplified Splitter
trait is as follows:
trait Splitter[A] extends Iterator[A] {
def split: Seq[Splitter[A]]
def remaining: Int
}
def splitter: Splitter[A] // must be defined on every parallel collection
The splitter contract:
- after calling 'split', the original splitter is left in an undefined state.
next
,hasNext
,split
cannot be called during this time. - the resulting splitters traverse disjoint subsets of the original splitter
- 'remaining' is an estimate on the number of remaining elements
- 'split' is an efficient method – 'O(log n)' or better
How would you implement fold
on a splitter?
Idea is to create a reduction tree.
def fold(z: A)(f: (A, A) => A): A = {
if (remaining < threshold) foldLeft(z)(f)
else {
val children = for (child <- split) yield task { child.fold(z)(f) }
children.map(_.join()).foldLeft(z)(f)
}
}
These are abstractions for creating new collections. The simplified 'Builder' trait is as follows:
It has two parameters A
and Repr
. A
is the type of the elements that we can add to the builder and Repr
denotes the type of the collection that the builder creates.
trait Builder[A, Repr] {
def +=(elem: A): Builder[A, Repr] // add elements to form the collection
def result: Repr // returns the collection
}
def newBuilder: Builder[A, Repr] // must be defined on every collection
// Eg: a List of Ints would return a Builder of type of Builder of Int and a List[Int]
// i.e. Builder[Int, List[Int]]
The builder contract:
- calling 'result' returns a collection of type 'Repr', containing the elements that were previously added with
+=
- calling 'result' leaves the Builder in an undefined state
How would you implement the 'filter' method using 'newBuilder'?
def filter(p: T => Boolean): Repr = {
val b = newBuilder
for (x <- this) if (p(x)) b += x
b.result
}
It is the parallel version of a Builder. The simplified 'Combiner' trait is as follows:
trait Combiner[A, Repr] extends Builder[A, Repr] {
def combine(that: Combiner[A, Repr]): Combiner[A, Repr]
}
def newCombiner: Combiner[T, Repr] // on every parallel collection
The combiner contract:
- calling 'combine' returns a new combiner that contains elements of input combiners
- calling 'combine' leaves both original Combiners in an undefined state
- 'combine' is an efficient method – 'O(log n)' or better
How would you implement a parallel 'filter' method using 'splitter' and 'newCombiner'?
Week 1
- Introduction to Parallel Computing
- Parallelism on the JVM
- Running Computations in Parallel
- Monte Carlo Method to Estimate Pi
- First Class Tasks
- How fast are parallel programs?
- Benchmarking Parallel Programs
Week 2
- Parallel Sorting
- Data Operations
- Parallel map()
- Parallel fold()
- Associativity I
- Associativity II
- Parallel Scan (Prefix Sum) Operation
Week 3
- Data Parallel Programming
- Data Parallel Operations
- Scala Parallel Collections
- Splitters and Combiners
Week 4