-
Notifications
You must be signed in to change notification settings - Fork 9
Conc Trees Data Structure
In this lecture, we will study the conc data type, which is a parallel counterpart to functional cons list ([https://github.com/rohitvg/scala-principles-1/wiki/Polymorphism-(-Subtyping-and-Generics-)#basics](We have seen this before)) and is used to manipulate data. This will reveal a data structure with an efficient concatenation method.
Let’s recall the list data type in functional programming.
sealed trait List[+T] {
def head: T
def tail: List[T]
}
case class ::[T](head: T, tail: List[T]) extends List[T]
case object Nil extends List[Nothing] {
def head = sys.error(”empty list”)
def tail = sys.error(”empty list”)
}
How do we implement a filter method on lists?
def filter[T](list: List[T])(p: T => Boolean): List[T] = list match {
case x :: xs if p(x) => x :: filter(xs)(p)
case x :: xs => filter(xs)(p)
case Nil => Nil
}
Lists are built for sequential computations due to their recursive nature – they are traversed from left to right.
Trees allow parallel computations – their subtrees can be traversed in parallel.
sealed trait Tree[+T]
case class Node[T](left: Tree[T], right: Tree[T]) extends Tree[T]
case class Leaf[T](elem: T) extends Tree[T]
case object Empty extends Tree[Nothing]
Node of T will represent internal nodes, which contain only two references to the left and the right subtree, but no concrete elements. instead elements will be contained in the leaf data type. The elements themselves are contained in the leaf, and the leaves are bound together with the node objects. (Eg. Image over here)
How do we implement a filter method on trees?
def filter[T](t: Tree[T])(p: T => Boolean): Tree[T] = t match {
case Node(left, right) => Node(parallel(filter(left)(p), filter(right)(p)))
case Leaf(elem) => if (p(elem)) t else Empty
case Empty => Empty
}
Suppose we filter for odd elements using the predicate filter(_%2 = 1)
, apply it on tree in figure 1, then we get figure 2. i.e. tree with empty elements. We can reduce that to figure 3 by removing empty nodes. But unfortunately this tree is not suitable for parallel computations, as it is more like a list than a tree.
Thus, Trees are not good for parallelism unless they are balanced.
Thus, Trees are not good for parallelism unless they are balanced.
Let’s devise a data type called Conc, which represents balanced trees:
sealed trait Conc[+T] {
def level: Int // height of the tree
def size: Int // no. of elements
def left: Conc[T]
def right: Conc[T]
}
In parallel programming, this data type is known as the conc-list (introduced in the Fortress language).
Here we see one of the implementation of this list.
Concrete implementations of the Conc
data type:
case object Empty extends Conc[Nothing] {
def level = 0
def size = 0
}
class Single[T](val x: T) extends Conc[T] {
def level = 0
def size = 1
}
case class <>[T](left: Conc[T], right: Conc[T]) extends Conc[T] {
val level = 1 + math.max(left.level, right.level)
val size = left.size + right.size
}
In addition, we will define the following invariants for Conc-trees so that they are balanced:
- A
<>
node can never containEmpty
as its subtree. - The level difference between the left and the right subtree of a
<>
node is always 1 or less.
We will rely on these invariants to implement concatenation:
def <>(that: Conc[T]): Conc[T] = { // conc constructor
if (this == Empty) that
else if (that == Empty) this
else concat(this, that) // re organizes tree so that the invariants are maintained
}
Concatenation needs to consider several cases.
First, the two trees could have height difference 1 or less:
def concat[T](xs: Conc[T], ys: Conc[T]): Conc[T] = {
val diff = ys.level - xs.level
if (diff >= -1 && diff <= 1) new <>(xs, ys)
else if (diff < -1) {
//...
Otherwise, let’s assume that the left tree is higher than the right one.
Case 1: The left tree is left-leaning.
Recursively concatenate the right subtree.
if (xs.left.level >= xs.right.level) {
val nr = concat(xs.right, ys)
new <>(xs.left, nr)
} else {
// ... contd
** Case 2**: The left tree is right-leaning.
// ... contd from above
} else {
val nrr = concat(xs.right.right, ys)
if (nrr.level == xs.level - 3) {
val nl = xs.left
val nr = new <>(xs.right.left, nrr)
new <>(nl, nr)
} else {
val nl = new <>(xs.left, xs.right.left)
val nr = nrr
new <>(nl, nr)
}
}
Question: What is the complexity of <>
method?
- O(log n)
- O(h1 − h2)
- O(n)
- O(1)
Answer: Concatenation takes O(h1 − h2)
time, where h1 and h2 are the heights of the two trees.
In the next lecture, we will see how the core of this data structure can be used to implement more efficient combiners.
Week 1
- Introduction to Parallel Computing
- Parallelism on the JVM
- Running Computations in Parallel
- Monte Carlo Method to Estimate Pi
- First Class Tasks
- How fast are parallel programs?
- Benchmarking Parallel Programs
Week 2
- Parallel Sorting
- Data Operations
- Parallel map()
- Parallel fold()
- Associativity I
- Associativity II
- Parallel Scan (Prefix Sum) Operation
Week 3
- Data Parallel Programming
- Data Parallel Operations
- Scala Parallel Collections
- Splitters and Combiners
Week 4