Introduction to Parallel Computing
- Definition: a type of computation in which many calculations are performed simultaneously.
- Basic principle: a computation can be divided into smaller subproblems, each of which can be solved simultaneously (a sketch follows this list).
- Assumption: we have parallel hardware at our disposal, capable of executing these computations in parallel.
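To make the basic principle concrete, here is a minimal sketch in plain Scala (no parallelism primitives yet; the threshold of 1000 is an arbitrary illustration): a segment sum split into two independent subproblems that parallel hardware could evaluate simultaneously.

```scala
// Splitting a computation into independent subproblems: the two
// recursive calls share no state, so parallel hardware could run them
// simultaneously. This sketch still evaluates them sequentially.
def sumSegment(xs: Array[Int], from: Int, until: Int): Int =
  if (until - from < 1000) {
    // base case: small enough to solve sequentially
    var i = from; var sum = 0
    while (i < until) { sum += xs(i); i += 1 }
    sum
  } else {
    val mid = from + (until - from) / 2
    val left  = sumSegment(xs, from, mid)   // independent subproblem
    val right = sumSegment(xs, mid, until)  // independent subproblem
    left + right
  }
```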
Parallel programming is much harder than sequential programming.
- Separating sequential computations into parallel subcomputations can be challenging, or even impossible.
- Ensuring program correctness is more difficult, because new types of errors arise, such as race conditions and deadlocks. Speedup is the only reason we bother paying for this complexity.
Parallelism and concurrency are closely related concepts.
- Parallel program – uses parallel hardware to execute a computation more quickly; speedup and efficiency are its main concerns. Examples: computationally intensive problems such as matrix multiplication, computer graphics, and big data processing.
- Concurrent program – may or may not execute multiple computations at the same time; modularity, responsiveness, and maintainability are its main concerns. Examples: asynchronous applications such as web servers, user interfaces, and databases. (A sketch contrasting the two follows.)
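Here is a minimal sketch of the contrast, assuming the scala-parallel-collections module for `.par` (bundled into the standard library in Scala 2.12 and earlier) and the global ExecutionContext for the Future; the URL is only a placeholder.

```scala
import scala.collection.parallel.CollectionConverters._ // .par on Scala 2.13+
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

// Parallel: the concern is speedup. The parallel collection spreads
// the map over the available cores.
val sumOfSquares = (1 to 1000000).par.map(x => x.toLong * x).sum

// Concurrent: the concern is responsiveness. The Future moves slow I/O
// off the calling thread, whether or not it runs on a separate core.
val page: Future[String] =
  Future(scala.io.Source.fromURL("https://example.com").mkString)
```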
Parallelism manifests itself at different granularity levels.
- bit-level parallelism – processing multiple bits of data in parallel (e.g., on a 64-bit architecture, 64 bits are processed simultaneously).
- instruction-level parallelism – executing different instructions from the same instruction stream in parallel. In the example below, the instructions computing b and c are independent and can be executed in parallel, while d must wait for both:

```scala
val b = a1 + a2
val c = a3 + a4
val d = b + c
```
- task-level parallelism – executing separate streams of instructions in parallel.
In this course, we focus on task-level parallelism.
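As a minimal sketch of task-level parallelism, the example below uses plain JVM threads (the `java.lang.Thread` API; Week 1's "Parallelism on the JVM" covers this, and later lectures introduce higher-level constructs): two separate instruction streams summing disjoint ranges in parallel.

```scala
// Two separate instruction streams (tasks) running in parallel on a
// multi-core machine, using plain JVM threads.
val results = new Array[Int](2)

val t1 = new Thread {
  override def run(): Unit = { results(0) = (1 to 500).sum }
}
val t2 = new Thread {
  override def run(): Unit = { results(1) = (501 to 1000).sum }
}

t1.start(); t2.start()  // both tasks begin executing in parallel
t1.join(); t2.join()    // wait until both tasks complete
println(results(0) + results(1))  // 500500; join() makes the writes visible
```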
- Multi-core processors: contain more than one processing core on the same chip.
- Symmetric multiprocessors (SMPs): contain multiple identical processors that share memory and other resources.
- General-purpose graphics processing units (GPGPUs): contain many cores that execute computations only when explicitly requested by the host processor.
- Field-programmable gate arrays (FPGAs): hardware that can rewire itself for a particular task.
- Computer clusters: groups of standalone computers connected over a network, without shared memory.
Our focus will be on programming for multi-core processors and SMPs.
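As a quick check of the hardware assumption above, the standard `java.lang.Runtime` API reports how many hardware threads the JVM sees on a given machine:

```scala
// Number of hardware threads available to the JVM on this machine
// (typically the core count, or twice that with hyper-threading).
val parallelismLevel = Runtime.getRuntime.availableProcessors()
println(s"Hardware threads available: $parallelismLevel")
```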
- week 1 – basics of parallel computing and parallel program analysis
- week 2 – task-parallelism, basic parallel algorithms
- week 3 – data-parallelism, Scala parallel collections
- week 4 – data structures for parallel computing
Week 1
- Introduction to Parallel Computing
- Parallelism on the JVM
- Running Computations in Parallel
- Monte Carlo Method to Estimate Pi
- First Class Tasks
- How fast are parallel programs?
- Benchmarking Parallel Programs
Week 2
- Parallel Sorting
- Data Operations
- Parallel map()
- Parallel fold()
- Associativity I
- Associativity II
- Parallel Scan (Prefix Sum) Operation
Week 3
- Data Parallel Programming
- Data Parallel Operations
- Scala Parallel Collections
- Splitters and Combiners
Week 4