This course reviews high-performance computing systems from both hardware and software perspectives. It covers the importance of parallel processing, parallel system architectures, parallel algorithm design, parallel programming, and performance evaluation methodologies. Class materials are uploaded to Google Classroom, and students do the programming exercises on their own PCs.
Class (hpc_01)
Notes about the first class: README.md
- Class Schedule
- Introduction to HPC
- Supercomputers and Parallel Computing
- The Modern Scientific Method
- Target Applications
- Floating-Point Operations (FLOPS)
- Parallel Computing
- Parallel Architectures
- Power Calculation Example
- System Overview
- Software Overview
- Benefits of Parallel Computing
- Importance of Application Parallelism
- Seeking Concurrency
- Parallelism in Graphs
- Parallelization Ratio and Amdahl’s Law
- Parallel Programming
- Accelerators and Heterogeneous Computing
- System Performance Metrics
- Heterogeneous Computing
- Latency-Oriented Features
- Cache Memory
- Branch Prediction
- Strategies for High Throughput
- Conclusion
Class (hpc_02)
Notes about the second class: README.md
- Today's Topics: Parallel Computers
- Dependencies in Parallel Computing
- Parallel Loops
- Loop Dependence
- Compiler Optimization
- SIMD Overview
- Vector Processors
- Hardware Trends for Vectorization
- Code Optimization for NEC SX-AT
- Shared-memory Computer
- UMA Architecture
- Shared Data
- What is Cache Memory?
- The Principle of Locality
- Message Collision
- NUMA Architecture
- Switch Networks
- Cache Coherence in NUMA
- Distributed-Memory Computers
- Hybrid Systems
- Performance Metrics
- Network Topology
- Summary
Class (hpc_03)
Notes about the third class: README.md
- SIMD Overview
- Sharing a Memory Space
- Basic Network Topology
- Basic Network Topology (continued)
- Parallel Algorithm Design
- Today's Topics
- How to Run a Program on HPC?
- Why Job Submission is Needed
- Job Script File
- Job Submission
- Job Scheduling
- Workflows
- Technical Challenges
- Various Workloads
- Task/Channel Model (Ian Foster, 1995)
- Synchronous vs Asynchronous Communication
- Execution Time
- Foster’s Design Methodology
- Finding the Sum
- Parallel Reduction Design
- n-Body Problem
- Other Communication Patterns
- Summary
Class (hpc_04)
- Parallel Computers and Systems
- Software and Resources
- Parallel Algorithm Design
- Running Programs on HPC
- Introduction to MPI Programming
- MPI Programming Basics
- MPI Functions and Communication Patterns
- Collective Communication
- Deadlocks and Synchronization
- Performance Measurement
- Exercise - Monte Carlo π Calculation
- Summary of MPI Programming Concepts
- Sample Codes
Class (hpc_05)
- Introduction to High-Performance Computing (HPC)
- Parallel Programming Fundamentals
- Core Concepts in Parallel Processing with OpenMP
- Data Management in OpenMP
- Task Parallelism
- Heterogeneous Computing and Offloading
- Performance and Scalability Considerations
- Optimization Techniques