Skip to content

☕Implement of Parallel Matrix Multiplication Methods Using FOX Algorithm on Peking University's High-performance Computing System

License

Notifications You must be signed in to change notification settings

yangyang14641/Parallel-Matrix-Multiplication-FOX-Algorithm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

37 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Parallel-Matrix-Multiplication-FOX-Algorithm

☕Implement of Parallel Matrix Multiplication Methods Using FOX Algorithm on Peking University's High-performance Computing System

Brief Introduction to Parallel Matrix Multiplication FOX Algorithm

Basic Concepts

  • 规约计算 (Reduction)
  • 拥有者计算原则 (Owner Computing Rule)
  • 流水并行(Pipeline Parallelism):
    • 在一个进程上,矩阵计算被划分为P个阶段 (P Supercomputing Steps in a Process)
  • 数据并行 (Data Parallelism):
    • 在每个进程上同时计算局部的矩阵乘积 (Local Matrix Multiplications are computing on every processess at the same Computing Step)

Serial Matrix Multiplication:

  • Mathematical Modeling of Matrix Multiplication

    • C_{ij}=\sum_{k=0}^{K-1} A_{ik}B_{kj}; \quad (i=0,N-1), \quad (j=0,M-1)
  • Time Complexity

    • O\left ( N^{3} \right )
  • Storage Complexity

    • O\left ( N^{3} \right )
  • Example Implementation in C Language

for (i = 0; i < n; i++)                                      
        for (j = 0; j < n; j++)              
            for (k = 0; k < n; k++)
                C(i,j) = C(i,j) + A(i,k)*B(k,j);

Parallel Computing Modeling Design

  1. Basic Flow
  • Matrix \mathbf{A}'s Dimension is M \times K, and Matirx \mathbf{B}'s Dimension is a K \times N.
  • Compute Matrix \mathbf{C} = \mathbf{A}\mathbf{B} in parallel.
  • Let p=num(processors) is the number of processors, and q=\sqrt{p} be an integer such that it devides M and N.
  • Create a Cartesian topology with process mesh P_{ij}, and i=0..q-1, j=0..q-1.
  • Denote \hat{M} = \frac{M}{q}, \hat{K}=\frac{K}{q}, \hat{N}=\frac{N}{q}.
  • Distribute \mathbf{A} and \mathbf{B} by blocks on p processess such that A_{ij} is \hat{M} \times \hat{K} block and B_{ij} is \hat{K} \times \hat{N} block, stored on process P_{ij}.
  1. Details
  • Partitions of Matrices A, B and C. (Index syntax in Mathematical form: start from 1)

    • Matrix A

    • Matirx B

    • Matrix C

  • Data Distribution on the 2-D Cartesian Topology Processes Mesh (Index syntax in Mathematical form: start from 1)

About

☕Implement of Parallel Matrix Multiplication Methods Using FOX Algorithm on Peking University's High-performance Computing System

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •