Commit

🥭 w/ citation cff

wolfram77 committed Jun 24, 2022
1 parent e49f396 commit 90511ec
Showing 3 changed files with 19 additions and 21 deletions.
10 changes: 10 additions & 0 deletions CITATION.cff
@@ -0,0 +1,10 @@
+cff-version: 1.2.0
+message: "If you use this software, please cite it as below."
+authors:
+- family-names: Sahu
+  given-names: Subhajit
+  orcid: https://orcid.org/0000-0001-5140-6578
+title: "puzzlef/pagerank-sequential-vs-openmp: Performance of sequential execution based vs OpenMP based PageRank"
+version: 1.0.0
+doi: 10.5281/zenodo.6717302
+date-released: 2022-06-24
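For readers who prefer BibTeX, the CFF metadata above might translate roughly as follows. This is a sketch: the entry key and the `@software` entry type are assumptions, while the field values are taken directly from the CFF file.

```bibtex
@software{sahu_2022_6717302,
  author  = {Sahu, Subhajit},
  title   = {puzzlef/pagerank-sequential-vs-openmp: Performance of
             sequential execution based vs OpenMP based PageRank},
  version = {1.0.0},
  doi     = {10.5281/zenodo.6717302},
  year    = {2022},
  month   = jun
}
```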
28 changes: 8 additions & 20 deletions README.md
@@ -1,28 +1,14 @@
-Performance of **sequential** execution based vs **OpenMP** based PageRank
-([pull], [CSR]).
+Performance of **sequential** execution based vs **OpenMP** based PageRank ([pull], [CSR]).
 
 This experiment was for comparing the performance between:
 1. Find pagerank using a single thread (**sequential**).
 2. Find pagerank accelerated using **OpenMP**.
 
-Both techniques were attempted on different types of graphs, running each
-technique 5 times per graph to get a good time measure. Number of threads
-for this experiment (using `OMP_NUM_THREADS`) was varied from `2` to `48`.
-**OpenMP** does seem to provide a **clear benefit** for most graphs (except
-for the smallest ones). This speedup is definitely not directly proportional
-to the number of threads, as one would normally expect (Amdahl's law).
+Both techniques were attempted on different types of graphs, running each technique 5 times per graph to get a good time measure. Number of threads for this experiment (using `OMP_NUM_THREADS`) was varied from `2` to `48`. **OpenMP** does seem to provide a **clear benefit** for most graphs (except for the smallest ones). This speedup is definitely not directly proportional to the number of threads, as one would normally expect (Amdahl's law).
 
-Note that there is still room for improvement with **OpenMP** by using
-sequential versions of certain routines instead of OpenMP versions because
-not all calculations benefit from multiple threads (ex.
-["multiply-sequential-vs-openmp"]). Also note that neither approach makes
-use of *SIMD instructions* which are available on all modern hardware.
+Note that there is still room for improvement with **OpenMP** by using sequential versions of certain routines instead of OpenMP versions because not all calculations benefit from multiple threads (ex. ["multiply-sequential-vs-openmp"]). Also note that neither approach makes use of *SIMD instructions* which are available on all modern hardware.
 
-All outputs are saved in [out](out/) and a small part of the output is listed
-here. Some [charts] are also included below, generated from [sheets]. The input
-data used for this experiment is available at ["graphs"] (for small ones), and
-the [SuiteSparse Matrix Collection]. This experiment was done with guidance
-from [Prof. Dip Sankar Banerjee] and [Prof. Kishore Kothapalli].
+All outputs are saved in [out](out/) and a small part of the output is listed here. Some [charts] are also included below, generated from [sheets]. The input data used for this experiment is available at ["graphs"] (for small ones), and the [SuiteSparse Matrix Collection]. This experiment was done with guidance from [Prof. Dip Sankar Banerjee] and [Prof. Kishore Kothapalli].
 
 <br>

@@ -95,6 +81,8 @@ $ ...
 <br>
 
 [![](https://i.imgur.com/5vdxPZ3.jpg)](https://www.youtube.com/watch?v=rKv_l1RnSqs)
+[![DOI](https://zenodo.org/badge/366356464.svg)](https://zenodo.org/badge/latestdoi/366356464)
+
 
 [Prof. Dip Sankar Banerjee]: https://sites.google.com/site/dipsankarban/
 [Prof. Kishore Kothapalli]: https://cstar.iiit.ac.in/~kkishore/
2 changes: 1 addition & 1 deletion system.txt
@@ -1,5 +1,5 @@
 Dell PowerEdge R740 Rack Mount Chassis
-Proc: Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz (48 cores x 2)
+Proc: Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz (12 cores x 2)
 Cache: L1d+i: 768KB, L2: 12MB, L3: 16MB (shared), NUMA: 2
 Mem: 128GB DIMM DDR4 Synchronous Registered (Buffered) 2666 MHz (8x16GB)
 Disk: MegaRAID SAS-3 3108 [Invader]; 10TB PERC H730P Adp
