From 90511ec7ae940186f5c2789c31e5750404b21f78 Mon Sep 17 00:00:00 2001
From: Subhajit Sahu
Date: Fri, 24 Jun 2022 09:07:34 +0530
Subject: [PATCH] :mango: w/ citation cff

---
 CITATION.cff | 10 ++++++++++
 README.md    | 28 ++++++++--------------------
 system.txt   |  2 +-
 3 files changed, 19 insertions(+), 21 deletions(-)
 create mode 100644 CITATION.cff

diff --git a/CITATION.cff b/CITATION.cff
new file mode 100644
index 0000000..f193086
--- /dev/null
+++ b/CITATION.cff
@@ -0,0 +1,10 @@
+cff-version: 1.2.0
+message: "If you use this software, please cite it as below."
+authors:
+  - family-names: Sahu
+    given-names: Subhajit
+    orcid: https://orcid.org/0000-0001-5140-6578
+title: "puzzlef/pagerank-sequential-vs-openmp: Performance of sequential execution based vs OpenMP based PageRank"
+version: 1.0.0
+doi: 10.5281/zenodo.6717302
+date-released: 2022-06-24
diff --git a/README.md b/README.md
index 9059871..adf9562 100644
--- a/README.md
+++ b/README.md
@@ -1,28 +1,14 @@
-Performance of **sequential** execution based vs **OpenMP** based PageRank
-([pull], [CSR]).
+Performance of **sequential** execution based vs **OpenMP** based PageRank ([pull], [CSR]).
 
 This experiment was for comparing the performance between:
 1. Find pagerank using a single thread (**sequential**).
 2. Find pagerank accelerated using **OpenMP**.
 
-Both techniques were attempted on different types of graphs, running each
-technique 5 times per graph to get a good time measure. Number of threads
-for this experiment (using `OMP_NUM_THREADS`) was varied from `2` to `48`.
-**OpenMP** does seem to provide a **clear benefit** for most graphs (except
-for the smallest ones). This speedup is definitely not directly proportional
-to the number of threads, as one would normally expect (Amdahl's law).
-
-Note that there is still room for improvement with **OpenMP** by using
-sequential versions of certain routines instead of OpenMP versions because
-not all calculations benefit from multiple threads (ex. 
-["multiply-sequential-vs-openmp"]). Also note that neither approach makes
-use of *SIMD instructions* which are available on all modern hardware.
-
-All outputs are saved in [out](out/) and a small part of the output is listed
-here. Some [charts] are also included below, generated from [sheets]. The input
-data used for this experiment is available at ["graphs"] (for small ones), and
-the [SuiteSparse Matrix Collection]. This experiment was done with guidance
-from [Prof. Dip Sankar Banerjee] and [Prof. Kishore Kothapalli].
+Both techniques were attempted on different types of graphs, running each technique 5 times per graph to get a good time measure. Number of threads for this experiment (using `OMP_NUM_THREADS`) was varied from `2` to `48`. **OpenMP** does seem to provide a **clear benefit** for most graphs (except for the smallest ones). This speedup is definitely not directly proportional to the number of threads, as one would normally expect (Amdahl's law).
+
+Note that there is still room for improvement with **OpenMP** by using sequential versions of certain routines instead of OpenMP versions because not all calculations benefit from multiple threads (ex. ["multiply-sequential-vs-openmp"]). Also note that neither approach makes use of *SIMD instructions* which are available on all modern hardware.
+
+All outputs are saved in [out](out/) and a small part of the output is listed here. Some [charts] are also included below, generated from [sheets]. The input data used for this experiment is available at ["graphs"] (for small ones), and the [SuiteSparse Matrix Collection]. This experiment was done with guidance from [Prof. Dip Sankar Banerjee] and [Prof. Kishore Kothapalli].
@@ -95,6 +81,8 @@ $ ...
 [![](https://i.imgur.com/5vdxPZ3.jpg)](https://www.youtube.com/watch?v=rKv_l1RnSqs)
+[![DOI](https://zenodo.org/badge/366356464.svg)](https://zenodo.org/badge/latestdoi/366356464)
+
 [Prof. Dip Sankar Banerjee]: https://sites.google.com/site/dipsankarban/
 [Prof. Kishore Kothapalli]: https://cstar.iiit.ac.in/~kkishore/
diff --git a/system.txt b/system.txt
index 6235256..0b86fe8 100644
--- a/system.txt
+++ b/system.txt
@@ -1,5 +1,5 @@
 Dell PowerEdge R740 Rack Mount Chassis
-Proc: Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz (48 cores x 2)
+Proc: Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz (12 cores x 2)
 Cache: L1d+i: 768KB, L2: 12MB, L3: 16MB (shared), NUMA: 2
 Mem: 128GB DIMM DDR4 Synchronous Registered (Buffered) 2666 MHz (8x16GB)
 Disk: MegaRAID SAS-3 3108 [Invader]; 10TB PERC H730P Adp