Commit 698bdf7

PGI compiler paired with CUDA module when making / running (submission scripts updated accordingly).
1 parent 3eea415 commit 698bdf7

9 files changed: 101 additions & 7 deletions

makefile

Lines changed: 3 additions & 3 deletions
@@ -26,12 +26,12 @@ BIG_DEFINES_HYBRID_FORTRAN=-DROWS=$(BIG_GLOBAL) -DCOLUMNS_GLOBAL=$(BIG_GLOBAL) -
 CC=pgcc
 MPICC=mpicc
 CFLAGS=-c99 -fastsse -lm
-PGICFLAGS=-c99 -fastsse -acc
+PGICFLAGS=-c99 -fastsse -acc -ta=tesla,cuda9.2

 FORTRANC=pgf90
 MPIF90=mpif90
 FORTRANFLAGS=-fastsse
-PGIFORTRANFLAGS=-fastsse -acc
+PGIFORTRANFLAGS=-fastsse -acc -ta=tesla,cuda9.2

 default: quick_compile

@@ -207,7 +207,7 @@ verify_modules:
 echo -e "\n . "; \
 echo -e " / \\"; \
 echo -e " / ! \\ It looks like the PGI compiler is not loaded."; \
-echo -e " /_____\\ On Bridges please issue 'module load mpi/pgi_openmpi/19.4'. You can now make again :)\n"; \
+echo -e " /_____\\ On Bridges please issue 'module load cuda/9.2 mpi/pgi_openmpi/19.4-nongpu'. You can now make again :)\n"; \
 exit -1; \
 fi
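
The updated message asks for the CUDA module to be loaded together with the PGI one before building. A minimal sketch of a pre-build check in that spirit follows. It is not the makefile's actual verify_modules recipe (only its error branch is visible in the hunk above): the function name check_toolchain is hypothetical, and it assumes that loading the PGI module puts pgcc on PATH and that loading cuda/9.2 puts nvcc on PATH.

```shell
# Hypothetical helper, not part of the repository.
# Assumption: the PGI module provides pgcc and the cuda/9.2 module provides nvcc.
# Returns 0 if every tool named on the command line is found on PATH, 1 otherwise.
check_toolchain() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "Missing tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use before calling make:
if check_toolchain pgcc nvcc; then
    echo "PGI compiler and CUDA toolkit found."
else
    echo "Try: module load cuda/9.2 mpi/pgi_openmpi/19.4-nongpu" >&2
fi
```

Checking for the binaries rather than parsing the output of `module list` keeps the sketch independent of the module system's output format.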

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+Running on 8 MPI processes
+
+ITERATION NUMBER | [14554,14554] | [14555,14555] | [14556,14556] | [14557,14557] | [14558,14558] | [14559,14559]
+-----------------+---------------+---------------+---------------+---------------+---------------+--------------
+ITERATION 100 | 63.71 | 73.05 | 81.75 | 89.26 | 95.06 | 98.74
+ITERATION 200 | 79.65 | 85.35 | 90.34 | 94.44 | 97.48 | 99.36
+ITERATION 300 | 85.87 | 89.94 | 93.43 | 96.24 | 98.30 | 99.57
+ITERATION 400 | 89.17 | 92.34 | 95.01 | 97.16 | 98.72 | 99.68
+ITERATION 500 | 91.22 | 93.81 | 95.98 | 97.71 | 98.97 | 99.74
+ITERATION 600 | 92.61 | 94.80 | 96.63 | 98.08 | 99.14 | 99.78
+ITERATION 700 | 93.62 | 95.52 | 97.10 | 98.35 | 99.26 | 99.81
+ITERATION 800 | 94.39 | 96.06 | 97.45 | 98.55 | 99.35 | 99.83
+ITERATION 900 | 94.99 | 96.48 | 97.73 | 98.71 | 99.42 | 99.85
+ITERATION 1000 | 95.47 | 96.82 | 97.95 | 98.83 | 99.47 | 99.87
+ITERATION 1100 | 95.87 | 97.10 | 98.13 | 98.93 | 99.52 | 99.88
+ITERATION 1200 | 96.20 | 97.33 | 98.28 | 99.02 | 99.56 | 99.89
+ITERATION 1300 | 96.48 | 97.53 | 98.41 | 99.09 | 99.59 | 99.90
+ITERATION 1400 | 96.72 | 97.70 | 98.52 | 99.15 | 99.62 | 99.90
+ITERATION 1500 | 96.93 | 97.85 | 98.61 | 99.21 | 99.64 | 99.91
+ITERATION 1600 | 97.12 | 97.98 | 98.69 | 99.26 | 99.66 | 99.91
+ITERATION 1700 | 97.28 | 98.09 | 98.77 | 99.30 | 99.68 | 99.92
+ITERATION 1800 | 97.43 | 98.20 | 98.83 | 99.33 | 99.70 | 99.92
+ITERATION 1900 | 97.56 | 98.29 | 98.89 | 99.37 | 99.71 | 99.93
+ITERATION 2000 | 97.67 | 98.37 | 98.94 | 99.40 | 99.73 | 99.93
+ITERATION 2100 | 97.78 | 98.44 | 98.99 | 99.42 | 99.74 | 99.93
+ITERATION 2200 | 97.88 | 98.51 | 99.04 | 99.45 | 99.75 | 99.94
+ITERATION 2300 | 97.96 | 98.57 | 99.08 | 99.47 | 99.76 | 99.94
+ITERATION 2400 | 98.05 | 98.63 | 99.11 | 99.49 | 99.77 | 99.94
+ITERATION 2500 | 98.12 | 98.68 | 99.15 | 99.51 | 99.78 | 99.94
+ITERATION 2600 | 98.19 | 98.73 | 99.18 | 99.53 | 99.78 | 99.94
+ITERATION 2700 | 98.25 | 98.77 | 99.21 | 99.54 | 99.79 | 99.95
+ITERATION 2800 | 98.31 | 98.82 | 99.23 | 99.56 | 99.80 | 99.95
+ITERATION 2900 | 98.37 | 98.85 | 99.26 | 99.57 | 99.81 | 99.95
+ITERATION 3000 | 98.42 | 98.89 | 99.28 | 99.59 | 99.81 | 99.95
+ITERATION 3100 | 98.47 | 98.92 | 99.30 | 99.60 | 99.82 | 99.95
+ITERATION 3200 | 98.51 | 98.96 | 99.32 | 99.61 | 99.82 | 99.95
+ITERATION 3300 | 98.56 | 98.99 | 99.34 | 99.62 | 99.83 | 99.95
+ITERATION 3400 | 98.60 | 99.01 | 99.36 | 99.63 | 99.83 | 99.96
+ITERATION 3500 | 98.63 | 99.04 | 99.38 | 99.64 | 99.84 | 99.96
+
+Language used: C.
+Version run: hybrid_gpu_big.
+The maximum temperature change was reached at iteration 3586 was 0.009996974331912156
+Total time was 904.5 seconds.
+Value of halo swap verification cell [12739][14559] is 85.851525286739871490

reference_outputs/C/openacc_big.txt

Lines changed: 1 addition & 1 deletion
@@ -39,4 +39,4 @@ ITERATION 3500 | 98.63 | 99.04 | 99.38 | 99.64
 Language used: C.
 Version run: openacc_big.
 The maximum temperature change was reached at iteration 3586 was 0.009996974331912156
-Total time was 95.1 seconds.
+Total time was 95.0 seconds.

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+Running on 8 MPI processes
+
+ITERATION NUMBER | [14554,14554] | [14555,14555] | [14556,14556] | [14557,14557] | [14558,14558] | [14559,14559]
+-----------------+---------------+---------------+---------------+---------------+---------------+--------------
+ITERATION 100 | 63.71 | 73.05 | 81.75 | 89.26 | 95.06 | 98.74
+ITERATION 200 | 79.65 | 85.35 | 90.34 | 94.44 | 97.48 | 99.36
+ITERATION 300 | 85.87 | 89.94 | 93.43 | 96.24 | 98.30 | 99.57
+ITERATION 400 | 89.17 | 92.34 | 95.01 | 97.16 | 98.72 | 99.68
+ITERATION 500 | 91.22 | 93.81 | 95.98 | 97.71 | 98.97 | 99.74
+ITERATION 600 | 92.61 | 94.80 | 96.63 | 98.08 | 99.14 | 99.78
+ITERATION 700 | 93.62 | 95.52 | 97.10 | 98.35 | 99.26 | 99.81
+ITERATION 800 | 94.39 | 96.06 | 97.45 | 98.55 | 99.35 | 99.83
+ITERATION 900 | 94.99 | 96.48 | 97.73 | 98.71 | 99.42 | 99.85
+ITERATION 1000 | 95.47 | 96.82 | 97.95 | 98.83 | 99.47 | 99.87
+ITERATION 1100 | 95.87 | 97.10 | 98.13 | 98.93 | 99.52 | 99.88
+ITERATION 1200 | 96.20 | 97.33 | 98.28 | 99.02 | 99.56 | 99.89
+ITERATION 1300 | 96.48 | 97.53 | 98.41 | 99.09 | 99.59 | 99.90
+ITERATION 1400 | 96.72 | 97.70 | 98.52 | 99.15 | 99.62 | 99.90
+ITERATION 1500 | 96.93 | 97.85 | 98.61 | 99.21 | 99.64 | 99.91
+ITERATION 1600 | 97.12 | 97.98 | 98.69 | 99.26 | 99.66 | 99.91
+ITERATION 1700 | 97.28 | 98.09 | 98.77 | 99.30 | 99.68 | 99.92
+ITERATION 1800 | 97.43 | 98.20 | 98.83 | 99.33 | 99.70 | 99.92
+ITERATION 1900 | 97.56 | 98.29 | 98.89 | 99.37 | 99.71 | 99.93
+ITERATION 2000 | 97.67 | 98.37 | 98.94 | 99.40 | 99.73 | 99.93
+ITERATION 2100 | 97.78 | 98.44 | 98.99 | 99.42 | 99.74 | 99.93
+ITERATION 2200 | 97.88 | 98.51 | 99.04 | 99.45 | 99.75 | 99.94
+ITERATION 2300 | 97.96 | 98.57 | 99.08 | 99.47 | 99.76 | 99.94
+ITERATION 2400 | 98.05 | 98.63 | 99.11 | 99.49 | 99.77 | 99.94
+ITERATION 2500 | 98.12 | 98.68 | 99.15 | 99.51 | 99.78 | 99.94
+ITERATION 2600 | 98.19 | 98.73 | 99.18 | 99.53 | 99.78 | 99.94
+ITERATION 2700 | 98.25 | 98.77 | 99.21 | 99.54 | 99.79 | 99.95
+ITERATION 2800 | 98.31 | 98.82 | 99.23 | 99.56 | 99.80 | 99.95
+ITERATION 2900 | 98.37 | 98.85 | 99.26 | 99.57 | 99.81 | 99.95
+ITERATION 3000 | 98.42 | 98.89 | 99.28 | 99.59 | 99.81 | 99.95
+ITERATION 3100 | 98.47 | 98.92 | 99.30 | 99.60 | 99.82 | 99.95
+ITERATION 3200 | 98.51 | 98.96 | 99.32 | 99.61 | 99.82 | 99.95
+ITERATION 3300 | 98.56 | 98.99 | 99.34 | 99.62 | 99.83 | 99.95
+ITERATION 3400 | 98.60 | 99.01 | 99.36 | 99.63 | 99.83 | 99.96
+ITERATION 3500 | 98.63 | 99.04 | 99.38 | 99.64 | 99.84 | 99.96
+
+Language used: FORTRAN.
+Version run: hybrid_gpu_big.
+The maximum temperature change was reached at iteration 3586 was 0.009996974375376055
+Total time was 940.1 seconds.
+Value of halo swap verification cell (14559, 12739) is 85.851525286739885701

reference_outputs/FORTRAN/openacc_big.txt

Lines changed: 1 addition & 1 deletion
@@ -39,4 +39,4 @@ ITERATION 3500 | 98.63 | 99.04 | 99.38 | 99.64
 Language used: FORTRAN.
 Version run: openacc_big.
 The maximum temperature change was reached at iteration 3586 was 0.009996974418854165
-Total time was 85.6 seconds.
+Total time was 96.0 seconds.
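
Both openacc_big.txt hunks change only the "Total time was" line, which varies from run to run. A minimal sketch of comparing a fresh run against a reference file while ignoring that line is given below. The function name same_except_timing and the example output path are hypothetical; the repository may already verify outputs differently.

```shell
# Hypothetical helper, not part of the repository.
# Succeeds (exit status 0) when the two files are identical once every
# line starting with "Total time was " has been dropped from both.
same_except_timing() {
    [ "$(grep -v '^Total time was ' "$1")" = "$(grep -v '^Total time was ' "$2")" ]
}

# Example use (my_run_output.txt is an assumed file name):
# same_except_timing my_run_output.txt reference_outputs/C/openacc_big.txt \
#     && echo "Output matches the reference (timing ignored)."
```

Note that this is an exact text comparison of every remaining line, so a last-digit difference in the reported maximum temperature change would still count as a mismatch.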

slurm_scripts/hybrid_gpu_big.slurm

Lines changed: 2 additions & 1 deletion
@@ -3,8 +3,9 @@
 #SBATCH --nodes=4
 #SBATCH --partition=GPU
 #SBATCH --ntasks-per-node 2
-#SBATCH --time=00:02:00
+#SBATCH --time=00:17:00
 #SBATCH --gres=gpu:p100:2
 #SBATCH -A ac560tp
 set -x
+module load cuda/9.2 mpi/pgi_openmpi/19.4-nongpu;
 ./run.sh ${1} hybrid_gpu big ${2}
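
The new 17-minute limit sits above the hybrid_gpu_big reference runtimes recorded in this commit (904.5 s for C, 940.1 s for FORTRAN, i.e. roughly 15.1 and 15.7 minutes). A minimal sketch of deriving a `--time` value from a measured runtime follows. The function name slurm_time and the 5% safety margin are illustrative assumptions, not something the commit states.

```shell
# Hypothetical helper, not part of the repository.
# Usage: slurm_time <runtime in seconds> <safety margin in percent>
# Prints a Slurm --time value (HH:MM:SS), rounded up to a whole minute.
slurm_time() {
    awk -v s="$1" -v m="$2" 'BEGIN {
        total = s * (1 + m / 100)          # runtime plus margin, in seconds
        mins = int(total / 60)
        if (mins * 60 < total) mins++      # round up to the next full minute
        printf "%02d:%02d:00\n", int(mins / 60), mins % 60
    }'
}

slurm_time 940.1 5   # prints 00:17:00
```

With the slower of the two reference runs and the assumed 5% margin, the result happens to coincide with the limit chosen in the script.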

slurm_scripts/hybrid_gpu_small.slurm

Lines changed: 1 addition & 0 deletions
@@ -7,4 +7,5 @@
 #SBATCH --gres=gpu:p100:2
 #SBATCH -A ac560tp
 set -x
+module load cuda/9.2 mpi/pgi_openmpi/19.4-nongpu;
 ./run.sh ${1} hybrid_gpu small ${2}

slurm_scripts/openacc_big.slurm

Lines changed: 2 additions & 1 deletion
@@ -1,10 +1,11 @@
 #!/bin/bash

 #SBATCH --nodes=1
-#SBATCH --partition=GPU-shared
+#SBATCH --partition=GPU
 #SBATCH --ntasks-per-node 1
 #SBATCH --time=00:02:00
 #SBATCH --gres=gpu:p100:2
 #SBATCH -A ac560tp
 set -x
+module load cuda/9.2;
 ./run.sh ${1} openacc big ${2}

slurm_scripts/openacc_small.slurm

Lines changed: 1 addition & 0 deletions
@@ -7,4 +7,5 @@
 #SBATCH --gres=gpu:p100:1
 #SBATCH -A ac560tp
 set -x
+module load cuda/9.2;
 ./run.sh ${1} openacc small ${2}
