Validating Large Multigrid Changes
Evan Weinberg edited this page Dec 1, 2020 · 15 revisions
This page is a WIP
This page is meant to document sequences of tests that should be used to validate large changes to multigrid routines. These are arguably overkill for smaller changes.
- Build with `-DQUDA_PRECISION=14` or `15` (quarter isn't fully functional for now).
- This requires two separate builds: `-DQUDA_FLOAT8=ON` and `OFF`. For `ON` you only need to test `half` for the MG build/solve.
- This test requires compiling with Wilson, clover, twisted mass, and twisted clover.
- This test requires building with MPI or QMP to test `--partition 15`.
- This test uses a reference 16^4, beta = 7.0, quenched configuration. Reach out to Evan (@weinbe2) for a copy.
- This test only greps the iteration counts on a solve as a proxy for stability; if a line doesn't give any output that means there was an error.
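The `-DQUDA_PRECISION` and `--partition` values in the list above are both small bitmasks. The sketch below decodes them, assuming the bit order quarter=1, half=2, single=4, double=8 for the precision mask and x=1, y=2, z=4, t=8 for the partition mask. Neither ordering is stated on this page, so check them against the QUDA CMake option description and the test executable's `--help` output. `decode_mask` is a hypothetical helper, not part of QUDA.

```bash
# Decode a small bitmask into the names of its set bits.
# Usage: decode_mask MASK NAME_BIT0 NAME_BIT1 NAME_BIT2 NAME_BIT3
decode_mask() {
  local mask=$1
  shift
  local bit=0 out="" name
  for name in "$@"; do
    # Keep this name if the corresponding bit is set in the mask.
    if (( (mask >> bit) & 1 )); then
      out="${out:+$out }$name"
    fi
    bit=$((bit + 1))
  done
  echo "$out"
}

# Assumed bit order (not stated on this page): quarter=1, half=2, single=4, double=8.
decode_mask 14 quarter half single double   # half single double
decode_mask 15 quarter half single double   # quarter half single double

# Assumed bit order (not stated on this page): x=1, y=2, z=4, t=8.
decode_mask 15 x y z t                      # x y z t
```

Under these assumptions, `14` builds everything except quarter precision, and `--partition 15` partitions all four dimensions even on a `1 1 1 1` grid.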
The biggest bash for loop ever:
FLOAT=FLOAT4 # just cosmetic
for r in 1 2 # one tuning run, one post-tuning run
do
for PARTITION in " 0" 15
do
for MG_MMA in " true" false
do
for PREC in " half" single # remove single for FLOAT8=ON
do
for DSLASH in " wilson" " clover" " twisted-mass" twisted-clover
do
for OP_0 in " direct" direct-pc
do
for OP_1 in " direct" direct-pc
do
for COARSE_NC_1 in 24 32
do
for COARSE_NC_2 in 24 32
do
if [ "$COARSE_NC_2" -ge "$COARSE_NC_1" ]
then
echo -n "$FLOAT $PARTITION $MG_MMA $PREC $DSLASH $OP_0 $OP_1 $COARSE_NC_1 $COARSE_NC_2 "
./invert_test \
--prec double --prec-sloppy single --prec-null $PREC --prec-precondition $PREC \
--recon 12 --recon-sloppy 8 --recon-precondition 8 \
--dim 16 16 16 16 --gridsize 1 1 1 1 --load-gauge l16t16b7p0 --partition $PARTITION \
--kappa 0.1394265 --mu 0.00072 --clover-coeff 0.001 \
--dslash-type $DSLASH --compute-clover true --tol 1e-10 \
--verbosity verbose --solve-type $OP_0 --solution-type mat --inv-type gcr \
--inv-multigrid true --mg-levels 3 \
--mg-block-size 0 4 4 4 4 --mg-nvec 0 $COARSE_NC_1 \
--mg-block-size 1 2 2 2 2 --mg-nvec 1 $COARSE_NC_2 \
--mg-setup-tol 0 5e-7 --mg-setup-tol 1 5e-7 --mg-setup-inv 0 cgnr --mg-setup-inv 1 cgnr \
--mg-mu-factor 2 70.0 \
--nsrc 1 --niter 250 \
--mg-use-mma $MG_MMA \
--mg-smoother 0 ca-gcr --mg-smoother-solve-type 0 $OP_0 --mg-nu-pre 0 0 --mg-nu-post 0 4 \
--mg-smoother 1 ca-gcr --mg-smoother-solve-type 1 $OP_1 --mg-nu-pre 1 0 --mg-nu-post 1 4 \
--mg-coarse-solver 1 gcr --mg-coarse-solve-type 1 $OP_1 --mg-coarse-solver-tol 1 0.35 --mg-coarse-solver-maxiter 1 10 \
--mg-coarse-solver 2 gcr --mg-coarse-solve-type 2 direct-pc --mg-coarse-solver-tol 2 0.01 --mg-coarse-solver-maxiter 2 20 \
--mg-verbosity 0 verbose --mg-verbosity 1 verbose --mg-verbosity 2 verbose | grep "Done:"
fi # nc2 >= nc1
done # COARSE_NC_2
done # COARSE_NC_1
done # OP_1: direct, direct-pc
done # OP_0: direct, direct-pc
done # dslash type
done # precision
done # mma on/off
done # partition 0, 15
done # tuning and post-tune run
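The loop above prints each label with `echo -n` and relies on `grep` to finish the line, so a failed run leaves the label without a newline and the next label runs on after it. A minimal sketch of a wrapper that makes a missing `Done:` line explicit is shown here. `run_case` is a hypothetical helper, and the `echo` commands are stand-ins for the real `./invert_test` invocation; their text is placeholder output, not real QUDA output.

```bash
# Run a command, keep only its "Done:" lines, and flag the case if none appear.
# Usage: run_case LABEL COMMAND [ARGS...]
run_case() {
  local label=$1
  shift
  local out
  # "|| true" keeps a non-matching grep from aborting under "set -e".
  out=$("$@" 2>&1 | grep "Done:" || true)
  if [ -n "$out" ]; then
    printf '%s %s\n' "$label" "$out"
  else
    printf '%s FAILED (no "Done:" line)\n' "$label"
    return 1
  fi
}

# Stand-in commands with placeholder text.
run_case "demo-pass" echo "Done: placeholder"
run_case "demo-fail" echo "some other output" || true
```

In the loop, the `echo -n` label and the pipeline into `grep` would collapse into one call of the form `run_case "<label>" ./invert_test <options>`.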
The staggered test below is intentionally less rigorous than the Wilson-type test above, under the assumption that the above test passing is sufficiently aggressive in covering all corner cases. Instead, the staggered test uses a larger lattice and adds some factors of 3 to the block sizes.
WIP
The (second) biggest bash for loop ever:
- Build with `-DQUDA_PRECISION=14` or `15` (quarter isn't fully functional for now).
- This requires two separate builds: `-DQUDA_FLOAT8=ON` and `OFF`. For `ON` you only need to test `half` for the MG build/solve.
- This test requires compiling with staggered, `-DQUDA_DIRAC_STAGGERED=ON`.
- This test requires building with MPI or QMP to test `--partition 15`.
- This test uses a reference 24^4, beta = 7.0, quenched configuration. Reach out to Evan (@weinbe2) for a copy.
- This test only greps the iteration counts on a solve as a proxy for stability; if a line doesn't give any output that means there was an error.
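The two-build requirement above can be sketched as a dry run that prints one configure command per `QUDA_FLOAT8` setting. Only the three `-D` options named on this page are included; the source path, the build directory names, and the `print_configure` helper are placeholders, and a real configuration would also need the multigrid and MPI or QMP options, which this page does not name.

```bash
# Placeholder source location; override by exporting QUDA_SRC.
QUDA_SRC=${QUDA_SRC:-/path/to/quda}

# Print (do not run) the configure line for one QUDA_FLOAT8 setting.
print_configure() {
  local float8=$1
  echo "cmake -S $QUDA_SRC -B build-float8-$float8" \
       "-DQUDA_PRECISION=14 -DQUDA_FLOAT8=$float8 -DQUDA_DIRAC_STAGGERED=ON"
}

for float8 in ON OFF; do
  print_configure "$float8"
done
```

Keeping the two builds in separate directories lets the loop below run against each one without reconfiguring.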
FLOAT=FLOAT4 # just cosmetic
for r in 1 2 # one tuning run, one post-tuning run
do
for PARTITION in " 0" 15
do
for MG_MMA in " true" false
do
for PREC in " half" single # remove single for FLOAT8=ON
do
for DSLASH in staggered " asqtad"
do
for OP_2 in " direct" direct-pc # test the level 2 coarse op, this is to validate different builds
do
for COARSE_NC_2 in 64 96
do
for COARSE_NC_3 in 64 96
do
if [ "$COARSE_NC_3" -ge "$COARSE_NC_2" ]
then
echo -n "$FLOAT $PARTITION $MG_MMA $PREC $DSLASH $OP_2 $COARSE_NC_2 $COARSE_NC_3 "
tests/staggered_invert_test \
--prec double --prec-sloppy single --prec-null $PREC --prec-precondition $PREC \
--dim 24 24 24 24 --gridsize 1 1 1 1 --load-gauge l24t24b7p0 --partition $PARTITION \
--mass 0.01 --recon 13 --recon-sloppy 9 --recon-precondition 9 \
--dslash-type $DSLASH --compute-fat-long true --tadpole-coeff 0.905160183 --tol 1e-10 \
--verbosity verbose --solve-type direct --solution-type mat --inv-type gcr \
--inv-multigrid true --mg-levels 4 \
--mg-block-size 0 2 2 2 2 --mg-nvec 0 24 \
--mg-block-size 1 3 3 3 2 --mg-nvec 1 $COARSE_NC_2 \
--mg-block-size 2 2 2 2 3 --mg-nvec 2 $COARSE_NC_3 \
--mg-setup-tol 1 1e-5 --mg-setup-tol 2 1e-5 --mg-setup-inv 1 cgnr --mg-setup-inv 2 cgnr \
--nsrc 1 --niter 250 \
--mg-use-mma $MG_MMA \
--mg-smoother 0 ca-gcr --mg-smoother-solve-type 0 direct --mg-nu-pre 0 0 --mg-nu-post 0 4 \
--mg-smoother 1 ca-gcr --mg-smoother-solve-type 1 direct-pc --mg-nu-pre 1 0 --mg-nu-post 1 4 \
--mg-smoother 2 ca-gcr --mg-smoother-solve-type 2 $OP_2 --mg-nu-pre 2 0 --mg-nu-post 2 4 \
--mg-coarse-solver 1 gcr --mg-coarse-solve-type 1 direct-pc --mg-coarse-solver-tol 1 0.35 --mg-coarse-solver-maxiter 1 10 \
--mg-coarse-solver 2 gcr --mg-coarse-solve-type 2 $OP_2 --mg-coarse-solver-tol 2 0.35 --mg-coarse-solver-maxiter 2 10 \
--mg-coarse-solver 3 gcr --mg-coarse-solve-type 3 direct-pc --mg-coarse-solver-tol 3 0.01 --mg-coarse-solver-maxiter 3 20 \
--mg-verbosity 0 verbose --mg-verbosity 1 verbose --mg-verbosity 2 verbose --mg-verbosity 3 verbose 2>&1 | grep "Done\|ERROR"
fi # nc3 >= nc2
done # COARSE_NC_3
done # COARSE_NC_2
done # direct, direct-pc
done # dslash type
done # precision
done # mma on/off
done # partition 0, 15
done # tuning and post-tune run
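To know how many output lines to expect from the staggered loop above, the same loop nest can be walked without launching any solves. This is a sketch that only counts cases and mirrors the loop variables and the `-ge` filter as written above.

```bash
# Count the parameter combinations visited in one pass of the staggered loop.
count=0
for PARTITION in 0 15; do
  for MG_MMA in true false; do
    for PREC in half single; do
      for DSLASH in staggered asqtad; do
        for OP_2 in direct direct-pc; do
          for COARSE_NC_2 in 64 96; do
            for COARSE_NC_3 in 64 96; do
              # Same filter as the real loop: skip NC_3 < NC_2.
              if [ "$COARSE_NC_3" -ge "$COARSE_NC_2" ]; then
                count=$((count + 1))
              fi
            done
          done
        done
      done
    done
  done
done
echo "$count cases per pass, $((2 * count)) over both passes"
# prints: 96 cases per pass, 192 over both passes
```

A log with fewer `Done` lines than this count means at least one case produced no output.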