
Validating Large Multigrid Changes

Evan Weinberg edited this page Dec 1, 2020 · 15 revisions

This page is a WIP

This page is meant to document sequences of tests that should be used to validate large changes to multigrid routines. These are arguably overkill for smaller changes.

Testing Wilson-type modifications

  • Build with -DQUDA_PRECISION=14 (double, single, half) or 15 (adds quarter); quarter isn't fully functional for now.
  • This requires two separate builds: -DQUDA_FLOAT8=ON and OFF. For the ON build you only need to test half for the MG build/solve.
  • This test requires compiling with Wilson, clover, twisted mass, and twisted clover.
  • This test requires building with MPI or QMP to test --partition 15.
  • This test uses a reference 16^4, beta = 7.0, quenched configuration. Reach out to Evan (@weinbe2) for a copy.
  • This test only greps the iteration count of each solve as a proxy for stability; if a configuration's line shows no solver output, that run hit an error.
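As a quick sanity check on the precision setting above, here is a minimal sketch that decodes a QUDA_PRECISION value into the precisions it enables. It assumes the usual 4-bit layout (8 = double, 4 = single, 2 = half, 1 = quarter); `decode_precision` is a hypothetical helper name, not part of QUDA.

```shell
# Decode a QUDA_PRECISION bitmask into the precisions it enables.
# Assumed bit layout: 8 = double, 4 = single, 2 = half, 1 = quarter.
decode_precision() {
  mask=$1
  enabled=""
  if [ $(( mask & 8 )) -ne 0 ]; then enabled="$enabled double"; fi
  if [ $(( mask & 4 )) -ne 0 ]; then enabled="$enabled single"; fi
  if [ $(( mask & 2 )) -ne 0 ]; then enabled="$enabled half"; fi
  if [ $(( mask & 1 )) -ne 0 ]; then enabled="$enabled quarter"; fi
  # Drop the leading space left by the first append
  echo "${enabled# }"
}

decode_precision 14   # double single half
decode_precision 15   # double single half quarter
```

Under that assumption, 14 covers every precision the loop below exercises (double outer solve, single sloppy, half or single for the null space and preconditioner).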

The biggest bash for loop ever:

FLOAT=FLOAT4 # just cosmetic
for r in 1 2 # one tuning run, one post-tuning run
do
  for PARTITION in " 0" 15
  do
    for MG_MMA in " true" false
    do
      for PREC in "  half" single # remove single for FLOAT8=ON
      do
        for DSLASH in "        wilson" "        clover" "  twisted-mass" twisted-clover
        do
          for OP_0 in "   direct" direct-pc
          do
            for OP_1 in "   direct" direct-pc
            do
              for COARSE_NC_1 in 24 32
              do
                for COARSE_NC_2 in 24 32
                do
                  if [ "$COARSE_NC_2" -ge "$COARSE_NC_1" ]
                  then
                    echo -n "$FLOAT $PARTITION $MG_MMA $PREC $DSLASH $OP_0 $OP_1 $COARSE_NC_1 $COARSE_NC_2 "
                    ./invert_test \
                    --prec double --prec-sloppy single --prec-null $PREC --prec-precondition $PREC \
                    --recon 12 --recon-sloppy 8 --recon-precondition 8 \
                    --dim 16 16 16 16 --gridsize 1 1 1 1 --load-gauge l16t16b7p0 --partition $PARTITION \
                    --kappa 0.1394265 --mu 0.00072 --clover-coeff 0.001 \
                    --dslash-type $DSLASH --compute-clover true --tol 1e-10 \
                    --verbosity verbose --solve-type $OP_0 --solution-type mat --inv-type gcr \
                    --inv-multigrid true --mg-levels 3 \
                    --mg-block-size 0 4 4 4 4 --mg-nvec 0 $COARSE_NC_1 \
                    --mg-block-size 1 2 2 2 2 --mg-nvec 1 $COARSE_NC_2 \
                    --mg-setup-tol 0 5e-7 --mg-setup-tol 1 5e-7 --mg-setup-inv 0 cgnr --mg-setup-inv 1 cgnr \
                    --mg-mu-factor 2 70.0 \
                    --nsrc 1 --niter 250 \
                    --mg-use-mma $MG_MMA \
                    --mg-smoother 0 ca-gcr --mg-smoother-solve-type 0 $OP_0 --mg-nu-pre 0 0 --mg-nu-post 0 4 \
                    --mg-smoother 1 ca-gcr --mg-smoother-solve-type 1 $OP_1 --mg-nu-pre 1 0 --mg-nu-post 1 4 \
                    --mg-coarse-solver 1 gcr --mg-coarse-solve-type 1 $OP_1 --mg-coarse-solver-tol 1 0.35 --mg-coarse-solver-maxiter 1 10 \
                    --mg-coarse-solver 2 gcr --mg-coarse-solve-type 2 direct-pc --mg-coarse-solver-tol 2 0.01 --mg-coarse-solver-maxiter 2 20 \
                    --mg-verbosity 0 verbose --mg-verbosity 1 verbose --mg-verbosity 2 verbose | grep "Done:"
                  fi # nc2 >= nc1
                done # COARSE_NC_2
              done # COARSE_NC_1
            done # OP_1: direct, direct-pc
          done # OP_0: direct, direct-pc
        done # dslash type
      done # precision
    done # mma on/off
  done # partition 0, 15
done # tuning and post-tune run
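Because each label is printed with `echo -n` and the newline only comes from the grepped "Done:" line, a failed run leaves its label glued to the front of the next one. A rough sketch for picking those out of a captured log follows; `find_failures` and the log file name are hypothetical, and it assumes every label starts with the cosmetic `$FLOAT` tag.

```shell
# List configuration labels that were not followed by a "Done:" result.
# Usage (log name is hypothetical): find_failures wilson_mg.log FLOAT4
find_failures() {
  awk -v tag="$2" '
    {
      # Split the line into one chunk per label; parts[1] is the text
      # before the first tag and is ignored.
      n = split($0, parts, tag " ")
      for (i = 2; i <= n; i++)
        if (parts[i] !~ /Done:/) print tag " " parts[i]
    }' "$1"
}
```

To use it, redirect the loop's output to a file first, then pass that file and the tag. An empty result means every configuration reported a solve.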

Testing multigrid_evolve_test

This test is intentionally less rigorous than the one above, on the assumption that a passing run of that test already covers the corner cases aggressively enough. Instead, this test uses a larger lattice and adds some factors of 3.

WIP

Testing staggered multigrid

  • Build with -DQUDA_PRECISION=14 (double, single, half) or 15 (adds quarter); quarter isn't fully functional for now.
  • This requires two separate builds: -DQUDA_FLOAT8=ON and OFF. For the ON build you only need to test half for the MG build/solve.
  • This test requires compiling with staggered: -DQUDA_DIRAC_STAGGERED=ON.
  • This test requires building with MPI or QMP to test --partition 15.
  • This test uses a reference 24^4, beta = 7.0, quenched configuration. Reach out to Evan (@weinbe2) for a copy.
  • This test only greps the iteration count of each solve as a proxy for stability; if a line shows no solver output, or shows an ERROR message, that run hit an error.

The (second) biggest bash for loop ever:

FLOAT=FLOAT4 # just cosmetic
for r in 1 2 # one tuning run, one post-tuning run
do
  for PARTITION in " 0" 15
  do
    for MG_MMA in " true" false
    do
      for PREC in "  half" single # remove single for FLOAT8=ON
      do
        for DSLASH in staggered "   asqtad"
        do
          for OP_2 in "   direct" direct-pc # test the level 2 coarse op, this is to validate different builds
          do
            for COARSE_NC_2 in 64 96
            do
              for COARSE_NC_3 in 64 96
              do
                if [ "$COARSE_NC_3" -ge "$COARSE_NC_2" ]
                then
                  echo -n "$FLOAT $PARTITION $MG_MMA $PREC $DSLASH $OP_2 $COARSE_NC_2 $COARSE_NC_3 "
                  tests/staggered_invert_test \
                  --prec double --prec-sloppy single --prec-null $PREC --prec-precondition $PREC \
                  --dim 24 24 24 24 --gridsize 1 1 1 1 --load-gauge l24t24b7p0 --partition $PARTITION \
                  --mass 0.01 --recon 13 --recon-sloppy 9 --recon-precondition 9 \
                  --dslash-type $DSLASH --compute-fat-long true --tadpole-coeff 0.905160183 --tol 1e-10 \
                  --verbosity verbose --solve-type direct --solution-type mat --inv-type gcr \
                  --inv-multigrid true --mg-levels 4 \
                  --mg-block-size 0 2 2 2 2 --mg-nvec 0 24 \
                  --mg-block-size 1 3 3 3 2 --mg-nvec 1 $COARSE_NC_2 \
                  --mg-block-size 2 2 2 2 3 --mg-nvec 2 $COARSE_NC_3 \
                  --mg-setup-tol 1 1e-5 --mg-setup-tol 2 1e-5 --mg-setup-inv 1 cgnr --mg-setup-inv 2 cgnr \
                  --nsrc 1 --niter 250 \
                  --mg-use-mma $MG_MMA \
                  --mg-smoother 0 ca-gcr --mg-smoother-solve-type 0 direct    --mg-nu-pre 0 0 --mg-nu-post 0 4 \
                  --mg-smoother 1 ca-gcr --mg-smoother-solve-type 1 direct-pc --mg-nu-pre 1 0 --mg-nu-post 1 4 \
                  --mg-smoother 2 ca-gcr --mg-smoother-solve-type 2 $OP_2 --mg-nu-pre 2 0 --mg-nu-post 2 4 \
                  --mg-coarse-solver 1 gcr --mg-coarse-solve-type 1 direct-pc --mg-coarse-solver-tol 1 0.35 --mg-coarse-solver-maxiter 1 10 \
                  --mg-coarse-solver 2 gcr --mg-coarse-solve-type 2 $OP_2     --mg-coarse-solver-tol 2 0.35 --mg-coarse-solver-maxiter 2 10 \
                  --mg-coarse-solver 3 gcr --mg-coarse-solve-type 3 direct-pc --mg-coarse-solver-tol 3 0.01 --mg-coarse-solver-maxiter 3 20 \
                  --mg-verbosity 0 verbose --mg-verbosity 1 verbose --mg-verbosity 2 verbose --mg-verbosity 3 verbose 2>&1 | grep "Done\|ERROR"
                fi # nc3 >= nc2
              done # COARSE_NC_3
            done # COARSE_NC_2
          done # direct, direct-pc
        done # dslash type
      done # precision
    done # mma on/off
  done # partition 0, 15
done # tuning and post-tune run
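
Since this loop greps for both "Done" and "ERROR", a captured log can be summarized by counting each and comparing against the number of configurations visited. The following is a sketch under assumptions: `summarize_log` and the log file name are hypothetical, and the expected count is simply the product of the loop sizes above (two passes, two partitions, two MMA settings, two precisions, two dslash types, two OP_2 choices, three allowed COARSE_NC pairs).

```shell
# Expected result lines for the FLOAT8=OFF staggered loop, both passes included.
STAGGERED_EXPECTED=$(( 2 * 2 * 2 * 2 * 2 * 2 * 3 ))
echo "$STAGGERED_EXPECTED"   # 192

# Count "Done" and "ERROR" lines in a captured log and compare to the
# expected number of solves. Returns non-zero if anything is off.
# Usage (log name is hypothetical): summarize_log staggered_mg.log "$STAGGERED_EXPECTED"
summarize_log() {
  log=$1
  expected=$2
  # grep -c exits non-zero on zero matches but still prints 0
  done_count=$(grep -c "Done" "$log" || true)
  error_count=$(grep -c "ERROR" "$log" || true)
  echo "done=$done_count error=$error_count expected=$expected"
  [ "$done_count" -eq "$expected" ] && [ "$error_count" -eq 0 ]
}
```

For the FLOAT8=ON build, where single is dropped from the PREC loop, the expected count halves to 96.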