Running some internal tests (blas, dslash, invert)
As mentioned elsewhere, QUDA comes with a large number of internal tests. These are often run as part of CI, and the entire test suite can be started with
make test
after a build is complete. When tracking down bugs or portability failures, one can also run the tests directly via ctest, using e.g.:
ctest --output-on-failure
which will trigger output for failing tests.
CTest can also be used to zoom in on a failing test. For example, if tests 5 and 6 fail, one can have CTest execute just those, with verbose output enabled:
ctest -V -I 5,6
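When the failing tests are not adjacent, the argument to -I needs a little more care. The sketch below assumes the Start,End,Stride,test#,test#,... form described in the CTest manual, in which 5,6 is read as the range from 5 to 6. The helper name ctest_index_arg is hypothetical and only builds the argument string; it does not run anything.

```shell
#!/bin/sh
# Build the argument for `ctest -I` from a list of test numbers.
# Assumed form: Start,End,Stride,test#,test#,...
# With Start = End = first number, the range covers only that test and the
# remaining numbers are appended as individual tests.
ctest_index_arg() {
    first=$1
    shift
    arg="$first,$first,1"
    for n in "$@"; do
        arg="$arg,$n"
    done
    printf '%s\n' "$arg"
}

# Print the command for re-running tests 5, 6 and 11 verbosely.
echo "ctest -V -I $(ctest_index_arg 5 6 11)"
```

Printing the command rather than executing it makes it easy to check the selection before launching a long run.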
The BLAS tests exercise all the BLAS routines, including the multi-blas. Useful arguments are:
- --prec (which can be e.g. quarter, half, single or double), depending on the build.
- --verbosity (e.g. summarize, verbose, debug), which controls test verbosity.
- --gtest_list_tests: since blas_test has many tests, we can list them all in case we want to focus on some.
- --gtest_filter=<Testname>, where <Testname> corresponds to the name of a test.
- --sdim <SPATIAL> --tdim <TEMPORAL> --Lsdim <FIFTH>, corresponding to the lattice spatial (X,Y,Z), temporal (T), and 5th dimension (for 5D chiral fermions) respectively.
If no arguments are given, blas_test will execute all of its tests for all of its compiled precisions. If a precision is not compiled (e.g. quarter), its tests will be skipped, denoted in the test output by e.g.:
[ RUN ] QUDA/BlasTest.verify/caxpbypzYmbwcDotProductUYNormY_quarter_double
[ SKIPPED ] QUDA/BlasTest.verify/caxpbypzYmbwcDotProductUYNormY_quarter_double (0 ms)
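To see at a glance which tests were skipped, the saved output can be filtered. This is a minimal sketch that assumes the per-test SKIPPED lines end in a timing such as (0 ms), as in the excerpt above; the helper name list_skipped and the sample log are illustrative only.

```shell
#!/bin/sh
# List the names of skipped tests found in saved GoogleTest-style output.
# The pattern tolerates variable padding inside the brackets and only
# matches lines that end in a timing, so other SKIPPED lines are ignored.
list_skipped() {
    grep -E '^\[ *SKIPPED *\] .* \([0-9]+ ms\)$' "$1" |
        sed -E 's/^\[ *SKIPPED *\] *//; s/ *\([0-9]+ ms\)$//'
}

# Small sample log, copied from the excerpt above.
log=$(mktemp)
cat > "$log" <<'EOF'
[ RUN ] QUDA/BlasTest.verify/caxpbypzYmbwcDotProductUYNormY_quarter_double
[ SKIPPED ] QUDA/BlasTest.verify/caxpbypzYmbwcDotProductUYNormY_quarter_double (0 ms)
EOF

list_skipped "$log"
rm -f "$log"
```

In practice one would redirect the output of blas_test to a file and pass that file to the helper.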
- To run all the correctness and performance tests for all enabled precisions:
./blas_test
- To run all the correctness and performance tests for single precision:
./blas_test --prec single
- To run all the performance tests for half precision:
./blas_test --prec half --gtest_filter=QUDA/BlasTest.benchmark/*
- To run a specific test with full debug output (e.g. of all the tuning and kernel launch parameters) on a lattice of size 32x32x32x64 sites:
./blas_test --gtest_filter=QUDA/BlasTest.verify/cDotProductNormA_single_single --verbosity debug --sdim 32 --tdim 64 --Lsdim 1
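The invocations above combine naturally into a sweep over precisions. The following sketch only prints the commands, using the flags shown above; the function name blas_benchmark_cmds is hypothetical, and the filter is quoted so that the shell does not try to expand the trailing *.

```shell
#!/bin/sh
# Print (rather than run) one blas_test benchmark command per precision.
# Pipe the output to `sh` from the directory containing blas_test to run them.
blas_benchmark_cmds() {
    for prec in "$@"; do
        printf '%s\n' "./blas_test --prec $prec '--gtest_filter=QUDA/BlasTest.benchmark/*'"
    done
}

blas_benchmark_cmds half single double
```

Precisions that were not compiled into the build will simply show up as skipped tests in the output.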
After blas_test, the next nontrivial test is the Dslash test, which can be used to test the various fermion matrix operators. There are two test programs for Dslash: dslash_test for Wilson-like fermions (Wilson, clover, twisted mass etc.) and staggered_dslash_test for staggered fermions. In order to build the staggered tests, one must enable the MILC interface at build time (-DQUDA_MILC_INTERFACE=ON).
The dslash operators in this sense are full fermion operators (rather than just the derivative piece) and can have mass parameters, preconditioning styles etc. They can also feature gauge compression (e.g. storing the gauge links using 18, 12, or 8 real numbers for Wilson-like operators).
Some example invocations:
- single checkerboard Wilson dslash operator, in single precision, with default 4D lattice sizes:
./dslash_test --prec single --Lsdim 1
- single checkerboard Clover dslash operator (AD), in single precision, with default 4D lattice sizes, computing the clover term on the GPU:
./dslash_test --prec single --Lsdim 1 --dslash-type clover --compute-clover 1
- single checkerboard Wilson Dslash with half precision and 8-compression, on a lattice with 32x32x32x32 sites:
./dslash_test --prec half --dslash-type wilson --sdim 32 --tdim 32 --Lsdim 1 --recon 8
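To get a feel for what the --recon setting means for storage, the number of real values in the gauge field can be estimated by hand. This is a back-of-envelope sketch only: it assumes four links per site (one per direction) and ignores padding, halo regions and any other internal layout details. The helper name gauge_reals is hypothetical.

```shell
#!/bin/sh
# Real numbers needed for the gauge links of an sdim^3 x tdim lattice,
# with `recon` reals stored per link and 4 links per site.
# Rough estimate: padding and halo regions are ignored.
gauge_reals() {
    sdim=$1; tdim=$2; recon=$3
    echo $(( 4 * sdim * sdim * sdim * tdim * recon ))
}

# Compare the three compression levels on a 32x32x32x32 lattice.
for recon in 18 12 8; do
    echo "recon $recon: $(gauge_reals 32 32 "$recon") reals"
done
```

Under these assumptions, 8-compression stores 8/18 (about 44%) of the real numbers of the uncompressed field, and 12-compression stores two thirds.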
The next level of complexity is to run a full solver. QUDA features invert_test and staggered_invert_test, again for Wilson-like and staggered-like operators. One is now faced with several extra parameters:
- --solve-type defines what kind of solve to perform with what kind of preconditioned linear operator. For example, a direct solver like BiCGStab would use either direct or direct-pc, indicating that it solves with the regular linear operator, either for the full solution or on a checkerboard. A solver like CG may opt for a normal operator using normop or normop-pc.
- --solution-type determines which system to solve, e.g.
  - mat: solve the system A x = b where A is the unpreconditioned matrix. If the solve type is direct-pc or normop-pc, this would indicate solving the Schur preconditioned system under the hood but reconstructing the solution.
  - mat-pc: solve the system A_p x = b where A_p is the Schur preconditioned matrix. Presumably needs --solve-type to be either normop-pc or direct-pc.
  - mat-dag-mat: solve the system A^\dagger A x = b, in other words with the normal operator, as one would do in a fermion matrix calculation. There are many more combinations.
- --matpc: selects the checkerboard on which the solve is to be done (e.g. odd-odd).
- --prec-sloppy: for solves with mixed precision, selects the 'lower' precision (e.g. in inner solves).
- --recon-sloppy: for solves with mixed precision, selects the gauge field compression strategy (e.g. in inner solves).
- --reliable-delta: for solves using mixed precision and reliable updates, the delta parameter to use for reliable updating (e.g. 0.1 in half precision, or 0.001 in single).
Of course there are many more parameters one can turn to; this is just the tip of the iceberg for simple manual testing.
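Since the solve type and solution type interact, a quick sanity check before launching a long run can save time. The sketch below encodes only the presumption stated above, namely that solution type mat-pc goes with a solve type ending in -pc. It is not a statement of what QUDA itself accepts, and the helper name check_solve_pair is hypothetical.

```shell
#!/bin/sh
# Check a --solution-type / --solve-type pairing against one rule:
# solution type mat-pc is presumed to need direct-pc or normop-pc.
# Returns 0 if the pair passes this rule, 1 otherwise.
check_solve_pair() {
    solution=$1; solve=$2
    if [ "$solution" = "mat-pc" ]; then
        case $solve in
            direct-pc|normop-pc) return 0 ;;
            *) return 1 ;;
        esac
    fi
    return 0
}

if check_solve_pair mat-pc direct; then
    echo "pairing looks fine"
else
    echo "mat-pc with direct: unexpected pairing"
fi
```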
- Uniform precision (double precision) conjugate gradient solve on the normal equations:
./invert_test --sdim 32 --tdim 64 --Lsdim 1 \
--dslash-type clover --mass 0 --mass-normalization mass --clover-coeff 1.0 \
--prec double --prec-sloppy double \
--inv-type cgne --solution-type mat --solve-type normop-pc \
--tol 1.0e-14 --verbosity verbose
- Mixed half-double precision BiCGStab, with a reliable update delta of 0.1, using 8 real numbers to store the sloppy gauge links and 12 to store the precise ones:
./invert_test --sdim 32 --tdim 64 --Lsdim 1 \
--dslash-type clover --mass 0 --mass-normalization mass --clover-coeff 1.0 \
--prec double --prec-sloppy half --recon 12 --recon-sloppy 8 \
--inv-type bicgstab --solution-type mat --solve-type direct-pc \
--tol 1.0e-14 --reliable-delta 0.1 --verbosity verbose
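When experimenting with different sloppy precisions, the delta has to change along with --prec-sloppy. The following sketch prints one BiCGStab command per sloppy precision, using the example delta values quoted earlier (0.1 for half, 0.001 for single). These are illustrative starting points rather than tuned recommendations, and the helper name example_delta is hypothetical.

```shell
#!/bin/sh
# Map a sloppy precision to the example --reliable-delta value quoted in
# the text. Precisions without a quoted value print nothing.
example_delta() {
    case $1 in
        half)   echo 0.1 ;;
        single) echo 0.001 ;;
    esac
}

# Print (not run) a mixed-precision BiCGStab command for each sloppy precision.
for sloppy in half single; do
    echo "./invert_test --sdim 32 --tdim 64 --Lsdim 1" \
         "--dslash-type clover --mass 0 --mass-normalization mass --clover-coeff 1.0" \
         "--prec double --prec-sloppy $sloppy --recon 12 --recon-sloppy 8" \
         "--inv-type bicgstab --solution-type mat --solve-type direct-pc" \
         "--tol 1.0e-14 --reliable-delta $(example_delta "$sloppy") --verbosity verbose"
done
```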