v8.1.2
- In SRC/
** add an env variable COMM_TREE_MPI_WAIT in comm_tree.c
** replace a taskloop by parallel for in pxgstrs_lsum.c
- In EXAMPLE/
** drivers: only initialise cublas if GPU offloading is
enabled at runtime (James Trott)
** global interface drivers, P0 generates random Xtrue and RHS
- Support 64-bit indexing for input matrix A