New wave-functions implementation (electronic-structure#747)
* WIP: external order of G-vectors

* WIP: external order of G-vectors

* experiment with external G-vector order

* Initialize G-vectors with a predefined set

* external G-vector order needs testing

* fixes for external G-vector order

* external G-vector order needs testing

* WIP: new wave functions impl.

* need to test H|psi> again

* need to work on a standalone test first

* remove G+k parameters

* try QE order

* remove 2nd template parameter of apply_preconditioner()

* small fixes

* small fixes

* small fixes

* WIP: create row- and col- G+k vector sets

* pass gkvec_row_ and gkvec_col_ to Matching_coefficients

* do not pass igk__ index to beta-projectors class

* remove debug out

* WIP: new wave functions impl.

* remove commented code

* introduce base class for wave-functions

* Fix/beta projectors for exact diag (electronic-structure#742)

* WIP: create row- and col- G+k vector sets

* pass gkvec_row_ and gkvec_col_ to Matching_coefficients

* do not pass igk__ index to beta-projectors class

* remove debug out

* fix typo

* draft of the new wave-functions class

* draft of the new wave-functions class

* try default move constructor

* small fix

* do not use c++17 features

* need to remove 'using namespace' in another PR first

* need to remove 'using namespace' in another PR first

* fixes

* fix

* weird namespace error

* fixes

* use costa to swap fv eigen vectors

* move new WFs to wf:: space

* move new WFs to wf:: space

* move to a separate function

* remove commented code

* adopt split_in_block() in the new apply_fv_h_o

* introduce Wave_functions_mt and more strong types

* wip: generate fv wave-functions using new wf:: impl.

* wip: transform new wave-functions

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* add test25 for NiO lda+u

* Temporary commit to test inner product on GPUs

TODO for testing:
 * inner product
 * transformation
 * orthogonalisation
 * CG solver
 * application of local Hamiltonian
 * davidson solver

other TODO:
 * restore commented tests
 * restore GPU branch

* cleanup

* fix verbosity level

* fix

* cleanup

* fix in the assert check

* remove matrix_distribution_t enum; it is no longer needed

* cleanup

* cleanup

* update test

* update test ortho

* fix in wf::orthogonalize()

* ready to test hloc

* ready to test hloc

* ready to test hloc

* fixes

* remove template from device_memory_guard

* ready to test Davidson solver on GPUs

* use wf::inner() for <beta|phi> inner product

* enable initialize_subspace() on GPU

* fix for dmatrix::copy_to

* add inline

* move new impl. of Davidson to GPU

* add ifdef guards

* restoring GPU code

* compute muffin-tin checksum on GPU

* restoring iterative lapw solver on GPU

* restoring lapw code on GPU

* include utils/rte.hpp

* fix axpby for zero alpha or beta

* GPU and GPU parallel tests pass

* restore custom swap of wave functions

* all tests pass

* working on the basic documentation

* temporary commit

* use global memory pool

* fix

* remove unused file

* make sure memory pool cannot be duplicated in compilation units

* restore and cleanup tests

* fix eigensolvers mem.pools

* fix

* update nlcglib interface

- add Wave_function_base::pw_coeffs(spin_index)
- api changes transferred to nlcg code

* strong_type::operator T()

- make std::vector<>::operator[spin_index] etc work without get

* nlcglib: fix assert statement

* restore and cleanup tests

* restore and cleanup tests

* try magma ci/cd workflow

* add shortcut for checking if T is real, add missing includes

* simplify template for wf::inner

- types in wf::inner used in beta_projectors_base.hpp can be automatically deduced

* clean memory pools

* restore fp32/fp64 inner() and transform() functions

* remove unused files

* remove unused header

* minor fixes

* fix for magma ci/cd

* one more fix

* cleanup

* cleanup

* restore function

Co-authored-by: Simon Pintarelli <simon.pintarelli@cscs.ch>
toxa81 and simonpintarelli authored Nov 4, 2022
1 parent 32b66f7 commit 139e2e3
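
The commit message mentions `strong_type::operator T()`, added so that an index wrapper such as a spin index can be used directly in `std::vector<>::operator[]` without calling `get()`. The following is a minimal sketch of that idea, not the SIRIUS implementation: the class layout and the names `spin_index_tag` and `pick()` are assumptions made for illustration.

```cpp
#include <vector>

// Hypothetical strong-type wrapper (illustrative only, not the SIRIUS class).
template <typename T, typename Tag>
class strong_type
{
  private:
    T val_;

  public:
    // Explicit constructor: a bare int cannot silently become a spin_index.
    explicit strong_type(T val__)
        : val_{val__}
    {
    }

    T get() const
    {
        return val_;
    }

    // Implicit conversion back to the underlying type.
    operator T() const
    {
        return val_;
    }
};

struct spin_index_tag
{
};
using spin_index = strong_type<int, spin_index_tag>;

// Accepts only a spin_index; the conversion operator makes v__[s__.get()] unnecessary.
inline double pick(std::vector<double> const& v__, spin_index s__)
{
    return v__[s__];
}
```

With this sketch, `pick(v, spin_index(1))` compiles while `pick(v, 1)` does not, so the type safety of the explicit constructor is kept and only the read side becomes implicit.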
Showing 155 changed files with 73,158 additions and 10,986 deletions.
20 changes: 20 additions & 0 deletions .github/workflows/build.yml
@@ -96,6 +96,26 @@ jobs:
           cd ${GITHUB_WORKSPACE}/build
           spack --color always -e gcc-build-env build-env $SPEC_GCC -- make

+  build_magma:
+    runs-on: ubuntu-latest
+    container: electronicstructure/sirius
+    env:
+      SPEC_GCC: sirius@develop %gcc +tests +apps +vdwxc +scalapack +fortran +magma +cuda cuda_arch=60 build_type=RelWithDebInfo ^openblas ^mpich ^magma +cuda cuda_arch=60
+    steps:
+      - uses: actions/checkout@v2
+      - name: Show the spec
+        run: spack --color always -e gcc-build-env spec -I $SPEC_GCC
+      - name: Configure SIRIUS
+        run: |
+          cd ${GITHUB_WORKSPACE}
+          mkdir build
+          cd build
+          spack --color always -e gcc-build-env build-env $SPEC_GCC -- cmake .. -DUSE_SCALAPACK=1 -DUSE_MAGMA=1 -DUSE_CUDA=1 -DUSE_VDWXC=1 -DBUILD_TESTING=1 -DCREATE_FORTRAN_BINDINGS=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo
+      - name: Build SIRIUS
+        run: |
+          cd ${GITHUB_WORKSPACE}/build
+          spack --color always -e gcc-build-env build-env $SPEC_GCC -- make
+
   build_cuda_fp32:
     runs-on: ubuntu-latest
     container: electronicstructure/sirius
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -1,6 +1,6 @@
 cmake_minimum_required(VERSION 3.18)

-project(SIRIUS VERSION 7.3.2)
+project(SIRIUS VERSION 7.4.0)

 # user variables
 set(CREATE_PYTHON_MODULE OFF CACHE BOOL "create sirius Python module")
2 changes: 1 addition & 1 deletion apps/nlcg/sirius.nlcg.cpp
@@ -93,7 +93,7 @@ double ground_state(Simulation_context& ctx,
         throw std::runtime_error("invalid smearing type given");
     }

-    if (is_device_memory(ctx.preferred_memory_t())) {
+    if (is_device_memory(ctx.processing_unit_memory_t())) {
         switch (pu) {
             case sddk::device_t::GPU: {
                 std::cout << "nlcg executing on gpu-gpu" << "\n";
11 changes: 6 additions & 5 deletions apps/tests/CMakeLists.txt
@@ -1,9 +1,10 @@
-set(_tests "test_hdf5;test_allgather;\
+set(_tests "test_alloc;test_hdf5;test_allgather;\
 read_atom;test_mdarray;test_xc;test_hloc;\
-test_mpi_grid;test_enu;test_eigen;test_gemm;test_gemm2;test_wf_inner_v3;test_wf_inner;test_memop;\
-test_mem_pool;test_mem_alloc;test_examples;test_wf_inner_v4;test_bcast_v2;test_p2p_cyclic;\
-test_wf_ortho_6;test_mixer;test_davidson;test_lapw_xc;test_phase;test_bessel;test_fp;test_pppw_xc;\
-test_exc_vxc;test_atomic_orbital_index;test_sym;test_blacs;test_reduce;test_comm_split;test_wf_trans")
+test_mpi_grid;test_enu;test_eigen;test_gemm;test_gemm2;test_wf_inner;test_memop;\
+test_mem_pool;test_mem_alloc;test_examples;test_bcast_v2;test_p2p_cyclic;\
+test_wf_ortho;test_mixer;test_davidson;test_lapw_xc;test_phase;test_bessel;test_fp;test_pppw_xc;\
+test_exc_vxc;test_atomic_orbital_index;test_sym;test_blacs;test_reduce;test_comm_split;test_wf_trans;\
+test_wf_fft")

 foreach(_test ${_tests})
   add_executable(${_test} ${_test}.cpp)
43 changes: 19 additions & 24 deletions apps/tests/test_alloc.cpp
@@ -1,32 +1,27 @@
-#include <sirius.h>
+#include <sirius.hpp>

 using namespace sirius;

-template <int touch, int pin, processing_unit_t pu>
+template <int touch, int pin, sddk::device_t pu>
 void test_alloc(int size__)
 {
-    runtime::Timer t("alloc");
-    if (pu == CPU) {
-        mdarray<char, 1> a(1024 * 1024 * size__);
-#ifdef SIRIUS_GPU
-        if (pin) {
-            a.pin_memory();
-        }
-#endif
+    auto t0 = utils::time_now();
+    if (pu == sddk::device_t::CPU) {
+        sddk::mdarray<char, 1> a(1024 * 1024 * size__, pin ? sddk::memory_t::host_pinned : sddk::memory_t::host);
         if (touch) {
             a.zero();
         }
     }
-#ifdef SIRIUS_GPU
-    if (pu == GPU) {
-        mdarray<char, 1> a(nullptr, 1024 * 1024 * size__);
-        a.allocate_on_device();
+#if defined(SIRIUS_GPU)
+    if (pu == sddk::device_t::GPU) {
+        sddk::mdarray<char, 1> a(nullptr, 1024 * 1024 * size__);
+        a.allocate(sddk::memory_t::device);
         if (touch) {
-            a.zero_on_device();
+            a.zero(sddk::memory_t::device);
         }
     }
 #endif
-    double tval = t.stop();
+    double tval = utils::time_interval(t0);
     printf("time: %f microseconds\n", tval * 1e6);
     printf("effective speed: %f GB/sec.\n", size__ / 1024.0 / tval);
 }
@@ -44,18 +39,18 @@ int main(int argn, char** argv)

     sirius::initialize(1);
     printf("--- allocate on host, don't pin, don't touch\n");
-    test_alloc<0, 0, CPU>(1024);
+    test_alloc<0, 0, sddk::device_t::CPU>(1024);
     printf("--- allocate on host, don't pin, touch\n");
-    test_alloc<1, 0, CPU>(1024);
+    test_alloc<1, 0, sddk::device_t::CPU>(1024);
     printf("--- allocate on host, pin, don't touch\n");
-    test_alloc<0, 1, CPU>(1024);
+    test_alloc<0, 1, sddk::device_t::CPU>(1024);
     printf("--- allocate on host, pin, touch\n");
-    test_alloc<1, 1, CPU>(1024);
-#ifdef SIRIUS_GPU
+    test_alloc<1, 1, sddk::device_t::CPU>(1024);
+#if defined(SIRIUS_GPU)
     printf("--- allocate on device, don't touch\n");
-    test_alloc<0, 0, GPU>(512);
+    test_alloc<0, 0, sddk::device_t::GPU>(512);
     printf("--- allocate on device, touch\n");
-    test_alloc<1, 0, GPU>(512);
-#endif
+    test_alloc<1, 0, sddk::device_t::GPU>(512);
+#endif
     sirius::finalize();
 }
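
The `test_alloc.cpp` change swaps the `runtime::Timer` object for a pair of calls, `utils::time_now()` and `utils::time_interval(t0)`. The sketch below shows the same start/interval timing pattern using only the standard library. It is an illustration under assumptions: the helper bodies are guesses at equivalent behaviour built on `std::chrono::steady_clock`, and `time_host_alloc()` is a hypothetical host-only stand-in for the test, not SIRIUS code.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical stand-ins for the two timing helpers used in the diff.
using time_point_t = std::chrono::steady_clock::time_point;

inline time_point_t time_now()
{
    return std::chrono::steady_clock::now();
}

// Seconds elapsed since t0.
inline double time_interval(time_point_t t0)
{
    return std::chrono::duration<double>(std::chrono::steady_clock::now() - t0).count();
}

// Allocate size_mb megabytes on the host, optionally touch them,
// and return the elapsed wall-clock time in seconds.
inline double time_host_alloc(int size_mb, bool touch)
{
    auto t0 = time_now();
    std::size_t n = static_cast<std::size_t>(1024) * 1024 * size_mb;
    // new char[n] default-initializes, so the pages are not written here.
    std::unique_ptr<char[]> a(new char[n]);
    if (touch) {
        // Writing every byte forces the pages to be mapped.
        std::memset(a.get(), 0, n);
    }
    return time_interval(t0);
}
```

Calling `time_host_alloc(1024, false)` and `time_host_alloc(1024, true)` and dividing the size by the returned time gives the same kind of "effective speed" figure that the test prints.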
13 changes: 5 additions & 8 deletions apps/tests/test_blacs.cpp
@@ -1,21 +1,18 @@
#include <sirius.hpp>

using namespace sirius;
using namespace sddk;

int main(int argn, char** argv)
{
sirius::initialize(true);

#if defined(SIRIUS_SCALAPACK)
std::cout << Communicator::self().size() << " " << Communicator::self().rank() << std::endl;
std::cout << Communicator::world().size() << " " << Communicator::world().rank() << std::endl;
std::cout << sddk::Communicator::self().size() << " " << sddk::Communicator::self().rank() << std::endl;
std::cout << sddk::Communicator::world().size() << " " << sddk::Communicator::world().rank() << std::endl;

auto blacs_handler = linalg_base::create_blacs_handler(Communicator::self().mpi_comm());
blacs_handler = linalg_base::create_blacs_handler(Communicator::world().mpi_comm());
auto blacs_handler = sddk::linalg_base::create_blacs_handler(sddk::Communicator::self().mpi_comm());
blacs_handler = sddk::linalg_base::create_blacs_handler(sddk::Communicator::world().mpi_comm());
std::cout << blacs_handler << std::endl;

sirius::finalize(true);
#endif

return 0;
}