Merged
26 commits
d8d23a8
update cmake
QuantumMisaka Mar 8, 2025
81149e5
add aocc support
QuantumMisaka Mar 8, 2025
864ff28
update mpich
QuantumMisaka Mar 8, 2025
a235308
update VERSION
QuantumMisaka Mar 8, 2025
2e202ed
update openmpi, allow user to switch version easily
QuantumMisaka Mar 8, 2025
8a1f4ac
update elpa
QuantumMisaka Mar 8, 2025
2519fc2
create aocl script
QuantumMisaka Mar 8, 2025
f1e16c6
aocc install setup
QuantumMisaka Mar 8, 2025
4c260bb
bug fix and update readme
QuantumMisaka Mar 8, 2025
05b0095
fix openmpi switch
QuantumMisaka Mar 8, 2025
b46c726
modification
QuantumMisaka Mar 8, 2025
e57e2cc
add openmpi configure option
QuantumMisaka Mar 9, 2025
2de91a1
update elpa setting (gpu setting for 2070s)
QuantumMisaka Mar 9, 2025
2302a27
update libxc version and download
QuantumMisaka Mar 9, 2025
a7b87cf
minor update
QuantumMisaka Mar 9, 2025
e1a8303
update README
QuantumMisaka Mar 9, 2025
9d61279
Merge branch 'develop' into toolchain-202501
QuantumMisaka Mar 9, 2025
3adb9e7
minor update
QuantumMisaka Mar 9, 2025
0bda24f
minor checkout
QuantumMisaka Mar 9, 2025
b17cc34
deepmd-v3 add-in test note
QuantumMisaka Mar 10, 2025
c08c444
Merge branch 'toolchain-202501' of https://github.com/QuantumMisaka/a…
QuantumMisaka Mar 10, 2025
630453d
Merge branch 'develop' into toolchain-202501
QuantumMisaka Mar 10, 2025
2ac981d
AMD-AOCC-AOCL update and minor fixed
QuantumMisaka Mar 10, 2025
db142bf
Merge branch 'develop' into toolchain-202501
QuantumMisaka Mar 10, 2025
20c4c8d
fix bug in aocl.sh
QuantumMisaka Mar 10, 2025
bb22be4
Merge branch 'develop' into toolchain-202501
QuantumMisaka Mar 11, 2025
1 change: 0 additions & 1 deletion .gitignore
@@ -17,7 +17,6 @@ STRU_READIN_ADJUST.cif
build
dist
.idea
toolchain.tar.gz
time.json
*.pyc
__pycache__
114 changes: 81 additions & 33 deletions toolchain/README.md
@@ -1,6 +1,6 @@
# The ABACUS Toolchain

Version 2024.3
Version 2025.1

## Author

@@ -27,12 +27,13 @@ and give setup files that you can use to compile ABACUS.
- [x] Support for [LibRI](https://github.com/abacusmodeling/LibRI) by submodule or automatic installation from github.com (but installed LibRI via `wget` seems to have some problem, please be cautious)
- [x] A mirror station by Bohrium database, which can download CEREAL, LibNPY, LibRI and LibComm by `wget` in China Internet.
- [x] Support for GPU compilation, users can add `-DUSE_CUDA=1` in builder scripts.
- [x] Support for the AMD compiler `AOCC` and math library `AOCL` (not yet complete because of `flang` and AOCC-ABACUS compilation errors)
- [ ] Change the downloading url from cp2k mirror to other mirror or directly downloading from official website. (doing)
- [ ] Support a JSON or YAML configuration file for toolchain, which can be easily modified by users.
- [ ] A better README and Detail markdown file.
- [ ] Automatic installation of [DEEPMD](https://github.com/deepmodeling/deepmd-kit).
- [ ] Better compilation method for ABACUS-DEEPMD and ABACUS-DEEPKS.
- [ ] Modulefile generation scripts.
- [ ] Support for AMD compiler and math lib like `AOCL` and `AOCC`


## Usage Online & Offline
@@ -49,6 +50,8 @@ There are also well-modified script to run *install_abacus_toolchain.sh* for `gn
> ./toolchain_gnu.sh
# for intel-mkl
> ./toolchain_intel.sh
# for amd aocc-aocl
> ./toolchain_amd.sh
# for intel-mkl-mpich
> ./toolchain_intel-mpich.sh
```
@@ -94,7 +97,7 @@ The above station will be updated handly but one should notice that the version
If one wants to install ABACUS with the toolchain OFFLINE,
one can manually download all the packages from [cp2k-static/download](https://www.cp2k.org/static/downloads) or the official websites
and put them in the *build* directory under formatted names
like *fftw-3.3.10.tar.gz*, or *openmpi-5.0.5.tar.bz2*,
like *fftw-3.3.10.tar.gz*, or *openmpi-5.0.6.tar.bz2*,
then run this toolchain.
All packages will be detected and installed automatically.
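
As a rough illustration of the offline workflow, the sketch below checks which tarballs are already present in a directory before the toolchain is run. The function name `check_offline_packages` and the package list are assumptions made for this example; they are not part of the toolchain scripts.

```shell
#!/bin/bash
# Hypothetical helper (not part of the toolchain): report which of the
# expected tarballs already exist in the given directory.
check_offline_packages() {
    local build_dir="$1"
    shift
    local pkg missing=0
    for pkg in "$@"; do
        if [ -f "${build_dir}/${pkg}" ]; then
            echo "found:   ${pkg}"
        else
            echo "missing: ${pkg}"
            missing=1
        fi
    done
    # Succeed only when every listed tarball is present.
    [ "$missing" -eq 0 ]
}

# Example call, using the file names quoted in the text:
# check_offline_packages ./build fftw-3.3.10.tar.gz openmpi-5.0.6.tar.bz2
```

Packages reported as missing would then be the ones left for the toolchain to download online.
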
Also, one can install parts of packages OFFLINE and parts of packages ONLINE
@@ -109,19 +112,23 @@ just by using this toolchain

The needed dependencies version default:

- `cmake` 3.30.0
- `cmake` 3.31.2
- `gcc` 13.2.0 (which will always NOT be installed, But use system)
- `OpenMPI` 4.1.6 (5.0.5 can be used but have some problem in OpenMP parallel computation in ELPA)
- `MPICH` 4.2.2
- `OpenMPI` 5.0.6 (OpenMPI 5 works but can cause compatibility problems; users can manually downgrade to version 4 in the toolchain scripts)
- `MPICH` 4.3.0
- `OpenBLAS` 0.3.28 (Intel toolchain need `get_vars.sh` tool from it)
- `ScaLAPACK` 2.2.1 (a developing version)
- `FFTW` 3.3.10
- `LibXC` 6.2.2
- `ELPA` 2024.05.001
- `LibXC` 7.0.0
- `ELPA` 2025.01.001
- `CEREAL` 1.3.2
- `RapidJSON` 1.1.0
And Intel-oneAPI need user or server manager to manually install from Intel.
[Intel-oneAPI](https://www.intel.cn/content/www/cn/zh/developer/tools/oneapi/toolkits.html)
In addition:
- Intel-oneAPI must be installed manually from Intel by the user or server administrator.
  - [Intel-oneAPI](https://www.intel.cn/content/www/cn/zh/developer/tools/oneapi/toolkits.html)
- AMD AOCC-AOCL must be installed manually from AMD by the user or server administrator.
  - [AOCC](https://www.amd.com/zh-cn/developer/aocc.html)
  - [AOCL](https://www.amd.com/zh-cn/developer/aocl.html)

Dependencies below are optional, which is NOT installed by default:

@@ -130,7 +137,7 @@ Dependencies below are optional, which is NOT installed by default:
- `LibRI` 0.2.0
- `LibComm` 0.1.1

Users can install them by using `--with-*=install` in toolchain*.sh, which is `no` in default.
Users can install them by passing `--with-*=install` in toolchain*.sh; the default is `no`. Users can also pass the absolute path of an existing installation via `--with-*=path/to/package` in toolchain*.sh so that the toolchain uses that package.
> Notice: LibRI, LibComm and Libnpy are under active development, so check the package versions when using this toolchain. LibRI and LibComm can also be installed as git submodules (this works for libnpy too), which is the recommended approach.
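
To make the three accepted forms of `--with-*` concrete, here is a minimal sketch of how such an option could be classified. The function `classify_with_option` is hypothetical and only covers the values mentioned above (`install`, `no`, or an absolute path); it is not the parser used by the toolchain scripts.

```shell
#!/bin/bash
# Hypothetical sketch: classify a --with-PKG=VALUE option.
# Only the values described in the README text are handled here.
classify_with_option() {
    local arg="$1"
    case "$arg" in
        --with-*=*) ;;
        *) echo "invalid"; return 1 ;;
    esac
    local pkg="${arg#--with-}"   # strip the leading "--with-"
    pkg="${pkg%%=*}"             # keep the package name before "="
    local value="${arg#*=}"      # keep everything after the first "="
    case "$value" in
        install|no) echo "${pkg}: ${value}" ;;
        /*)         echo "${pkg}: use existing installation at ${value}" ;;
        *)          echo "${pkg}: unrecognized value '${value}'"; return 1 ;;
    esac
}

# Example calls:
# classify_with_option --with-libri=install
# classify_with_option --with-libnpy=/opt/libnpy
```
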

Users can easily compile and install dependencies of ABACUS
@@ -151,6 +158,8 @@ If compliation is successful, a message will be shown like this:
> ./build_abacus_gnu.sh
> To build ABACUS by intel-toolchain, just use:
> ./build_abacus_intel.sh
> To build ABACUS by amd-toolchain in gcc-aocl, just use:
> ./build_abacus_amd.sh
> or you can modify the builder scripts to suit your needs.
```

@@ -180,11 +189,70 @@ or you can also do it in a more completely way:

## Common Problems and Solutions

### LibRI and LibComm for EXX
### Intel-oneAPI problem

#### OneAPI 2025.0 problem

OneAPI 2025.0 can generally compile the basic functionality of ABACUS, but compatibility problems arise with some dependencies. The workarounds are:
- For RapidJSON:
  - do not use RapidJSON in your toolchain,
  - or use the master branch of [RapidJSON](https://github.com/Tencent/rapidjson).
- For LibRI: do not use LibRI, or downgrade your OneAPI.

#### ELPA problem via Intel-oneAPI toolchain in AMD server

The default Intel-oneAPI compilers are `icpx` and `icx`, which cause problems when compiling ELPA on AMD servers. (The cause still needs further investigation.)

The best workaround is to switch from `icpx` to `icpc` and from `icx` to `icc`. Users can do this in *toolchain_intel.sh* via `--with-intel-classic=yes`.

Notice: the Intel Classic Compilers `icc` and `icpc` are not supported in Intel-oneAPI 2024.0 and newer. Intel-oneAPI 2023.2.0 can still be found via the QE website. For Intel-oneAPI 2023.2.0 you need both the Base Toolkit (for MKL) and the HPC Toolkit (for MPI and the compilers), while for Intel-oneAPI 2024.x only the HPC Toolkit is needed.

You can get Intel-oneAPI from the [QE-managed website](https://pranabdas.github.io/espresso/setup/hpc/#installing-intel-oneapi-libraries), and use the following commands to download the Intel oneAPI Base Toolkit and HPC Toolkit:
```shell
wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/992857b9-624c-45de-9701-f6445d845359/l_BaseKit_p_2023.2.0.49397_offline.sh
wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/0722521a-34b5-4c41-af3f-d5d14e88248d/l_HPCKit_p_2023.2.0.49440_offline.sh
```

Related discussion here [#4976](https://github.com/deepmodeling/abacus-develop/issues/4976)

#### link problem in early 2023 version oneAPI

Intel-oneAPI sometimes fails to link `mpirun`;
this always happens with the MPI shipped in Intel-oneAPI 2023.2.0.
Running `source /path/to/setvars.sh` or installing another version of IntelMPI may help.

The problem is fixed in Intel-oneAPI 2024.0.0
and does not occur in Intel-MPI before 2021.10.0 (Intel-oneAPI before 2023.2.0).

More problems and possible solutions are discussed in [#2928](https://github.com/deepmodeling/abacus-develop/issues/2928).
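
A quick way to see whether `mpirun` is visible after sourcing `setvars.sh` is to look it up on `PATH`. The sketch below is only an illustration; the function name `check_mpirun` is an assumption and is not part of the toolchain.

```shell
#!/bin/bash
# Hypothetical check: report where mpirun resolves on the current PATH.
check_mpirun() {
    local mpirun_path
    if mpirun_path="$(command -v mpirun)"; then
        echo "mpirun found at ${mpirun_path}"
    else
        echo "mpirun not found; try sourcing setvars.sh first"
        return 1
    fi
}

# Typical use (the setvars.sh location depends on the local installation):
# source /path/to/setvars.sh
# check_mpirun
```
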

### AMD AOCC-AOCL problem

AOCC cannot currently be used to compile ABACUS itself; see [#5982](https://github.com/deepmodeling/abacus-develop/issues/5982).

However, compiling the dependencies with AOCC-AOCL is supported and usually improves ABACUS efficiency. `flang` must be avoided when compiling ELPA: the toolchain disables `flang` by default, and you can try enabling it manually by setting `--with-flang=yes` in `toolchain_amd.sh`.

- GCC toolchain with OpenMPI cannot compile LibComm v0.1.1 due to the different MPI variable type from MPICH and IntelMPI, see discussion here [#5033](https://github.com/deepmodeling/abacus-develop/issues/5033), you can switch to GCC-MPICH or Intel toolchain
Notice: ABACUS built via GCC-AOCL in the AOCC-AOCL toolchain has not yet been used with DeePKS, DeePMD and LibRI.

### OpenMPI problem

#### in EXX and LibRI

- The GCC toolchain with OpenMPI cannot compile LibComm v0.1.1 because its MPI variable types differ from those of MPICH and IntelMPI; see the discussion in [#5033](https://github.com/deepmodeling/abacus-develop/issues/5033). You can try the newest branch of LibComm:
```
git clone https://gitee.com/abacus_dft/LibComm -b MPI_Type_Contiguous_Pool
```
or pull the newest master branch of LibComm:
```
git clone https://github.com/abacusmodeling/LibComm
```
Another option is to switch to the GCC-MPICH or Intel toolchain.
- The Intel toolchain is recommended if one wants to include the EXX feature in ABACUS: it performs much better and can use more than 16 OpenMP threads to accelerate the EXX calculation.

#### OpenMPI-v5

OpenMPI version 5 is a major update and can lead to compatibility problems. To use OpenMPI version 4 (4.1.6) instead, specify `--with-openmpi-4th=yes` in *toolchain_gnu.sh*.
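
The effect of the switch can be pictured as a simple mapping from the option value to a version string. The sketch below is hypothetical (the function `select_openmpi_version` does not exist in the toolchain); the two version numbers are the ones quoted in this README.

```shell
#!/bin/bash
# Hypothetical sketch: map the value of --with-openmpi-4th to a version.
# 5.0.6 is the README default; 4.1.6 is the version 4 fallback.
select_openmpi_version() {
    case "${1:-no}" in
        yes) echo "4.1.6" ;;
        no)  echo "5.0.6" ;;
        *)   echo "expected yes or no, got '$1'" >&2; return 1 ;;
    esac
}

# Example calls:
# select_openmpi_version yes
# select_openmpi_version
```
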

### GPU version of ABACUS

For the GPU version of ABACUS (do not use the GPU installer of ELPA, which is still work in progress), add the following options in build*.sh:
@@ -242,26 +310,6 @@ When you encounter problem like `GLIBCXX_3.4.29 not found`, it is sure that your

In my tests, `gcc` > 11.3.1 is needed to enable the DeePMD feature in ABACUS.
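
The version requirement can be checked before starting a build. The sketch below compares version strings with `sort -V`, which is assumed to be available (it is provided by GNU coreutils); the helper names `version_gt` and `check_gcc_for_deepmd` are made up for this example.

```shell
#!/bin/bash
# True when $1 is strictly newer than $2 (relies on "sort -V").
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Report whether the given gcc version (default: the gcc on PATH) passes
# the "> 11.3.1" threshold stated in the text.
check_gcc_for_deepmd() {
    local found="${1:-$(gcc -dumpfullversion 2>/dev/null || echo 0)}"
    if version_gt "$found" "11.3.1"; then
        echo "gcc ${found}: new enough"
    else
        echo "gcc ${found}: too old or not found"
        return 1
    fi
}

# Example calls:
# check_gcc_for_deepmd          # inspects the system gcc
# check_gcc_for_deepmd 13.2.0   # checks an explicit version string
```

A plain string comparison would order 11.10.0 before 11.3.1, which is why a version-aware sort is used here.
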

### Intel-oneAPI problem

#### ELPA problem via Intel-oneAPI toolchain in AMD server

The default compiler for Intel-oneAPI is `icpx` and `icx`, which will cause problem when compling ELPA in AMD server. (Which is a problem and needed to have more check-out)

The best way is to change `icpx` to `icpc`, `icx` to `icc`. user can manually change it in toolchain*.sh via `--with-intel-classic=yes`

Notice: `icc` and `icpc` from Intel Classic Compiler of Intel-oneAPI is not supported for 2024.0 and newer version. And Intel-OneAPI 2023.2.0 can be found in website. See discussion here [#4976](https://github.com/deepmodeling/abacus-develop/issues/4976)

#### link problem in early 2023 version oneAPI

Sometimes Intel-oneAPI have problem to link `mpirun`,
which will always show in 2023.2.0 version of MPI in Intel-oneAPI.
Try `source /path/to/setvars.sh` or install another version of IntelMPI may help.

which is fixed in 2024.0.0 version of Intel-oneAPI,
And will not occur in Intel-MPI before 2021.10.0 (Intel-oneAPI before 2023.2.0)

More problem and possible solution can be accessed via [#2928](https://github.com/deepmodeling/abacus-develop/issues/2928)

## Advanced Installation Usage

82 changes: 82 additions & 0 deletions toolchain/build_abacus_gnu-aocl.sh
@@ -0,0 +1,82 @@
#!/bin/bash
#SBATCH -J build
#SBATCH -N 1
#SBATCH -n 16
#SBATCH -o install.log
#SBATCH -e install.err
# JamesMisaka in 2025.03.09

# Build ABACUS by amd-openmpi toolchain

# module load openmpi aocc aocl

ABACUS_DIR=..
TOOL=$(pwd)
INSTALL_DIR=$TOOL/install
source $INSTALL_DIR/setup
cd $ABACUS_DIR
ABACUS_DIR=$(pwd)
#AOCLhome=/opt/aocl # user can specify this parameter

BUILD_DIR=build_abacus_gnu
rm -rf $BUILD_DIR

PREFIX=$ABACUS_DIR
ELPA=$INSTALL_DIR/elpa-2025.01.001/cpu
CEREAL=$INSTALL_DIR/cereal-1.3.2/include/cereal
LIBXC=$INSTALL_DIR/libxc-7.0.0
RAPIDJSON=$INSTALL_DIR/rapidjson-1.1.0/
# LAPACK=$AOCLhome/lib
# SCALAPACK=$AOCLhome/lib
# FFTW3=$AOCLhome
# LIBRI=$INSTALL_DIR/LibRI-0.2.1.0
# LIBCOMM=$INSTALL_DIR/LibComm-0.1.1
# LIBTORCH=$INSTALL_DIR/libtorch-2.1.2/share/cmake/Torch
# LIBNPY=$INSTALL_DIR/libnpy-1.0.1/include
# DEEPMD=$HOME/apps/anaconda3/envs/deepmd # v3.0 might have problem

# if clang++ causes problems, switch back to g++

cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
-DCMAKE_CXX_COMPILER=clang++ \
-DMPI_CXX_COMPILER=mpicxx \
-DELPA_DIR=$ELPA \
-DCEREAL_INCLUDE_DIR=$CEREAL \
-DLibxc_DIR=$LIBXC \
-DENABLE_LCAO=ON \
-DENABLE_LIBXC=ON \
-DUSE_OPENMP=ON \
-DUSE_ELPA=ON \
-DENABLE_RAPIDJSON=ON \
-DRapidJSON_DIR=$RAPIDJSON \
# -DLAPACK_DIR=$LAPACK \
# -DSCALAPACK_DIR=$SCALAPACK \
# -DFFTW3_DIR=$FFTW3 \
# -DENABLE_DEEPKS=1 \
# -DTorch_DIR=$LIBTORCH \
# -Dlibnpy_INCLUDE_DIR=$LIBNPY \
# -DENABLE_LIBRI=ON \
# -DLIBRI_DIR=$LIBRI \
# -DLIBCOMM_DIR=$LIBCOMM \
# -DDeePMD_DIR=$DEEPMD \

# to include deepmd, the system gcc version should be >= 11.3.0 to meet the glibc requirements

cmake --build $BUILD_DIR -j `nproc`
cmake --install $BUILD_DIR 2>/dev/null

# generate abacus_env.sh
cat << EOF > "${TOOL}/abacus_env.sh"
#!/bin/bash
source $INSTALL_DIR/setup
export PATH="${PREFIX}/bin":\${PATH}
EOF

# generate information
cat << EOF
========================== usage =========================
Done!
To use the installed ABACUS version
You need to source ${TOOL}/abacus_env.sh first!
EOF
11 changes: 4 additions & 7 deletions toolchain/build_abacus_gnu.sh
@@ -4,8 +4,7 @@
#SBATCH -n 16
#SBATCH -o install.log
#SBATCH -e install.err
# install ABACUS with libxc and deepks
# JamesMisaka in 2023.08.31
# JamesMisaka in 2025.03.09

# Build ABACUS by gnu-toolchain

@@ -24,16 +23,16 @@ rm -rf $BUILD_DIR
PREFIX=$ABACUS_DIR
LAPACK=$INSTALL_DIR/openblas-0.3.28/lib
SCALAPACK=$INSTALL_DIR/scalapack-2.2.1/lib
ELPA=$INSTALL_DIR/elpa-2024.05.001/cpu
ELPA=$INSTALL_DIR/elpa-2025.01.001/cpu
FFTW3=$INSTALL_DIR/fftw-3.3.10
CEREAL=$INSTALL_DIR/cereal-1.3.2/include/cereal
LIBXC=$INSTALL_DIR/libxc-6.2.2
LIBXC=$INSTALL_DIR/libxc-7.0.0
RAPIDJSON=$INSTALL_DIR/rapidjson-1.1.0/
# LIBRI=$INSTALL_DIR/LibRI-0.2.1.0
# LIBCOMM=$INSTALL_DIR/LibComm-0.1.1
# LIBTORCH=$INSTALL_DIR/libtorch-2.1.2/share/cmake/Torch
# LIBNPY=$INSTALL_DIR/libnpy-1.0.1/include
# DEEPMD=$HOME/apps/anaconda3/envs/deepmd
# DEEPMD=$HOME/apps/anaconda3/envs/deepmd # v3.0 might have problem

cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
-DCMAKE_CXX_COMPILER=g++ \
@@ -57,8 +56,6 @@ cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
# -DLIBRI_DIR=$LIBRI \
# -DLIBCOMM_DIR=$LIBCOMM \
# -DDeePMD_DIR=$DEEPMD \
# -DTensorFlow_DIR=$DEEPMD \


# # add mkl env for libtorch to link
# if one want to install libtorch, mkl should be load in build process
12 changes: 5 additions & 7 deletions toolchain/build_abacus_intel-mpich.sh
@@ -4,13 +4,12 @@
#SBATCH -n 16
#SBATCH -o install.log
#SBATCH -e install.err
# build and install ABACUS with libxc, also can with deepks and deepmd
# JamesMisaka in 2023.08.31
# JamesMisaka in 2025.03.09

# Build ABACUS by intel-toolchain with mpich

# module load mkl compiler
# source path/to/vars.sh
# source path/to/setvars.sh

ABACUS_DIR=..
TOOL=$(pwd)
@@ -23,15 +22,15 @@ BUILD_DIR=build_abacus_intel-mpich
rm -rf $BUILD_DIR

PREFIX=$ABACUS_DIR
ELPA=$INSTALL_DIR/elpa-2024.05.001/cpu
ELPA=$INSTALL_DIR/elpa-2025.01.001/cpu
CEREAL=$INSTALL_DIR/cereal-1.3.2/include/cereal
LIBXC=$INSTALL_DIR/libxc-6.2.2
LIBXC=$INSTALL_DIR/libxc-7.0.0
RAPIDJSON=$INSTALL_DIR/rapidjson-1.1.0/
# LIBTORCH=$INSTALL_DIR/libtorch-2.1.2/share/cmake/Torch
# LIBNPY=$INSTALL_DIR/libnpy-1.0.1/include
# LIBRI=$INSTALL_DIR/LibRI-0.2.1.0
# LIBCOMM=$INSTALL_DIR/LibComm-0.1.1
# DEEPMD=$HOME/apps/anaconda3/envs/deepmd
# DEEPMD=$HOME/apps/anaconda3/envs/deepmd # v3.0 might have problem

cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
-DCMAKE_CXX_COMPILER=icpx \
@@ -53,7 +52,6 @@ cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
# -DLIBRI_DIR=$LIBRI \
# -DLIBCOMM_DIR=$LIBCOMM \
# -DDeePMD_DIR=$DEEPMD \
# -DTensorFlow_DIR=$DEEPMD \


# to include deepmd, the system gcc version should be >= 11.3.0 to meet the glibc requirements