NERSC Perlmutter
There are several ways to use simsopt on Perlmutter. If you plan to use simsopt in your own driver scripts without editing simsopt itself, you can either use the Shifter container (typically the easiest approach) or install a pre-compiled binary package. If, however, you plan to edit simsopt itself, you should install from source.
NERSC's documentation describes several approaches for using python. "Option 3" is discussed in the Shifter section below, and "Option 2" will be used in the later sections. For the later sections we will use the conda package manager to install many of the required packages.
These instructions were current as of January 28 2022.
Shifter is a "container" technology that allows you to use simsopt, VMEC, and SPEC at NERSC without compiling any code. Shifter was developed at NERSC to circumvent the security issues associated with Docker containers. Shifter allows you to use the simsopt Docker image files hosted on Docker Hub.
Shifter converts Docker images and virtual machines into a common format. After connecting to a NERSC login node, check for the simsopt Shifter images:
shifterimg images | grep simsopt
You should see multiple images similar to hiddensymmetries/simsopt:v0.7.0. If the version you are interested in is not available, you can pull it by running
shifterimg -v pull docker:hiddensymmetries/simsopt:<version_no>
where <version_no> is the version of your choice, which is referred to as a tag in Docker parlance. Once the image is pulled, the corresponding Shifter image is made available to all users at NERSC.
The master branch has the tag latest. The image shown by shifterimg images may be stale because the master branch is always changing. Always re-pull the image if you want to use the master branch, but keep in mind the results may not be reproducible. For reproducible data, users are strongly encouraged to use a container with a specific version number.
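The check-then-pull steps above can be combined into one small shell function. This is only a sketch: the function name ensure_simsopt_image is hypothetical (it is not part of Shifter or simsopt), and it assumes that shifterimg images prints the full image name on each line.

```shell
# Hypothetical helper (not part of Shifter or simsopt): make sure a given
# simsopt tag is available as a Shifter image, pulling it if necessary.
ensure_simsopt_image() {
    local tag="$1"
    local image="hiddensymmetries/simsopt:${tag}"
    # The "latest" tag tracks the master branch, so a cached copy may be
    # stale: always re-pull it. Other tags are pulled only when missing.
    # Note: -F matches a fixed substring, so v0.7.0 would also match v0.7.0.1.
    if [ "${tag}" != "latest" ] && shifterimg images | grep -qF -- "${image}"; then
        echo "${image} is already available"
    else
        shifterimg -v pull "docker:${image}"
    fi
}
```

After defining the function on a login node, ensure_simsopt_image v0.7.0 would pull that version only if it is not already listed, while ensure_simsopt_image latest always re-pulls.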
Simsopt is installed inside a python virtual environment within the simsopt Docker/Shifter container. On entry, the Docker container automatically activates the python virtual environment. However, the Shifter container does not run entrypoint commands unless explicitly told to, so the virtual environment is not activated. You must therefore use the full path of the python executable installed inside the virtual environment, /venv/bin/python.
One can run Shifter on login nodes for small serial jobs. To run a simsopt python driver script (located in your usual filesystem), you can type
shifter --image=docker:hiddensymmetries/simsopt:latest /venv/bin/python <script_name>
You can also run the simsopt Shifter container interactively, with
shifter --image=docker:hiddensymmetries/simsopt:latest /venv/bin/python
to enter the python interpreter, or
shifter --image=docker:hiddensymmetries/simsopt:latest /bin/bash
for a shell. In the latter scenario, even though you enter the container, the prompt may not change. To check if you are inside the simsopt Shifter container, you can run
cat /etc/lsb-release
The output should show DISTRIB_ID=Ubuntu along with some other lines.
Please do not abuse the interactive capability by running large scale jobs on login nodes.
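To avoid retyping the image name and the full interpreter path, the commands above can be wrapped in shell functions. The following is a minimal sketch; the names SIMSOPT_IMAGE, simsopt_python, and in_simsopt_container are hypothetical and chosen only for illustration.

```shell
# Assumed default image; replace "latest" with a fixed version tag for
# reproducible runs.
SIMSOPT_IMAGE="${SIMSOPT_IMAGE:-docker:hiddensymmetries/simsopt:latest}"

# Hypothetical wrapper: run the container's python by its full path,
# because Shifter does not activate the virtual environment on entry.
simsopt_python() {
    shifter --image="${SIMSOPT_IMAGE}" /venv/bin/python "$@"
}

# Hypothetical check: succeeds if the given release file (default
# /etc/lsb-release) reports Ubuntu, as the simsopt container does.
# Any other Ubuntu system would also pass this check.
in_simsopt_container() {
    grep -q '^DISTRIB_ID=Ubuntu' "${1:-/etc/lsb-release}" 2>/dev/null
}
```

With these defined, simsopt_python <script_name> runs a driver script, simsopt_python with no arguments opens the interpreter, and in_simsopt_container && echo inside reports whether the current shell is inside the container.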
The main reason for using Shifter is to run simsopt in parallel with multiple MPI processes on NERSC. Here is an example script for submitting a slurm job using the simsopt Shifter container:
#!/bin/bash
#SBATCH --qos=debug
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32
#SBATCH --constraint=cpu
#SBATCH --image=hiddensymmetries/simsopt:latest
srun shifter /venv/bin/python simsopt_driver
where simsopt_driver can be replaced with the name of your driver script.
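If you submit many jobs that differ only in the driver script or the number of MPI tasks, the batch script above can be generated rather than edited by hand. The sketch below assumes the same debug QOS, time limit, and image as the example; write_simsopt_job is a hypothetical name.

```shell
# Hypothetical helper: write a batch script like the one above for a given
# driver script, number of MPI tasks (default 32), and output file name.
write_simsopt_job() {
    local driver="$1"
    local ntasks="${2:-32}"
    local outfile="${3:-simsopt_job.sh}"
    cat > "${outfile}" <<EOF
#!/bin/bash
#SBATCH --qos=debug
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=${ntasks}
#SBATCH --constraint=cpu
#SBATCH --image=hiddensymmetries/simsopt:latest
srun shifter /venv/bin/python ${driver}
EOF
}
```

For example, write_simsopt_job my_driver.py 64 job.sh writes job.sh, which can then be submitted with sbatch job.sh.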
To use the simsopt Shifter container in an interactive session on the compute nodes, first run
salloc --constraint=cpu -N 1 -p debug --image=hiddensymmetries/simsopt:latest -t 00:30:00
In the above command, the --image option is passed directly to the Slurm command. The --constraint=cpu option means we want to run our job on cpu nodes (rather than gpu nodes) on Perlmutter. The -N 1 option specifies that we want one node, -p debug indicates the debug queue, and -t 00:30:00 specifies 30 minutes of allocation time for this job.
After some time, resources are allocated and you can run your jobs. If you have navigated to a clone of the simsopt repository, you can run one of the examples as
srun -n 4 shifter /venv/bin/python examples/1_Simple/tracing_fieldline.py
One Perlmutter cpu node has 128 cores, so you can use any number up to 128 in place of 4 in the above command. You can also run the parallel unit tests by entering
srun -n 4 shifter /venv/bin/python -m unittest discover -v -k mpi -s tests
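The limit on the number of MPI processes can be enforced before launching, as in the following sketch. The 128-core figure is taken from the text above; the names CORES_PER_NODE and run_simsopt_mpi are hypothetical.

```shell
# Assumption taken from the text above: one Perlmutter cpu node has 128 cores.
CORES_PER_NODE=128

# Hypothetical helper: launch a python script inside the container on one
# node with the requested number of MPI processes, refusing invalid counts.
run_simsopt_mpi() {
    local ntasks="$1"
    shift
    if [ "${ntasks}" -lt 1 ] || [ "${ntasks}" -gt "${CORES_PER_NODE}" ]; then
        echo "ntasks must be between 1 and ${CORES_PER_NODE}" >&2
        return 1
    fi
    srun -n "${ntasks}" shifter /venv/bin/python "$@"
}
```

Inside an interactive allocation, run_simsopt_mpi 4 examples/1_Simple/tracing_fieldline.py is equivalent to the srun command shown earlier.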
The remainder of this document discusses "Option 2" for using python at NERSC, based on a conda virtual environment.
First, load one of the python/3.x modules, e.g. module load python. When writing up these instructions, the module python/3.8-anaconda-2020.11 was used.
Next, create a conda virtual environment using
conda create -n 20220112-01-simsoptFromConda
conda activate 20220112-01-simsoptFromConda
Here, 20220112-01-simsoptFromConda is a name we are giving to the virtual environment, and you can replace this string with another name of your choice if you like.
Conda can install packages from several "channels". We want conda to use the defaults channel with highest priority, then the conda-forge channel with lower priority. To add conda-forge with lower priority, enter
conda config --append channels conda-forge
To confirm the channels that conda will use and their order of priority, enter conda info. The result should look like this:
active environment : 20220112-01-simsoptFromConda
active env location : /global/homes/l/landrema/.conda/envs/20220112-01-simsoptFromConda
shell level : 1
user config file : /global/homes/l/landrema/.condarc
populated config files : /global/homes/l/landrema/.condarc
conda version : 4.9.2
conda-build version : 3.20.5
python version : 3.8.5.final.0
virtual packages : __glibc=2.26=0
__unix=0=0
__archspec=1=x86_64
base environment : /usr/common/software/python/3.8-anaconda-2020.11 (read only)
channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
package cache : /usr/common/software/python/3.8-anaconda-2020.11/pkgs
/global/homes/l/landrema/.conda/pkgs
envs directories : /global/homes/l/landrema/.conda/envs
/usr/common/software/python/3.8-anaconda-2020.11/envs
platform : linux-64
user-agent : conda/4.9.2 requests/2.24.0 CPython/3.8.5 Linux/4.12.14-150.75-default sles/15 glibc/2.26
UID:GID : 43298:43298
netrc file : /global/homes/l/landrema/.netrc
offline mode : False
Note that in the channel URLs section, the repo.anaconda.com lines (which represent the defaults channel) appear above the conda.anaconda.org/conda-forge lines, indicating that conda-forge has lower priority. If you have previously added the conda-forge channel with higher priority than defaults, you can remove it using conda config --remove channels conda-forge before running conda config --append channels conda-forge.
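The channel order can also be checked without reading the full conda info output. The sketch below parses the output of conda config --show channels, in which conda lists its default channel under the name defaults. The function name defaults_before_conda_forge is hypothetical.

```shell
# Hypothetical check: read the output of "conda config --show channels" on
# stdin and succeed only if "defaults" is listed before "conda-forge".
defaults_before_conda_forge() {
    awk '
        $1 == "-" && $2 == "defaults"    { d = NR }
        $1 == "-" && $2 == "conda-forge" { f = NR }
        END { exit !(d && f && d < f) }
    '
}
```

A possible use is conda config --show channels | defaults_before_conda_forge && echo "channel order OK".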
We can install simsopt using
conda install -c hiddensymmetries simsopt
Conda will ask you to confirm that you want to proceed; press enter. Installation will take a minute or so.
Simsopt should now be installed. You can confirm using
python -c "import simsopt; print(simsopt.__version__, 'Success')"
If you do not need VMEC and MPI (for instance if you are doing stage-2 coil optimization), then you can stop here.
If you wish to edit and develop the simsopt source code, you should install simsopt from source. To do this, we first install some packages that simsopt depends on:
conda install python numpy scipy cmake ninja pybind11 jax jaxlib scikit-build matplotlib monty nptyping Deprecated randomgen ruamel.yaml sympy h5py f90nml pyevtk setuptools_scm
Press enter when you are asked if you wish to proceed; installation will take about a minute. Now, navigate to where you wish to install the simsopt repository (e.g. your home directory) and then clone the repository using
git clone https://github.com/hiddenSymmetries/simsopt.git
Enter the directory with cd simsopt. Now we compile and install the code using
pip install -e .
This performs an "editable" install, so any changes you make to the python source are immediately reflected when you import simsopt from any directory. However, if you make changes to the C++ source, you must re-run pip install -e . before those changes take effect. Note that the compiled code is put in the build directory, so if you wish to do a clean build, you can delete this directory with rm -r build before running pip install -e . again.
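The clean-build procedure can be collected into one function, sketched below. The name rebuild_simsopt is hypothetical, and the check for setup.py or pyproject.toml is only an assumed way of detecting that you are at the top of the clone.

```shell
# Hypothetical helper: rebuild simsopt from scratch after editing the C++
# source. Run it from the top of the simsopt clone.
rebuild_simsopt() {
    # Assumed sanity check: a python project root holds one of these files.
    if [ ! -f setup.py ] && [ ! -f pyproject.toml ]; then
        echo "run this from the top of the simsopt repository" >&2
        return 1
    fi
    rm -rf build        # discard previously compiled code
    pip install -e .    # editable install, recompiles the C++ code
}
```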
If you do not need VMEC and MPI (for instance if you are doing stage-2 coil optimization), then you can stop here.
To use MPI with simsopt, we must build mpi4py using the system's MPI. To do this, run
env CC=cc MPICC=cc pip install --no-cache-dir mpi4py
(The --no-cache-dir option is usually unnecessary, but it ensures that a clean build is performed in case any temporary files are left from previous unsuccessful build attempts.)
Note that on Perlmutter, simsopt modules that use MPI can only be imported in python from a compute node (either via a batch script or using srun in an interactive session), not from a login node. The reason is that Perlmutter does not allow MPI to be initialized from a login node. If you try, python will exit with an error like this:
[Thu Dec 30 06:30:08 2021] [unknown] Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(537):
MPID_Init(246).......: channel initialization failed
MPID_Init(647).......: PMI2 init failed: 1
Aborted
To check simsopt components that use MPI, you can start an interactive job using
salloc --nodes 1 --qos interactive --time 00:05:00 --constraint cpu
and, once the interactive session begins, try something like the following:
srun python -c "import simsopt.util.mpi; print('success')"
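To avoid the PMI2 error when the check is run from a login node by mistake, the test can be guarded as in this sketch. It assumes that Slurm sets the SLURM_JOB_ID environment variable inside batch jobs and salloc sessions; check_simsopt_mpi is a hypothetical name.

```shell
# Hypothetical guard: only try to import the MPI-dependent simsopt module
# when running inside a Slurm allocation (assumed to set SLURM_JOB_ID).
check_simsopt_mpi() {
    if [ -z "${SLURM_JOB_ID:-}" ]; then
        echo "not inside a Slurm allocation; start one with salloc first" >&2
        return 1
    fi
    srun python -c "import simsopt.util.mpi; print('success')"
}
```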
If you wish to use VMEC with simsopt, you must install the python-wrapped VMEC from source. (A pre-compiled VMEC module is not yet available.) To do this, we need a netcdf module loaded; either the serial or parallel version should work. We also need a cmake module. It is also necessary to unload the default module craype-hugepages2M, which is known to cause problems for python. For the instructions here, the following module commands were used (in addition to the earlier module load python):
module unload craype-hugepages2M
module load cray-netcdf-hdf5parallel cmake
which resulted in the following modules being loaded:
Currently Loaded Modulefiles:
1) modules/3.2.11.4 13) xpmem/2.2.20-7.0.1.1_4.28__g0475745.ari
2) altd/2.0 14) job/2.2.4-7.0.1.1_3.55__g36b56f4.ari
3) darshan/3.2.1 15) dvs/2.12_2.2.167-7.0.1.1_17.11__ge473d3a2
4) craype-network-aries 16) alps/6.6.58-7.0.1.1_6.30__g437d88db.ari
5) intel/19.0.3.199 17) rca/2.2.20-7.0.1.1_4.74__g8e3fb5b.ari
6) craype/2.6.2 18) atp/2.1.3
7) cray-libsci/19.06.1 19) PrgEnv-intel/6.0.5
8) udreg/2.3.2-7.0.1.1_3.61__g8175d3d.ari 20) craype-haswell
9) ugni/6.0.14.0-7.0.1.1_7.63__ge78e5b0.ari 21) cray-mpich/7.7.10
10) pmi/5.0.14 22) python/3.8-anaconda-2020.11
11) dmapp/7.1.1-7.0.1.1_4.72__g38cf134.ari 23) cray-netcdf-hdf5parallel/4.6.3.2
12) gni-headers/5.0.12.0-7.0.1.1_6.46__g3b1768f.ari 24) cmake/3.21.3
Next, you must install a few additional packages that the vmec module depends on:
pip install scikit-build f90wrap ninja
(It is better to install f90wrap with pip than conda, since if it is installed with conda, lower performance blas/lapack libraries are used for every package in the conda environment, such as numpy.) Next, navigate to where you wish to install the VMEC2000 repository (e.g. your home directory) and then clone the repository using
git clone https://github.com/hiddenSymmetries/VMEC2000.git
Change into the VMEC2000 directory. Copy the file cmake/machines/cori.json on top of the file cmake_config_file.json, replacing it. The file cmake_config_file.json should now read
{
"cmake_args": [
"-DNETCDF_INC_PATH=/opt/cray/pe/netcdf-hdf5parallel/4.6.3.2/INTEL/19.0/include",
"-DNETCDF_LIB_PATH=/opt/cray/pe/netcdf-hdf5parallel/4.6.3.2/INTEL/19.0/lib",
"-DSCALAPACK_LIB_DIR=/opt/cray/pe/libsci/19.06.1/INTEL/16.0/x86_64/lib",
"-DSCALAPACK_LIB_NAME=sci_intel_mpi",
"-DCMAKE_C_COMPILER=cc",
"-DCMAKE_CXX_COMPILER=CC",
"-DCMAKE_Fortran_COMPILER=ftn"
]
}
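The include and library paths in this file contain specific module versions, so they may not exist if different versions are loaded on your system. The sketch below checks that each directory named by a -D..._PATH= or -D..._DIR= argument exists before you build. The function name check_cmake_paths is hypothetical, and the parsing assumes paths without spaces.

```shell
# Hypothetical check: report every directory named by a -D..._PATH= or
# -D..._DIR= argument in the config file that does not exist.
check_cmake_paths() {
    local config="${1:-cmake_config_file.json}"
    local status=0
    local dir
    # Extract the text after "_PATH=" or "_DIR=" up to the closing quote.
    for dir in $(grep -oE '_(PATH|DIR)=[^"]+' "${config}" | cut -d= -f2-); do
        if [ ! -d "${dir}" ]; then
            echo "missing directory: ${dir}"
            status=1
        fi
    done
    return "${status}"
}
```

Running check_cmake_paths in the VMEC2000 directory prints nothing and succeeds when all directories exist. Otherwise, edit the paths to match the versions shown by module list.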
Now you can build and install the vmec python module by running
python setup.py install
If any problems arise during compilation, it is recommended to run rm -r _skbuild to delete temporary files from earlier unsuccessful installation attempts.
If you wish to use the booz_xform module, it can be installed using
pip install --no-cache-dir booz_xform
If you get an error resembling
break adjusted to free malloc space: 0x0000010000000000 ***
this means the craype-hugepages2M module is loaded, which interferes with the vmec python module. Run module unload craype-hugepages2M and try again.
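You can test for this module before running python instead of waiting for the error. The sketch below assumes the module system records loaded modules in the colon-separated LOADEDMODULES environment variable, as Environment Modules and Lmod normally do. The function name hugepages_loaded is hypothetical.

```shell
# Hypothetical check: succeeds if craype-hugepages2M appears in the
# colon-separated LOADEDMODULES list maintained by the module system.
hugepages_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *:craype-hugepages2M*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, if hugepages_loaded; then module unload craype-hugepages2M; fi unloads the module only when it is present.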