GitHub - alanmlewis/ml-workshop

Introduction

This workshop provides a simple introduction to machine-learned potentials and tensorial property prediction. In both cases, kernel-based Gaussian process regression is used to perform the machine learning, but most of the principles introduced translate straightforwardly to neural network models. It was originally written for the 2024 Spring School of the SFB 986. For background reading about the software packages used here, please refer to A. P. Bartok and G. Csanyi, International Journal of Quantum Chemistry 115, 1051 (2015) for gap_fit and A. Grisafi, D. M. Wilkins, G. Csanyi, and M. Ceriotti, Physical Review Letters 120, 036002 (2018) for TENSOAP.

If you encounter any problems with this workshop please report an issue through GitHub or e-mail me directly at alan.m.lewis@york.ac.uk.

Windows Users

Like many specialised software packages, the tools we will use for the workshop are not available on Windows. Therefore, users who run Windows have three options:

1. The first and best option is to log into a UNIX machine you have access to through your institution and complete the preparation steps listed below on that computer. Your institution probably has a guide to doing this using ssh in PowerShell; note that you will probably need to connect to your institution's VPN to access the machine from off-campus, so please make sure that has been set up correctly.
2. It is possible to complete the workshop on your local Windows machine by installing Windows Subsystem for Linux (WSL). However, this may be complex to set up, and requires approximately 10 GB of hard disk space. To install it, run wsl --install from a PowerShell window opened as an administrator and follow the instructions shown in the terminal. You may need to enable Virtual Machine Platform within Windows (open "Turn Windows features on or off" from the start menu and check the corresponding box), and change your BIOS settings to enable virtualisation (how this is done depends on your hardware; check here for the hotkey to enter the BIOS when you switch your computer on). You may need to run wsl --install multiple times before installation is complete. Once you have installed WSL and have a WSL terminal open, you will also need to install various software packages using the following commands, replacing your_user_name with your WSL user name:

   sudo add-apt-repository universe
   sudo apt-get install git python3-pip vim
   export PATH=$PATH:/home/your_user_name/.local/bin

   Once this is complete, continue to the preparation steps below.
3. Alternatively, I will provide USB sticks from which a simple installation of Ubuntu can be booted, with all of the necessary installations already complete. Try to find out how to enter the boot menu when you start your computer. This is usually done by tapping a hotkey repeatedly immediately after turning on your laptop. The specific hotkey depends on the manufacturer, but the hotkeys for a number of brands can be found here (note that you want the Boot Menu hotkey, not the BIOS key).
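
For the WSL option above, it can be useful to confirm that the packages are visible from the terminal before moving on. The following is a minimal sketch, not part of the workshop materials: the function name missing_tools is made up for this illustration, and it assumes that the python3-pip package provides a pip3 command. It also adds the .local/bin folder to PATH using $HOME, so the user name does not need to be typed by hand.

```shell
# Add ~/.local/bin to PATH for this session if it is not already there.
case ":$PATH:" in
    *":$HOME/.local/bin:"*) ;;
    *) PATH="$PATH:$HOME/.local/bin"; export PATH ;;
esac

# Hypothetical helper: print each given command that cannot be found on PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return $rc
}

# Check the three tools installed by the apt-get command above.
if missing_tools git pip3 vim; then
    echo "All tools found"
else
    echo "Install the tools listed above before continuing"
fi
```

The PATH change made here lasts only for the current terminal session, just like the export command in the instructions.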

Mac Users

The software we will use can run on Macs, but if you've not done much coding before you will likely need to install some features before you can begin. Alternatively, you can run the workshop on a UNIX machine you have access to through your institution; in this case see option 1 for Windows users above.

The first tool is pip, which can be installed by running

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && python get-pip.py

from the terminal.

You will also need to install gfortran. This can be downloaded from this website, making sure you choose the appropriate version for your computer hardware and operating system. If you're not sure, try downloading this installer first. Once you have downloaded it, install it in the usual way (if you need help try these instructions), and then run the gfortran application once from the Applications directory to complete the setup.

Once you have installed these two requirements, you should be able to work through the commands below successfully. Note that the first time you run a git clone command, you may be prompted to install developer tools. This is necessary, but may take around 20 minutes to complete. If you have difficulty installing gap_fit, please move on to the instructions to install TENSOAP.
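
Choosing the right gfortran installer depends on the processor type and the macOS version. As a hedged sketch (the function name describe_arch is made up for this illustration), the following reports both, using uname -m and, where it exists, sw_vers:

```shell
# Hypothetical helper: translate a machine string, such as the output of
# `uname -m`, into a readable processor family.
describe_arch() {
    case "$1" in
        arm64|aarch64) echo "ARM (Apple silicon on a Mac)" ;;
        x86_64)        echo "Intel" ;;
        *)             echo "unknown ($1)" ;;
    esac
}

echo "Hardware: $(describe_arch "$(uname -m)")"

# sw_vers exists only on macOS, so guard the call.
if command -v sw_vers >/dev/null 2>&1; then
    echo "macOS version: $(sw_vers -productVersion)"
fi
```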

Preparing for the Workshop

Clone this repository

To begin, clone this repository by running

git clone https://github.com/alanmlewis/ml-workshop

This will create a new folder called ml-workshop; change directory into that folder using cd ml-workshop. Everything else we do in the workshop will take place in this folder or a subfolder.

Creating a Python environment

This section is optional, and should definitely be skipped if you are using Windows Subsystem for Linux.

You may want to create a python environment specifically for this workshop. To do this, run

python3 -m pip install virtualenv
python3 -m virtualenv ./venv

This will create a folder called venv where we will install all of the python packages needed to run the workshop. You should activate this environment by running:

source venv/bin/activate
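
If you are unsure whether the environment is active in a given terminal, one way to check is the VIRTUAL_ENV variable, which the activate script sets. A small sketch, with venv_status being a made-up helper name:

```shell
# Hypothetical helper: report whether a virtual environment is active
# in this shell, based on the VIRTUAL_ENV variable set by activate.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: $VIRTUAL_ENV"
    else
        echo "inactive"
    fi
}

venv_status
```

The environment must be activated again in every new terminal you open.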

You should also set the number of parallel threads to use, probably to 4:

export OMP_NUM_THREADS=4
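
On a machine with fewer than four cores, a smaller value may be more appropriate. The sketch below caps the thread count at the number of cores reported by getconf; the helper name pick_threads and the fallback to one core are assumptions of this illustration, not part of the workshop setup.

```shell
# Hypothetical helper: print the smaller of the requested thread count ($1)
# and the number of available cores ($2).
pick_threads() {
    requested=$1
    available=$2
    if [ "$requested" -gt "$available" ]; then
        echo "$available"
    else
        echo "$requested"
    fi
}

# Count the online cores; fall back to 1 if getconf cannot report them.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

OMP_NUM_THREADS=$(pick_threads 4 "$cores")
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```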

Installing gap_fit

To install gap_fit, run

python3 -m pip install quippy-ase

This should install gap_fit and all of its dependencies. To test that it has installed correctly, run the following from the ml-workshop folder:

gap_fit config_file=gap_config.cfg

This program should take a few seconds to complete, during which time you should see a lot of text produced, with the words 'Bye Bye' displayed near the end of this text.
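
Because the output is long, it can be easier to save it to a file and search for the closing message than to read it in the terminal. A minimal sketch under stated assumptions: the log file name gap_fit.log and the helper log_finished are made up here, and the search pattern allows any single character between the two words in case the exact punctuation differs from the description above.

```shell
# Hypothetical helper: succeed if the log file ($1) contains the closing
# message. 'Bye.Bye' matches the two words separated by any one character.
log_finished() {
    grep -q 'Bye.Bye' "$1"
}

# Run the test only where gap_fit and its config file are available.
if command -v gap_fit >/dev/null 2>&1 && [ -f gap_config.cfg ]; then
    gap_fit config_file=gap_config.cfg > gap_fit.log 2>&1
    if log_finished gap_fit.log; then
        echo "gap_fit finished normally"
    else
        echo "Closing message not found; check gap_fit.log for errors"
    fi
fi
```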

Installing TENSOAP

To install the dependencies needed to run TENSOAP, run

python3 -m pip install Cython sympy numpy scipy ase

To install TENSOAP itself, first run:

git clone https://github.com/alanmlewis/TENSOAP.git

This will create a new folder called TENSOAP in your workshop folder. You need to change directory into TENSOAP/soapfast, and then run

make cython

to complete the installation. Finally, to get simple access to the programs contained in TENSOAP, return to the main directory for the workshop (cd ../..), and run

source TENSOAP/env.sh

To test the installation of TENSOAP, change directory into polarizability, and run the following command:

sagpr_get_PS -f ../gap_input_50.xyz -lm 0 -p -nc 200 -o PS0

This should take a few seconds, and produce four files: PS0.npy, PS0_Amat.npy, PS0_fps.npy, and PS0_natoms.npy.
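
The four files can be checked in one step rather than by eye. The following sketch defines a made-up helper, missing_outputs, that lists whichever of the expected files are absent for a given output prefix:

```shell
# Hypothetical helper: for an output prefix ($1), print the name of each
# expected .npy file that does not exist. Prints nothing if all are present.
missing_outputs() {
    prefix=$1
    for suffix in "" _Amat _fps _natoms; do
        file="${prefix}${suffix}.npy"
        [ -f "$file" ] || echo "$file"
    done
}
```

Running missing_outputs PS0 inside the polarizability folder after the test command should print nothing if the installation worked.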

Workshop Instructions

Detailed instructions for completing the full workshop are given in this pdf.

