Repository for projects in the Chalmers course "TIF345 / FYM345 Advanced simulation and machine learning" 2020.
Authors: Sebastian Holmin and Erik Andersson
The code in this repository was used to produce the following four reports:
In this project, we use the Supernova Cosmology Project (SCP) data to analyze and compare cosmological models. The SCP 2.1 dataset contains detailed measurements of the redshift, z, and distance modulus, μ, of several supernovae, which we use to perform Bayesian parameter estimation and model comparison.
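To make the link between a cosmological model and the measured quantities concrete, the following minimal sketch computes the distance modulus μ(z) predicted by a flat ΛCDM universe. The choice of flat ΛCDM, the function name `distance_modulus`, and the default values `h0=70` and `omega_m=0.3` are assumptions for illustration only; they are not taken from the reports.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def distance_modulus(z, h0=70.0, omega_m=0.3, steps=1000):
    """Distance modulus mu(z) in a flat Lambda-CDM universe (illustrative)."""
    omega_l = 1.0 - omega_m  # flatness assumption: omega_m + omega_l = 1

    def inv_e(zp):
        # 1 / E(z), where E(z) is the dimensionless Hubble rate
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    # Composite Simpson's rule (steps must be even) for the integral of 1/E
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    integral = total * h / 3.0

    d_l_mpc = (1.0 + z) * (C_KM_S / h0) * integral  # luminosity distance in Mpc
    return 5.0 * math.log10(d_l_mpc) + 25.0  # mu = 5 log10(d_L / 10 pc)


mu_half = distance_modulus(0.5)
```

A model comparison would evaluate such a prediction for each candidate model against the measured (z, μ) pairs inside a likelihood.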
In this report we investigate the issue of parameter selection and estimation in cluster expansions of alloys. To do this we use the icet package, which implements symmetry transformations to expand the mixing energy of alloy structures into
$$E_\text{mix} = \sum_\alpha N_\alpha J_\alpha,$$
where $N_\alpha$ is the number of $\alpha$-clusters per atom and $J_\alpha$ is the effective cluster interaction (ECI) of cluster $\alpha$. The ECIs are the parameters that we seek to estimate from energy data.
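Assuming i.i.d. Gaussian errors, this can be written in matrix notation for $N$ structures as $\boldsymbol{E} = \boldsymbol{X}\boldsymbol{J} + \boldsymbol{\epsilon}$, with $\boldsymbol{\epsilon} \sim \mathcal{N}(\boldsymbol{0}, \sigma^2 \boldsymbol{I})$, and thus the likelihood function is given by
$$P(\boldsymbol{E} \mid \boldsymbol{J}, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{\lVert \boldsymbol{E} - \boldsymbol{X}\boldsymbol{J} \rVert^2}{2\sigma^2}\right).$$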
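The Bayesian and Akaike information criteria are defined at the maximum likelihood, which can be shown to be equivalent, up to an additive constant, to
$$\mathrm{BIC} = N \ln(\mathrm{MSE}) + N_p \ln N, \qquad \mathrm{AIC} = N \ln(\mathrm{MSE}) + 2 N_p,$$
where MSE is the mean squared error, $N$ is the number of data points and $N_p$ is the number of parameters.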
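As a small illustration of how these criteria can be evaluated from a fit, the sketch below uses the standard Gaussian-likelihood forms N ln(MSE) + 2k for the AIC and N ln(MSE) + k ln(N) for the BIC, with additive constants dropped. The function name `information_criteria` and its arguments are hypothetical and not part of the repository code.

```python
import math


def information_criteria(residuals, n_params):
    """Return (AIC, BIC) for a least-squares fit, up to an additive constant.

    residuals: differences between reference and predicted energies
    n_params:  number of fitted parameters (e.g. non-zero ECIs)
    """
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n  # mean squared error
    aic = n * math.log(mse) + 2 * n_params
    bic = n * math.log(mse) + n_params * math.log(n)
    return aic, bic


aic, bic = information_criteria([0.1, -0.2, 0.05, 0.15], n_params=2)
```

Lower values indicate a better trade-off between fit quality and model size; the BIC penalises extra parameters more strongly than the AIC once N exceeds about 7, since ln(N) > 2 there.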
In this project we investigate the use of Gaussian processes (GPs) to model the potential energy surface (PES) for adding an Au atom to an Au slab, i.e. the difference in average energy per atom between the slab with and without the extra atom. To sample the energy we use an effective medium theory (EMT) calculator provided in the asap3 package. Sampling the energy this way is resource intensive, so GPs are likely a well-suited method for reducing the computational cost of modeling the PES.
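The idea of replacing expensive energy evaluations with a GP surrogate can be sketched in one dimension as follows. This is a minimal, dependency-free illustration with a squared-exponential kernel; the function `toy_energy` is a hypothetical stand-in for an EMT calculation, and the kernel hyperparameters are arbitrary assumptions rather than fitted values.

```python
import math


def rbf(x1, x2, length=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length) ** 2)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gp_predict(x_train, y_train, x_new, length=1.0, variance=1.0, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single point."""
    n = len(x_train)
    k = [[rbf(x_train[i], x_train[j], length, variance)
          + (noise if i == j else 0.0) for j in range(n)] for i in range(n)]
    k_star = [rbf(x_new, xi, length, variance) for xi in x_train]
    weights = solve(k, y_train)  # K^{-1} y
    mean = sum(ks * w for ks, w in zip(k_star, weights))
    v = solve(k, k_star)  # K^{-1} k_*
    var = rbf(x_new, x_new, length, variance) - sum(
        ks * vi for ks, vi in zip(k_star, v))
    return mean, max(var, 0.0)


def toy_energy(x):
    # Hypothetical stand-in for an expensive energy evaluation
    return math.sin(x) + 0.1 * x


x_train = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [toy_energy(x) for x in x_train]
mean, var = gp_predict(x_train, y_train, 2.5)
```

Only six energy evaluations are needed here; every further prediction costs a small linear solve instead of a new calculation, and the returned variance indicates where additional samples would help most.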
In this report we investigate the use of the approximate Bayesian computation (ABC) algorithm, supported by a neural network (NN), to reverse engineer the parameters for a toy model of a Galton board on a rocking ship.
A Galton board (bean machine) is a device that produces an approximately normal distribution, as described by the central limit theorem. It consists of rows of pegs where balls can roll a step to the left or the right at each row. Our toy model has 31 rows, giving 32 possible end positions for each ball, and two parameters, $\alpha$ and $s$. The parameter $\alpha$ describes a simplified moment of inertia, i.e. the tendency for a ball to keep rolling in the same direction at the next peg, while $s$ describes the incline of the rocking ship that the Galton board is situated on. The probability of a ball rolling to the right is then given by
where and and if the ball previously rolled to the left and if it rolled to the right.
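A simulator of this kind can be sketched as below. The specific rule used here, p_right = 0.5 + alpha * M + s with M = -0.5 after a step to the left and M = +0.5 after a step to the right, is an assumption made for illustration, as are the names `alpha` and `s`, the choice M = 0 at the first peg, and the clipping of the probability to [0, 1]. It should not be read as the exact model used in the report.

```python
import random


def simulate_board(alpha, s, n_balls=1000, n_rows=31, rng=None):
    """Return a histogram of end positions (length n_rows + 1)."""
    rng = rng or random.Random()
    counts = [0] * (n_rows + 1)
    for _ in range(n_balls):
        position = 0  # number of steps taken to the right so far
        m = 0.0       # no previous direction at the first peg (assumption)
        for _ in range(n_rows):
            # Assumed rule: bias from previous direction plus board incline
            p_right = 0.5 + alpha * m + s
            p_right = min(1.0, max(0.0, p_right))
            if rng.random() < p_right:
                position += 1
                m = 0.5
            else:
                m = -0.5
        counts[position] += 1
    return counts


counts = simulate_board(alpha=0.2, s=0.0, rng=random.Random(0))
```

With 31 rows the histogram has 32 bins. In this sketch a larger `alpha` widens the distribution, while a non-zero `s` shifts it to one side.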
We are faced with the task of determining an unknown (but constant) parameter $\alpha$ from a 'black box' function that simulates the final positions of 1000 balls. Each simulation is run with an unknown, randomly chosen latent variable $s$. To help us with this task we implement our own simulator, where we can control the parameters and analyse the behaviour.
Phrased in Bayesian language: given a (set of) simulated distribution(s) $x$, find the posterior distribution
$$p(\alpha \mid x) \propto p(\alpha) \int p(x \mid \alpha, s)\, p(s)\, \mathrm{d}s,$$
where we have first used the law of total probability to write the likelihood as marginalized over the latent variable $s$, and then Bayes' theorem to write the posterior in terms of the likelihood. As this is a toy model, we will assume that $s$ is drawn uniformly in its allowed interval for every run of the 'black box' simulation, and that the 'true' $\alpha$ was chosen randomly from a uniform distribution. That is, we will assume that the priors $p(\alpha)$ and $p(s)$ are uniform.
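The marginalization over the latent variable happens automatically in a rejection-style ABC scheme: both parameters are drawn from their priors, and only the accepted values of the parameter of interest are kept. The sketch below is plain rejection ABC with two hand-picked summary statistics (mean and variance of the end positions) in place of the neural network used in the report. The simulator rule, the names `alpha` and `s`, and the prior intervals are assumptions for illustration.

```python
import math
import random


def simulate_positions(alpha, s, rng, n_balls=200, n_rows=31):
    """End positions under the assumed rule p_right = 0.5 + alpha * M + s."""
    positions = []
    for _ in range(n_balls):
        position, m = 0, 0.0
        for _ in range(n_rows):
            p_right = min(1.0, max(0.0, 0.5 + alpha * m + s))
            if rng.random() < p_right:
                position, m = position + 1, 0.5
            else:
                m = -0.5
        positions.append(position)
    return positions


def summary(positions):
    """Summary statistics: mean and variance of the end positions."""
    n = len(positions)
    mean = sum(positions) / n
    var = sum((p - mean) ** 2 for p in positions) / n
    return mean, var


def abc_rejection(observed, n_draws, tolerance, rng,
                  alpha_range=(0.0, 0.5), s_range=(-0.25, 0.25)):
    """Return accepted alpha samples approximating p(alpha | observed)."""
    obs_mean, obs_var = summary(observed)
    accepted = []
    for _ in range(n_draws):
        alpha = rng.uniform(*alpha_range)  # uniform prior on alpha (assumed range)
        s = rng.uniform(*s_range)          # uniform prior on the latent s
        sim = simulate_positions(alpha, s, rng, n_balls=len(observed))
        sim_mean, sim_var = summary(sim)
        if math.hypot(sim_mean - obs_mean, sim_var - obs_var) < tolerance:
            accepted.append(alpha)         # s is discarded, i.e. marginalized
    return accepted


rng = random.Random(0)
observed = simulate_positions(0.3, 0.05, rng)
samples = abc_rejection(observed, n_draws=300, tolerance=2.0, rng=rng)
```

A histogram of `samples` then approximates the posterior; shrinking `tolerance` sharpens the approximation at the cost of fewer accepted draws.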
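The accuracy of the posterior can be increased by using the results from several experiment outcomes. Given a set $\{x_i\}$, for $i = 1, \dots, N$, the total posterior is given by
$$p(\alpha \mid \{x_i\}) \propto p(\alpha) \prod_{i=1}^{N} p(x_i \mid \alpha) \propto \prod_{i=1}^{N} p(\alpha \mid x_i),$$
where we have used the fact that $p(\alpha)$ is uniform to insert the posterior.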
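Numerically, combining per-experiment posteriors amounts to multiplying them pointwise and renormalising. A minimal sketch, assuming each posterior has been evaluated on the same grid of parameter values (the function name and the list-of-lists layout are hypothetical), is shown below. The product is accumulated in log space to avoid underflow when many experiments are combined.

```python
import math


def combine_posteriors(posteriors):
    """Multiply posteriors given on a shared grid and renormalise to sum to 1.

    posteriors: list of lists, one list of grid values per experiment
    """
    n_grid = len(posteriors[0])
    log_total = [0.0] * n_grid
    for post in posteriors:
        for j, p in enumerate(post):
            log_total[j] += math.log(p) if p > 0 else -math.inf
    peak = max(log_total)
    if peak == -math.inf:
        raise ValueError("posteriors have no common support on the grid")
    weights = [math.exp(lt - peak) for lt in log_total]  # rescale before exp
    norm = sum(weights)
    return [w / norm for w in weights]


total = combine_posteriors([[0.2, 0.5, 0.3], [0.1, 0.6, 0.3]])
```

The plain product is only valid under the uniform prior assumed above; with a non-uniform prior, each factor would first have to be divided by the prior.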