Skip to content

astro-ML/21cm-wrapper

Repository files navigation

Leaf wrapper for 21cmFAST and more v1.0
is supposed to provide an easy, high-level interface written to
minimize pain when doing doing simulations or other things with 21cmFAST.
It also provides the user with a fast multiprocessing interface for large-scale
databases with the option to also control low-level parameters.
alt text

Quickstart
(* - required for cluster deployment):

Run install.sh or alternatively follow the step-by-step guide below.

1*) Create an new virtual environment
python -m venv ./21cm-wrapper-penv

2*) Activate the environment
source ./21cm-wrapper-penv/bin/activate

3) Install the requirements
pip install -r requirements.txt && pip install ./21cmFAST -e

4) Get familiar with the wrapper by open tutorial.ipynb in your jupyter notebook
You can import the classes and load all dependencies by running
from Leaf import *

5) Give initialization parameter which are fixed for all simulations or edit the parameter.yaml file to your liking.


Data generation on cluster
For running the code on a cluster using PBS one needs to create two files:

  1. A python script which will be executed. For running simulations with uniform sampled parameters
    the script may look like this:
#import the wrapper
from Leaf import *

# To set the parameter ranges, the parameter must be given as a dict, following the dict
# structure defined in parameter.yaml. The values must be an array which is handed over
# to the samplef defined above. In this case it is [start, stop].
user_parameter = {
    "HII_DIM": 140,
    "BOX_LEN": 200,
    "N_THREADS": 1,
    "USE_INTERPOLATION_TABLES": True,
    "PERTURB_ON_HIGH_RES": True
}
flag_options = {
    "INHOMO_RECO": True,
    "USE_TS_FLUCT": True
}

astro_params = {
    "INHOMO_RECO": True
}

redshift = 5.5

# Initialize the wrapper with the above given parameters,
# for more details check the docs or the tutorial.ipynb.
sim = Leaf(user_params=user_parameter, flag_options=flag_options, astro_params=astro_params, debug=True, redshift=redshift)

astro_params_range = {
        "HII_EFF_FACTOR":[10,250],
        "L_X":[38,42],
        "NU_X_THRESH":[100,1500],
        "ION_Tvir_MIN":[4.0,5.3]
}

cosmo_params_range = {"OMm":[0.2,0.4]}


# Set the number of simulations with quantity and the number of threads (= #cores on the cluster)
if __name__ == '__main__':
    sim.run_lcsampling(samplef=sim.uniform, save=True, threads=28, quantity=2000,
                    astro_params_range=astro_params_range, cosmo_params_range=cosmo_params_range)
# That's it!

And we save it as simulator.py.

  1. A bash script which is queued using the
    qsub
    command. It may look like this:
#!/bin/bash
#PBS -l nodes=1:ppn=32:bigmemlong
#PBS -q bigmemlong
#PBS -l mem=80gb
#PBS -M YOUR_MAIL -m ae
#PBS -l walltime=RUNTIME
cd PATH_TO_YOUR_FOLDER
source ./21cm-wrapper-penv/bin/activate
python ./simulation.py

We save it as run_simulation.sh.

Now can run it via
qsub run_simulation.sh .


MCMC (THIS IS STILL EXPERIMENTAL)

For affine-invariant ensemble sampling using emcee,
a starting script may look like this:

# load class
from mcmc import *

# set first sampling range
init_params_ranges = {"astro_params": {"ION_Tvir_MIN": [4,5.3], "HII_EFF_FACTOR": [5, 100], 
                    "L_X": [30,50], "NU_X_THRESH": [100, 1500]}}

# define the log-prior
def log_prior(theta):
        T_vir, H_eff, LX, nu_x = theta
        if 5 < H_eff < 100 and 100 < nu_x < 1500 and 4 < T_vir < 5.3 and 30 < LX < 50:
            return 0
        return - np.inf

#define the log-likelihood
def log_likelihood(t,f):
    return - np.sum((t - f)**2/f)

# initialize sampler, choose nwalker twice the number of available cpu-cores
# for best performance
mcrun = mcmc(nwalker=8, z_bins = 11, debug = True,
	 log_likelihood=log_likelihood, prior=log_prior)
	 
# generate or load fiducial model
mcrun.make_fiducial(load = False)

# run the sampler
mcrun.run_aies(init_params_ranges)

The program can be queued as above.

For doing nested sampling via pymultinest,
a starting script may look like this:

# load class
from mcmc import *

# set first sampling range
init_params_ranges = {"astro_params": {"ION_Tvir_MIN": [4,5.3], "HII_EFF_FACTOR": [5, 100], 
                    "L_X": [30,50], "NU_X_THRESH": [100, 1500]}}

# input: init cube for parameter [0,1)
# pymultinest samples parameter from the prior, not probabilities
def prior_sampling(theta):
        T_vir, H_eff, LX, nu_x = theta
        T_vir = 4 + 1.3*T_vir
        H_eff = 5 + 95*H_eff
        LX = 30 + 20*LX
        nu_x = 100 + 1400*nu_x
        return np.array([T_vir, H_eff, LX, nu_x])

# define the log-likelihood
def log_likelihood(t,f):
    return - np.sum((t - f)**2/f)

# initialize sampler
mcrun = mcmc(z_bins = 11, debug = True, log_likelihood=log_likelihood, prior=prior_sampling)

# generate or load fiducial model
mcrun.make_fiducial(load = True)

# run the sampler
mcrun.run_ns(init_params_ranges)

pymultinest requires multiprocessing via MPI
To run the script multithreaded, change the line in your run.sh to

mpiexec -n NUMBER-OF-CPU-CORES python main.py



About

An attempt to code a simple wrapper for p21cmfast.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages