This repository hosts simulation code for generating XMM-Newton EPIC-pn observations, utilizing the SIXTE and SIMPUT frameworks.
The simulations are tailored to produce training datasets for deep learning algorithms aimed at super-resolution enhancement and noise reduction of XMM-Newton EPIC-pn data. For details on the deep learning implementation, refer to the xmm-superres-denoise repository. The findings from this research are documented in the paper Deep Learning-Based Super-Resolution and De-Noising for XMM-Newton Images, published in MNRAS, 517, 4054 (2022). Please cite this publication when using the simulation code for your research.
Given the complex dependencies and the need for external software, the code is best run in a Docker container. See the Installation Guide for detailed instructions on installing the necessary software, optionally using Docker.
The good thing: you'll need to fill out config.toml only once! Every step relies on this configuration file and everything will be done accordingly. The file is divided into five sections: environment, energy, download, simput and simulation.
The environment section gives paths and some general information to the code:
- working_dir: Directory where the files are saved while they are being worked on.
- output_dir: If you're not running the code on a K8s cluster as I am, you can set this directory to the same value as working_dir. If output_dir and working_dir are not the same, the code will create a tarball from the data in working_dir and move it to output_dir. This circumvents the slow transfer speed of CephFS for small files.
- log_dir: Directory where all the logs will be created. The code rotates the files every hour and keeps at most three files.
- debug: Switches off multiprocessing and produces more logs.
- verbose: Controls how much is logged.
- fail_on_error: Some errors are not necessarily critical. For example, the download of one file from the Illustris Project could fail while everything else works fine. If fail_on_error is true, such an error will crash the program.
- overwrite: Controls whether already existing files are overwritten. Setting this to false can be useful if you want to avoid overwriting data that you created previously. I recommend keeping this at true and moving any previously created files to some other directory.
- consume_data: Leave this at false if working_dir == output_dir! Otherwise files will be deleted too soon. This is mainly for my use case of running the code on K8s.
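The interplay of working_dir, output_dir and consume_data can be sketched roughly as follows. This is a minimal illustration and not the repository's actual implementation: the function name publish_results and the tarball naming are assumptions made for the example.

```python
import shutil
import tarfile
from pathlib import Path


def publish_results(working_dir: Path, output_dir: Path, consume_data: bool = False) -> Path:
    """Hypothetical helper: pack working_dir into one tarball inside output_dir."""
    working_dir, output_dir = working_dir.resolve(), output_dir.resolve()
    if working_dir == output_dir:
        if consume_data:
            # Deleting the data here would also delete the only copy of the results.
            raise ValueError("consume_data must be false when working_dir == output_dir")
        return working_dir  # nothing to pack or move

    output_dir.mkdir(parents=True, exist_ok=True)
    # Build the archive next to working_dir first, then move it as a single file.
    tmp_tarball = working_dir.parent / f"{working_dir.name}.tar.gz"
    with tarfile.open(tmp_tarball, "w:gz") as tar:
        tar.add(working_dir, arcname=working_dir.name)
    tarball = Path(shutil.move(str(tmp_tarball), str(output_dir / tmp_tarball.name)))
    if consume_data:
        shutil.rmtree(working_dir)  # only after the tarball has arrived
    return tarball
```

Moving one large archive instead of many small files is what avoids the slow small-file transfers mentioned above.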
The energy section sets the energy boundaries in keV:
- emin
- emax
The download section:
- num_processes: How many processes to run asynchronously. Recommended: as many as you have CPUs.
- top_n: How many cutouts to download for the given simulations.
- resolutions: For every downloaded cutout, create images in these resolutions.
- snapshots: A dictionary of the snapshots to use with their corresponding redshifts (see e.g. TNG100-1). The IllustrisTNG project has snapshots 0 - 99.
- simulations: Which simulations to consider and with which width. Available (with all of their sub-resolutions): TNG50, TNG100, and TNG300. The width is given as a tuple of int and str. If you don't want to use a simulation, just delete it from config.toml.
- modes: There are two modes to create FITS files: projection and slice. The values given in each list are the axes along which the projection/slicing is done. Both support the same values (x, y, z). If you want to use only one of the modes, leave the list of the other empty.
The simput section:
- num_processes: How many processes to run asynchronously.
- filter: Which XMM filter to use. Available: thin, thick, med. Only relevant for mode bkg (see below).
- zoom_range: The range from which a zoom factor is randomly chosen.
- sigma_b_range: The brightness sample range. This is based on the std of the 50 ks background, i.e. sigma_b = 10 results in a brightness of 10 times the background at 50 ks.
- offset_std: The standard deviation of the normal distribution of the offset location around the bore-sight.
- num_img_sample: How many simputs to create for previously downloaded files.
- modes: For which modes to create simputs. Available modes: img, agn, bkg (short for background). Set the value to 0 if none should be created. The mode img supports -1, which creates simputs for all of the previously downloaded files. The mode bkg only supports a boolean value (or 0 and 1 accordingly).
- instruments: Only relevant for the mode bkg: for which instruments a background simput should be created. Available instruments: epn, emos1, emos2.
The simulation section:
- num_processes: How many processes to run asynchronously.
- instrument_names: Which instruments to simulate. Available instruments: epn, emos1, emos2.
- filter: Which XMM filter to use. Available: thin, thick, med.
- res_mults: Which resolution multipliers to simulate, e.g. 1x, 2x, 4x.
- max_exposure: Maximum exposure to be simulated.
- modes: For which modes to run the instrument simulations. Available modes: img, agn, bkg (short for background). Set the value to 0 if none should be run. The modes img and agn support -1, which runs the simulation for all of the previously created simputs for that mode.
- sim_separate_ccds: Whether the individual CCDs of XMM are simulated separately or treated as "one big CCD".
- wait_time: If not 0, Out-Of-Time events will be simulated.
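The meaning of the numbers in the modes options of the simput and simulation sections (0 for none, -1 for everything, a boolean for bkg) can be summarised in a few lines. The helper name resolve_count is hypothetical, and capping the request at the number of available items is an assumption about how the scripts behave.

```python
def resolve_count(requested: int, available: int) -> int:
    """Translate a modes value into the number of items to process."""
    if requested == -1:
        return available  # -1: use everything that was created previously
    if requested < 0:
        raise ValueError(f"unsupported modes value: {requested}")
    # 0 disables the mode; larger requests are capped (assumption).
    return min(int(requested), available)


print(resolve_count(-1, 250))   # all 250 available items
print(resolve_count(0, 250))    # mode disabled
print(resolve_count(True, 1))   # boolean value, as used for bkg
```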
Assuming you have the Docker image ready (see the Installation Guide), you can run the code in the Docker container. First navigate to the root of this project:
cd /path/to/xmm-epicpn-simulator
Next we can run the Docker container and mount the current directory into the container. This way the code is available in the container and the results will be saved on your local machine. Run the following command:
docker run --rm -it -v $(pwd):/home/xmm_user/xmm-epicpn-simulator samsweere/xmm-epicpn-simulator:latest
The --rm flag removes the container (together with any anonymous volumes) once it exits; the mounted project directory on your machine is not touched. The -it flags start the container in interactive mode.
Optionally you can replace $(pwd) with the path to the directory where the xmm-epicpn-simulator code is located.
Note that the code will write the results to the xmm-epicpn-simulator/data directory, so it needs write permissions for this directory. By default the container runs with the uid 1000. If your user id is not 1000, you'll need to change the permissions of the directory. You can check your user id by running:
id -u
To change the permissions, first create the directory if it doesn't exist:
mkdir /path/to/xmm-epicpn-simulator/data
Then change the permissions:
sudo chmod -R 777 /path/to/xmm-epicpn-simulator/data
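If you are unsure whether the container user will be able to write to the data directory, a rough check like the one below can help. It is only a sketch: the function name uid_can_write is made up for this example, and group permissions and ACLs are deliberately ignored.

```python
import os
import stat


def uid_can_write(path: str, uid: int = 1000) -> bool:
    """Rough check whether `uid` may write to `path` (group bits and ACLs ignored)."""
    info = os.stat(path)
    if info.st_uid == uid:
        # The uid owns the directory: the owner write bit decides.
        return bool(info.st_mode & stat.S_IWUSR)
    # Otherwise fall back to the "others" write bit.
    return bool(info.st_mode & stat.S_IWOTH)
```

For example, uid_can_write("/path/to/xmm-epicpn-simulator/data") returns True after the chmod above, because the "others" write bit is then set.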
First make sure your working directory is the root of this project (also when running this from the docker container). I.e.
cd /path/to/xmm-epicpn-simulator
The code is split up into different steps, represented by different scripts. If you want to go through the whole process, you must execute the steps in order; the scripts are numbered accordingly. The steps are:
- 01_download_files.py: Download files from the Illustris Project. Before you can do that you'll need an API key; for this, check out their registration page. After your request has been approved, you'll see your personal API key after you log in. Please keep this key to yourself! If you do not want to re-enter the API key every time, you can add it to a .env file in the root of the project. The file should look like this:
  TNG_API_KEY="{your_api_key_here}"
  By default the code uses config.toml as the configuration file. If you want to use another file, pass its path as a command line argument (--config_path). The script will then use this file instead of config.toml.
- 02_generate_simput.py: Create SIMPUT files based on the previously downloaded files.
- 03_xmm_simulation.py: Simulate XMM-Newton for the previously created SIMPUT files. TBD: I will add at least one other satellite to choose from.
- 04_combine_simulations.py: Not used right now! I will rewrite this step to merge images from different satellites/different sensors.
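The .env file mentioned for 01_download_files.py is a plain KEY="value" file. How the script itself loads it is not described here, so the following is only a sketch using the standard library; the function names read_env_file and get_api_key are hypothetical.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY="value" lines; blank lines and comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def get_api_key(env_path=".env"):
    """Prefer an exported TNG_API_KEY, then fall back to the .env file."""
    return os.environ.get("TNG_API_KEY") or read_env_file(env_path).get("TNG_API_KEY")
```

Remember to keep the .env file out of version control, since it contains your personal key.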
Executing any of the scripts is the same for both setups:
- Set your configuration parameters as needed (see above)
- Initialise external tools:
  . ${HEADAS}/headas-init.sh && . ${SAS_DIR}/setsas.sh && . ${SIXTE}/bin/sixte-install.sh
- Choose what step you want to run
- Run
  conda run -n xmm --no-capture-output python /path/to/script
  with the needed command line arguments:
  - 01_download_files.py requires two arguments:
    - -k followed by your personal Illustris API key (see above)
    - -p followed by the path to the config.json
  - 02_generate_simput.py requires three arguments:
    - -a followed by the path to the agn_counts.cgi file in res
    - -p followed by the path to the config.json
    - -s followed by the path to res/spectrums
  - 03_xmm_simulation.py requires one argument:
    - -p followed by the path to the config.json
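The command lines for the three steps can be assembled as in the sketch below, which only builds and prints them. The flags -k, -p, -a and -s are the ones listed above; the helper name build_commands and the example paths are assumptions, and the external tools still have to be initialised in the shell before the printed commands are run.

```python
import shlex

# Prefix taken from the run instruction above.
CONDA_PREFIX = ["conda", "run", "-n", "xmm", "--no-capture-output", "python"]


def build_commands(api_key, config, agn_counts, spectrums):
    """Return one argument list per step, in execution order."""
    return [
        CONDA_PREFIX + ["01_download_files.py", "-k", api_key, "-p", config],
        CONDA_PREFIX + ["02_generate_simput.py", "-a", agn_counts, "-p", config, "-s", spectrums],
        CONDA_PREFIX + ["03_xmm_simulation.py", "-p", config],
    ]


# Placeholder values; replace them with your own key and paths.
for argv in build_commands("YOUR_API_KEY", "config.json", "res/agn_counts.cgi", "res/spectrums"):
    print(shlex.join(argv))
```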
For our XMM simulations we need sources to simulate (simulation input). In our project we are especially interested in extended sources. We take these extended sources from the Illustris TNG project (https://www.tng-project.org/), a large cosmological hydrodynamical simulation of galaxy formation containing hundreds of terabytes of simulated data. From this we take the most massive objects and create X-ray projections and X-ray slices (slices are less realistic but contain more clearly defined structure). Note that cutout files are relatively large (100-1000 MB) and can take a while to download; the code first downloads all the relevant cutouts before generating the images.
To create the simput for extended sources we use FITS image files. In order to have a realistic distribution we augment these images using:
- Brightness: The brightness of the source is internally defined as sigma_b. This is based on the std of the 50 ks background, i.e. sigma_b = 10 results in a brightness of 10 times the background at 50 ks. The images are used as a distribution of a given brightness. We determine the final brightness by taking a center cutout of the image and setting this to the brightness defined by sigma_b.
- Location: We augment the location by offsetting the image from the bore-sight. Since real XMM observations are usually focused on the center of extended sources, by default we off-center the images by a small amount around the bore-sight based on a normal distribution.
- Size (zoom): We augment the size of the extended source by artificially zooming in or out.
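The three augmentations can be sketched as below. This is an illustration under stated assumptions, not the repository's code: drawing zoom and sigma_b uniformly from their ranges, interpreting "set this to the brightness" as matching the mean of the central cutout, and the names sample_augmentation, scale_to_brightness and half_width are all choices made for this example.

```python
import random


def sample_augmentation(zoom_range, sigma_b_range, offset_std, rng=random):
    """Draw one set of augmentation parameters (uniform ranges are an assumption)."""
    return {
        "zoom": rng.uniform(*zoom_range),
        "sigma_b": rng.uniform(*sigma_b_range),
        # Offset around the bore-sight, one normal draw per axis.
        "offset": (rng.gauss(0.0, offset_std), rng.gauss(0.0, offset_std)),
    }


def scale_to_brightness(image, sigma_b, bkg_std, half_width=1):
    """Rescale a 2D list so the mean of its central cutout equals sigma_b * bkg_std."""
    rows, cols = len(image), len(image[0])
    r0, c0 = rows // 2, cols // 2
    cutout = [
        image[r][c]
        for r in range(max(0, r0 - half_width), min(rows, r0 + half_width + 1))
        for c in range(max(0, c0 - half_width), min(cols, c0 + half_width + 1))
    ]
    mean = sum(cutout) / len(cutout)
    if mean == 0:
        raise ValueError("central cutout contains no flux")
    factor = sigma_b * bkg_std / mean
    return [[value * factor for value in row] for row in image]
```

Passing a seeded random.Random instance as rng makes the drawn parameters reproducible.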
The XMM simulations are done using the SIXTE X-ray simulation software (https://www.sternwarte.uni-erlangen.de/research/sixte/). All the elements that make up an XMM observation are simulated separately: extended source, AGN and background. In the future these can be combined with a detector mask to create a realistic XMM observation. Since this is a simulation, we can also simulate observations where XMM has a higher resolution (both spatially and PSF-wise).
Many thanks to Bojan Todorkov for his code improvements and bug fixes to the codebase!