Home
ABE is written for studying the co-evolution of supermassive black holes with their host galaxies, particularly the effect of AGN feedback. It uses MassiveBlack-II (MB-II) simulation snapshots, produced with GADGET-3, to explore SMBHs and their environment. A single-redshift snapshot of this simulation is about 1 TB in size. The Python scripts, implemented with the Message Passing Interface (MPI), handle the big data of the MB-II simulation snapshots.
Cosmological Simulation of Galaxy Groups and Clusters-I: Global Effect of Feedback from Active Galactic Nuclei, ApJ 889, 60; arXiv:1911.07824
A snapshot is the set of files saved during a GADGET simulation run: a picture of the simulated universe at a particular cosmic time. In the MB-II simulation, a single snapshot comprises 1024 files of about 1 GB each. The simulation contains particle types such as dark matter, black holes (hereafter BHs), gas, and stars. To learn more about the snapshot format, read the GADGET-2 user guide. Each snapshot file contains on average about 70 BHs, but to find the gas particles around one of these BHs within a certain range, say 25 kpc, you need to look into the other 1023 files as well. In that sense, an individual snapshot file is not complete for a particular region of the simulation; only the snapshot as a whole is.
ABE-I deals with the big data in three sequential procedures. First, it scans the entire dataset, saves what it needs, and catalogues information about the saved data. Then comes a high-performance computation: using the saved information, ABE-I analyses the entire dataset in depth, but this time it looks only where it needs to look. The three main scripts are:
- CatalogMaker_MPI.py
- CatalogAnalyser_MPI.py
- DataAnalyser_MPI.py
Two other Python scripts, used for formatting the outputs of the main scripts, are:
- Combiner.py
- DataEditor.py
See the detailed Flowchart here.
CatalogMaker reads the snapshot files iteratively and saves the BHs, their surrounding gas particles, and the gas properties; a sketch of the gas-selection step follows the output list below. The user has the option to exclude BHs whose mass is below a certain cutoff value. Data is saved in Python pickle binary format. The outputs of CatalogMaker are explained below.
1. data.p: Python dictionary; keys are BH IDs, and the corresponding values are NumPy arrays holding the gas particles' internal energy, density, position-x, position-y, position-z, smoothing length, and electron abundance.
2. mass.p: Python dictionary; keys are BH IDs and the corresponding values are the BH masses.
3. bh_cat.p: Python array holding BH IDs and their positions. The data type of this array is string (to convert it to float, see lines 140-143 in CatalogAnalyser_MPI.py).
4. snap_cat.p: Python dictionary; keys are filenames and the corresponding values are arrays of the BH IDs in that file.
5. acc.p: Python dictionary; keys are BH IDs and the corresponding values are their accretion rates.
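The core selection step, gathering gas particles inside a box around each BH, might look like the following sketch (illustrative only; the function and variable names are not CatalogMaker_MPI.py's own):

```python
# Illustrative sketch of selecting gas particles within 'box' kpc of a BH
# (not CatalogMaker_MPI.py's actual code; names here are hypothetical).
import numpy as np

def select_gas_near_bh(bh_pos, gas_pos, box=25.0):
    """bh_pos: (3,) BH position; gas_pos: (N, 3) gas positions, both in kpc."""
    inside = np.all(np.abs(gas_pos - bh_pos) < box, axis=1)  # cubic selection box
    return np.where(inside)[0]                               # indices of nearby gas
```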
To run this script, use mpiexec -n 8 python CatalogMaker_MPI.py
The output of CatalogMaker is dumped by every processor. That means, if you use 8 processors to execute CatalogMaker, there will be 8 dumps for each of the above-listed files. To combine them into single files, run python Combiner.py
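In spirit, combining the per-rank dumps amounts to merging pickled dictionaries; a minimal sketch (the per-rank file naming below is an assumption, not Combiner.py's actual scheme):

```python
# Hedged sketch of merging per-core pickle dumps into one dictionary.
# The 'data_*.p' naming pattern is assumed for illustration only.
import glob
import _pickle as pl

merged = {}
for fname in sorted(glob.glob('data_*.p')):   # one dump per MPI rank (assumed names)
    with open(fname, 'rb') as f:
        merged.update(pl.load(f))             # later ranks extend/overwrite keys
with open('data.p', 'wb') as f:
    pl.dump(merged, f)
```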
See the detailed Flowchart here.
CatalogAnalyser reads the snapshot files iteratively and, for each file, checks which of the BHs in bh_cat.p have gas particles inside the selection range. If it finds gas particles, it updates the data for those BHs in data.p and then moves on to the next file. The updated data is saved after execution as dataUpd.p.
To run this script, use mpiexec -n 8 python CatalogAnalyser_MPI.py
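The per-file update described above could be schematised as follows (array shapes and names are assumptions for illustration, not the script's actual variables):

```python
# Illustrative per-file update step; gas_props is assumed to hold the 7
# properties listed for data.p, one row per property.
import numpy as np

def update_bh_data(data, bh_cat, gas_pos, gas_props, box=25.0):
    """data: {BH ID: (7, N) array}; bh_cat: {BH ID: (3,) position};
    gas_pos: (M, 3) and gas_props: (7, M) are read from one snapshot file."""
    for bh_id, bh_pos in bh_cat.items():
        inside = np.all(np.abs(gas_pos - bh_pos) < box, axis=1)
        if inside.any():                      # this file has gas near this BH
            data[bh_id] = np.concatenate([data[bh_id], gas_props[:, inside]], axis=1)
    return data
```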
DataEditor computes the temperature of the gas particles from their internal energy and electron abundance; a sketch of this conversion follows the file description below. It removes the internal energy and electron abundance arrays from dataUpd.p, adds a temperature array, and saves the result as dataEdt.p.
dataEdt.p: Python dictionary, keys are BH IDs, and the corresponding values are NumPy arrays having the gas particles' temperature, density, position-x, position-y, position-z and smoothing length.
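A minimal sketch of this conversion, assuming the standard GADGET convention of internal energy per unit mass in (km/s)^2 and electron abundance relative to hydrogen (the exact constants in DataEditor.py may differ):

```python
# Internal energy + electron abundance -> temperature for GADGET gas particles.
# Assumes u in (km/s)^2 per unit mass and ne = electrons per hydrogen atom.
import numpy as np

def gas_temperature(u, ne, xh=0.76, gamma=5.0 / 3.0):
    k_B = 1.380649e-16                              # Boltzmann constant [erg/K]
    m_p = 1.6726219e-24                             # proton mass [g]
    mu = 4.0 / (1.0 + 3.0 * xh + 4.0 * xh * ne)     # mean molecular weight
    u_cgs = u * 1.0e10                              # (km/s)^2 -> (cm/s)^2
    return (gamma - 1.0) * u_cgs * mu * m_p / k_B   # temperature [K]
```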
See the detailed Flowchart here.
DataAnalyser uses dataEdt.p, mass.p, and acc.p. BHs are binned according to their masses with a user-defined bin size; a sketch of this binning follows the file list below. After binning, it finds the average flux of each BH, the average flux of the BHs in each bin, and the stacked maps of the BHs in each bin. It saves a few data files:
1. stack.p: Python dictionary; keys are bin numbers and the corresponding values are stacked map arrays of shape (100, 100).
2. tab.p: Python dictionary; keys are bin numbers and the corresponding values are arrays holding the average flux, average mass, and average accretion rate of that bin.
3. bin.p: Python dictionary; keys are bin numbers and the corresponding values are the lists of BHs in each bin.
4. lum.p: Python dictionary; keys are BH IDs and the corresponding values are the average fluxes of the BHs.
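A hedged sketch of the log10 mass binning (the bin width corresponds to the 'bin' option in abe.ini; variable names are illustrative):

```python
# Illustrative binning of BHs by log10 mass, producing a {bin number: BH IDs}
# grouping analogous to bin.p (not DataAnalyser_MPI.py's actual code).
import numpy as np
import _pickle as pl

with open('mass.p', 'rb') as f:
    mass = pl.load(f)                                  # {BH ID: mass}

bin_size = 0.2                                         # log10 bin width from abe.ini
log_mass = np.log10(np.fromiter(mass.values(), dtype=float))
edges = np.arange(log_mass.min(), log_mass.max() + bin_size, bin_size)
bin_no = np.digitize(log_mass, edges)                  # bin number of every BH

bins = {}
for bh_id, b in zip(mass.keys(), bin_no):
    bins.setdefault(int(b), []).append(bh_id)
```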
After DataAnalyser has finished, run Combiner again to combine the lum.p dumps: python Combiner.py lum. The argument 'lum' is mandatory.
Clone this GitHub repository using
git clone https://github.com/antolonappan/ABE-I.git
Apart from the above-mentioned scripts, you will find one more Python script, initial.py, and a configuration file, abe.ini.
This script does all the initial setup for running ABE. Inside the root folder specified in the configuration file, it creates a folder named after the date of the run, which helps the user identify previous runs. If ABE is run multiple times in a day, it names the folders with date+time instead. It also edits abe.ini to specify the output and log directories.
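The directory setup could look roughly like this (an illustrative sketch only; the layout follows the [live] section of abe.ini shown below):

```python
# Sketch of creating a dated run folder with output/log subdirectories,
# in the spirit of initial.py (paths mirror the [live] section below).
import os
from datetime import datetime

root = 'Runs'                                         # root_dir from abe.ini
run_dir = os.path.join(root, datetime.now().strftime('%d-%m-%Y'))
if os.path.exists(run_dir):                           # already ran today
    run_dir = os.path.join(root, datetime.now().strftime('%d-%m-%Y_%H-%M'))
for sub in ('out/CatMak', 'out/CatAnl', 'out/DatAnl',
            'log/CatMak', 'log/CatAnl', 'log/DatAnl'):
    os.makedirs(os.path.join(run_dir, sub), exist_ok=True)
```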
[inputs]
snapshot = /home/snapshot # snapshot directory path; don't put '/' at the end.
[outputs]
root_dir = Runs # Output folder name. If it doesn't exist, initial.py creates it
[misc]
mode_run = 'automated' # Mode of running, 'automated' or 'individual'.
agn_matrix_x = 100 # X and
agn_matrix_y = 100 # Y axis no. of pixels resolution
box = 25 # Selection range of gas particles in Kpc.
mass_cutoff = T # 'T' to apply a mass cutoff; if 'F', ABE takes all BHs for analysis
lower_cutoff_value = 1e7 # If mass_cutoff is 'T', specify the cutoff values here
upper_cutoff_value = 1e10
no_of_cores = 4 # No. of cores used for running ABE.
delete_dump = T # If True ABE won't keep any dump files of data, bh_cat, snap_cat, mass, acc, lum
bin = 0.2 # Binning size in log10 scale.
[live] # This part is edited by initial.py and the other main programs. But if you are running
# ABE without abe.sh then you need to specify the output and log directories here.
cat_mak_out = Runs/9-4-2018/out/CatMak # Output directory of CatalogMaker
cat_anl_out = Runs/9-4-2018/out/CatAnl # Output directory of CatalogAnalyser
dat_anl_out = Runs/9-4-2018/out/DatAnl # Output directory of DataAnalyser
cat_mak_log = Runs/9-4-2018/log/CatMak # Log directory of CatalogMaker
cat_anl_log = Runs/9-4-2018/log/CatAnl # Log directory of CatalogAnalyser
dat_anl_log = Runs/9-4-2018/log/DatAnl # Log directory of DataAnalyser
pro_cm = 4 # If no_of_cores is not an exact divisor of the number of snapshot files,
# initial.py finds a divisor and assigns it here,
# so CatalogMaker uses only that many cores. If no_of_cores is an
# exact divisor then pro_cm = no_of_cores
last_program = DataEditor.py # After executing each program, ABE saves the program's name
# here. This helps the user run the scripts in the correct sequence.
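These settings can be read with Python's built-in configparser; a minimal sketch (section and option names as listed above):

```python
# Minimal sketch of reading abe.ini with the standard-library configparser.
import configparser

config = configparser.ConfigParser(inline_comment_prefixes=('#',))
config.read('abe.ini')

snapshot_dir = config['inputs']['snapshot']
box = config.getfloat('misc', 'box')                 # selection range in kpc
n_cores = config.getint('misc', 'no_of_cores')
apply_cutoff = config['misc']['mass_cutoff'] == 'T'
```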
After configuring abe.ini you can run ABE in two modes:
The shell script abe.sh automates the entire process. It creates a folder 'RunStatus' where it saves the standard output and standard errors of the Python scripts, so the user can see any errors that occurred while running them. Finally, it reports how long the analysis took to complete. To run abe.sh use
chmod +x abe.sh # first time only, for making it executable
./abe.sh
To run ABE in manual mode, edit abe.ini and run initial.py. After that, run the scripts in the order specified in the ABE-I section above.
ABE logs detailed running statuses for all main scripts.
CatalogMaker
After reading and saving the required information from a snapshot file, it logs the filename of the finished file. Each core logs separately, in log files whose names start with 'snap'.
CatalogAnalyser
This script has two kinds of log files: logs whose names start with 'snapshot' record information about finished snapshot files, and logs whose names start with 'process' record the status of every core.
DataAnalyser
The log files of DataAnalyser are 'process', 'MassPopulation', and 'Luminosity'. 'Process' is logged by the parent core when it initiates the different processes. 'MassPopulation' logs details of the binning: the number of bins, the ranges of bins excluded from the analysis, and so on. 'Luminosity' is logged by every core and contains the status of execution.
Untar Example.tar.gz to see an automated run of ABE. For this run, ABE used only 4 of the 1024 snapshot files. You can find all logs and data files inside. To read any data or catalog file with the extension '.p', open a Python terminal and use the following lines.
import _pickle as pl
with open('dataEdt.p', 'rb') as reader:
    data = pl.load(reader)
Copyright © 2017 Department of Physics, Presidency University.
06-11-2018
Star Catalogue has been added
05-09-2018
Three more configuration options were added to abe.ini, and the corresponding changes were made in the main scripts. The added options are the X and Y pixel resolution for computing the maps and the run mode of ABE, whether automated or individual. See abe.ini
Possible future updates
- ABE-II: Restart option. Resume the analysis from where it stopped
- ABE-X: GPU Accelerated version