FDS Profiling and Tracing
This page provides notes on profiling and tracing FDS using Score-P, the Scalable Performance Measurement Infrastructure for Parallel Codes. This software builds instrumentation into the FDS executable that generates either a profile or a trace of a given FDS simulation. A profile is an accounting of how much CPU time is spent in the various FDS routines and MPI/OpenMP calls. A trace is a time history of the CPU usage by each MPI or OpenMP process.
These notes pertain only to Linux systems.
A special entry for the Score-P compilation has been added to the makefile in FDS_Compilation:
mpi_intel_linux_64ib_scorep : FFLAGS = -m64 -O2 -ipo -traceback $(GITINFO)
mpi_intel_linux_64ib_scorep : LFLAGS = -static-intel
mpi_intel_linux_64ib_scorep : FCOMPL = scorep --mpp=mpi mpifort
mpi_intel_linux_64ib_scorep : FOPENMPFLAGS =
mpi_intel_linux_64ib_scorep : obj = fds_mpi_intel_linux_64ib_scorep
mpi_intel_linux_64ib_scorep : setup $(obj_mpi)
$(FCOMPL) $(FFLAGS) $(LFLAGS) $(FOPENMPFLAGS) -o $(obj) $(obj_mpi)
Notice that scorep --mpp=mpi has been added in front of the compiler mpifort, and the OpenMP compiler options have been removed so that the analysis focuses only on MPI functionality. Score-P can analyze OpenMP as well; if you want to do that, add back the OpenMP compiler options.
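For reference, the instrumented executable is built by invoking the new target just like any other MPI target. The following is only a sketch; the directory names and make invocation are illustrative and may differ in your checkout, and the scorep command must be on your PATH:
cd Build/mpi_intel_linux_64ib_scorep
make VPATH="../../Source" -f ../makefile mpi_intel_linux_64ib_scorep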
The first thing to do after compiling the instrumented executable is to perform a profile. Use the qfds.sh script in Utilities/Scripts to generate a PBS batch script:
qfds.sh -p 8 -v -e /home/mcgratta/FireModels_fork/fds/Build/mpi_intel_linux_64ib_scorep/fds_mpi_intel_linux_64ib_scorep case_name.fds
The -v option directs qfds.sh to simply write out the batch script rather than submit it. Save it to a file. It will look something like the following:
#!/bin/bash
#PBS -N case_name
#PBS -e /home/mcgratta/.../case_name.err
#PBS -o /home/mcgratta/.../case_name.log
#PBS -l nodes=4:ppn=2
#PBS -l walltime=999:0:0
export OMP_NUM_THREADS=1
cd /home/mcgratta/.../working_directory
export SCOREP_ENABLE_TRACING=false
export SCOREP_ENABLE_PROFILING=true
export SCOREP_EXPERIMENT_DIRECTORY=my_profile
export SCOREP_FILTERING_FILE=''
export SCOREP_TOTAL_MEMORY=500MB
echo `date`
echo "Input file: case_name.fds"
echo " Directory: `pwd`"
echo " Host: `hostname`"
/shared/openmpi_64ib/bin/mpirun --report-bindings --bind-to socket --map-by socket -np 8 /home/mcgratta/FireModels_fork/fds/Build/mpi_intel_linux_64ib_scorep/fds_mpi_intel_linux_64ib_scorep case_name.fds
The SCOREP environment variables indicate whether to do a profile or a trace, the name of the directory (under the working directory) in which to store the results, a filter file, and the amount of memory required to perform the operation. The SCOREP_FILTERING_FILE contains a list of the FDS routines to include in the profile or trace. However, when you first do a profile, leave the filter file empty (as in the script above) so that all routines are included.
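If you want to see the full list of SCOREP_* configuration variables and their defaults, the scorep-info tool that ships with Score-P can list them (assuming Score-P is on your PATH; the exact output depends on your installation):
scorep-info config-vars --full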
Execute the script:
qsub script
After the job finishes, type
scorep-score -r my_profile/profile.cubex
and you will see a summary of the CPU time usage for all routines. Issue the command
scorep-score -f $FDSSMV/Utilities/Profiling/scorep_include.filt my_profile/profile.cubex
and you will see a profile of only those routines listed in the file $FDSSMV/Utilities/Profiling/scorep_include.filt. In particular, you will see the SCOREP_TOTAL_MEMORY required to perform a trace of those same routines, which is described next.
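Newer versions of Score-P can also suggest a starting filter for you; this is optional, and the option and output file name depend on your Score-P version:
scorep-score -g my_profile/profile.cubex
This writes an initial filter file (typically named initial_scorep.filter) that excludes short, frequently called regions; it can then be edited down to the FDS routines of interest.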
Once you have finished the profile, edit your PBS script as follows:
export SCOREP_ENABLE_TRACING=true
export SCOREP_ENABLE_PROFILING=false
export SCOREP_EXPERIMENT_DIRECTORY=my_trace
export SCOREP_FILTERING_FILE=$FDSSMV/Utilities/Profiling/scorep_include.filt
export SCOREP_TOTAL_MEMORY=27MB
The SCOREP_FILTERING_FILE contains a list of the major subroutines called from the main routine. These are the routines that we want to trace; that is, to get a time history of when each MPI process begins and ends each particular routine. The total memory required is obtained from the profiling step. For FDS, the filter file might look something like this:
SCOREP_REGION_NAMES_BEGIN
EXCLUDE *
INCLUDE
MAIN__
pres.pressure_solver_
pres.compute_velocity_error_
divg.divergence_part_1_
divg.divergence_part_2_
velo.velocity_predictor_
velo.velocity_corrector_
velo.match_velocity_
velo.no_flux_
mass.density_
wall_routines.wall_bc_
velo.compute_velocity_flux_
velo.velocity_bc_
velo.viscosity_bc_
rad.compute_radiation_
fire.combustion_
mass.mass_finite_differences_
init.open_and_close_
SCOREP_REGION_NAMES_END
Run the case again, and you will now have a file in the directory my_trace called traces.otf2. The suffix otf2 (Open Trace Format 2) refers to a commonly used file format for event traces. A variety of visualization tools can read the trace and graphically display the event timeline, which helps identify bottlenecks in the code execution.
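For a quick, non-graphical check of the trace, the otf2-print utility that ships with the OTF2 library can dump the recorded events as text (assuming the OTF2 command-line tools are on your PATH):
otf2-print my_trace/traces.otf2 | less
For a graphical timeline, a viewer such as Vampir can open my_trace/traces.otf2 directly.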