This directory contains the scripts for running Pinned Loads.
Before using these scripts, you need to set the following environment variables:

```shell
export GEM5_ROOT=<path to Pinned Loads>
export M5_PATH=<path to the directory that contains full-system images and disks>
export WORKLOADS_ROOT=<path to benchmark suites root>
```
Note that the workload directory must be structured appropriately before using any of the scripts. Please refer to the directory structure described below for more information.
We use the SimPoint methodology to generate up to 10 representative intervals that accurately characterize end-to-end performance. Each interval has a gem5 checkpoint that allows Pinned Loads to resume from.
We use the simmedium input size and run full-system simulation for the region of interest (ROI). For more information about running full-system simulation in gem5, here are some useful resources:
Under the workload directory, there should be three subdirectories: `SPEC17`, `PARSEC`, and `SPLASH2X`, which correspond to the three benchmark suites we used.

The `SPEC17` directory has the following structure:

```
SPEC17
├── ckpt                            # SimPoint checkpoints
│   ├── blender_r                   # SPEC2017 application name
│   │   ├── cpt.None.SIMP-0         # SimPoint checkpoint for interval 0
│   │   │   ├── m5.cpt              # standard gem5 checkpoint file
│   │   │   └── system.physmem.store0.pmem  # standard gem5 checkpoint file
│   │   │
│   │   ├── cpt.None.SIMP-1         # SimPoint checkpoint for interval 1
│   │   │   ├── m5.cpt
│   │   │   └── system.physmem.store0.pmem
│   │   │
│   │   └── ...                     # more checkpoint dirs
│   │
│   └── ...                         # more SPEC2017 applications
│
└── run                             # directories that contain SPEC2017 binaries and input files
    ├── blender_r                   # SPEC2017 application name
    │   ├── blender_r               # binary; it should have the same name as the application
    │   ├── results.simpts          # SimPoint region information, generated by SimPoint
    │   ├── results.weights         # SimPoint weight information, generated by SimPoint
    │   └── ...                     # more misc files used by the benchmark
    │
    └── ...                         # more SPEC2017 applications
```
The `PARSEC` and `SPLASH2X` directories share the following structure:

```
PARSEC/SPLASH2X
├── blackscholes                    # PARSEC/SPLASH2X application name
│   └── simmedium                   # input size simmedium
│       └── sample                  # a directory of at least one checkpoint
│           └── cpt.2398775696500   # checkpoint at ROI begin; the number after "cpt."
│               │                   # is the simulation tick when the checkpoint is created
│               ├── m5.cpt          # standard gem5 checkpoint file
│               ├── system.pc.south_bridge.ide.disks.image.cow  # standard gem5 checkpoint file
│               └── system.physmem.store0.pmem                  # standard gem5 checkpoint file
│
└── ...                             # more PARSEC/SPLASH2X applications
```
Scripts `spec.sh` and `parsec.sh` launch Pinned Loads from a single checkpoint. `spec.sh` can run SPEC17 benchmarks, and `parsec.sh` can run both PARSEC and SPLASH2X benchmarks.

These scripts require at least three arguments:

- `-b`, `--bench`: name of the benchmark;
- `-s`, `--simpt`: checkpoint ID;
- `--suite`: path to the root of the benchmark suite.
Note that for PARSEC & SPLASH2X checkpoints, the checkpoint ID is the numeric order of the checkpoint (w.r.t. all other checkpoints in the directory) that you want to resume from, starting from 1 and sorted by simulation tick (i.e., the number after "cpt."). For example, if the directory has two checkpoints:

```
cpt.10000/  # checkpoint ID: 1
cpt.20000/  # checkpoint ID: 2
```
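If you are unsure which checkpoint a given ID maps to, you can reproduce the ordering yourself by sorting the checkpoint directories numerically by tick. A small sketch (the tick values below are made up for illustration; note that a plain alphabetical `ls` would list `cpt.10000` before `cpt.9000`):

```shell
# Create a throwaway demo directory with two fake checkpoints
mkdir -p demo/cpt.9000 demo/cpt.10000

# Sort numerically by the number after "cpt."; the line number is the checkpoint ID
(cd demo && ls -d cpt.* | sort -t. -k2 -n | nl)
```

This prints `cpt.9000` as ID 1 and `cpt.10000` as ID 2, matching the tick order rather than the alphabetical order.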
These scripts also take some optional arguments:

- `-t`, `--threat-model`: threat model; one of `Unsafe`, `Spectre`, or `Comprehensive`;
- `-H`, `--hardware`: hardware scheme; one of `Unsafe`, `Fence`, `DOM`, or `STT`;
- `-i`, `--maxinsts`: maximum number of instructions to simulate; for multi-threaded workloads, simulation exits when any thread reaches the maximum instruction count;
- `-d`, `--delay-inv`: enable Pinned Loads; only meaningful under the `Comprehensive` threat model;
- `--l2-par`: number of LLC (L2) virtual partitions; set it to 1 to enable Late Pinning, or to the number of cores (e.g., 8 cores in our evaluation) to enable Early Pinning;
- `--l1-cst`: geometry of the L1-CST, in the format `<#Entry>X<#Record>` (default: `12X8`); only meaningful in the Early Pinning mode;
- `--l2-cst`: geometry of the Dir/LLC-CST, in the format `<#Entry>X<#Record>` (default: `40X2`); only meaningful in the Early Pinning mode;
- `--ext`: prefix of the output directory, which will be `$GEM5_ROOT/output/<ext>/<bench>/<checkpoint ID>`.
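For instance, the options above can be combined to run Pinned Loads itself (Early Pinning on an 8-core setup) under the Comprehensive threat model. This is only an illustration using the documented defaults, not a prescribed configuration:

```shell
./spec.sh -b blender_r -s 1 --suite $WORKLOADS_ROOT/SPEC17 \
    -t Comprehensive -d --l2-par 8 \
    --l1-cst 12X8 --l2-cst 40X2 --ext early-pinning
```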
For `parsec.sh`, you can further specify the path to the kernel binary and disk image with:

- `--kernel`: path to the kernel binary, relative to `M5_PATH`;
- `--image`: path to the disk image, relative to `M5_PATH`.
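A `parsec.sh` invocation looks similar to `spec.sh`; the kernel and image arguments below are placeholders, so substitute whatever files your `M5_PATH` actually contains:

```shell
./parsec.sh -b blackscholes -s 1 --suite $WORKLOADS_ROOT/PARSEC \
    -t Spectre -H Fence -i 1000000 \
    --kernel <kernel binary under M5_PATH> --image <disk image under M5_PATH>
```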
Resume a SPEC17 benchmark, `blender_r`, from its 1st checkpoint, then run it with the Fence scheme under the Spectre threat model for 1 million instructions:

```shell
./spec.sh -b blender_r -s 1 -H Fence -t Spectre -i 1000000 --suite $WORKLOADS_ROOT/SPEC17
```
To reproduce Figures 7, 8, and 9 in our paper, we provide two scripts, `runner` and `plotter`, to submit jobs and process results.
Before you start, please make sure that all the required Python libraries are installed.
Executing:

```shell
./runner submit SPEC17
./runner submit PARSEC
./runner submit SPLASH2X
```

will submit all the required jobs to HTCondor. It takes about 10 minutes to finish job submission.

Note that, because of the limited computing resources we could provide for artifact evaluation, our script uses a reduced maximum instruction count for PARSEC & SPLASH2X (25 million instructions), instead of 1 billion instructions or reaching the end of the ROI, since a full run of one ROI can take several days to finish.
There will be ~3600 jobs in total (~3200 SPEC17 jobs and ~400 PARSEC & SPLASH2X jobs; the exact number depends on how many checkpoints each benchmark suite has). On average, a job takes ~20 minutes (SPEC17) or ~1 hour (PARSEC & SPLASH2X) to finish, so you can estimate the total execution time from the number of HTCondor slots on your cluster. In our environment, which has 80 slots, it takes about 1 day to finish all the jobs.
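The 1-day figure can be sanity-checked with a quick back-of-the-envelope calculation from the job counts and per-job times above (80 slots matches our setup; substitute your own slot count):

```shell
# ~3200 SPEC17 jobs x ~20 min + ~400 PARSEC/SPLASH2X jobs x ~60 min,
# spread over the available HTCondor slots
slots=80
total_min=$(( (3200 * 20 + 400 * 60) / slots ))
echo "~${total_min} minutes, ~$(( total_min / 60 )) hours"
```

With 80 slots this comes out to roughly 1100 minutes (~18 hours) of compute, consistent with "about 1 day" once scheduling overhead is included.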
You can use `./runner status` or `condor_q` to check job status.
After the jobs are finished, you can collect the results by executing:

```shell
./runner collect
```

which should generate a file named `data.csv` that contains the execution overhead for each benchmark and configuration, normalized to an unsafe baseline.
Then, executing:

```shell
./plotter perf
./plotter breakdown
```

will generate Figures 7 (`perf-spec.pdf`) & 8 (`perf-sp.pdf`), and Figure 9 (`brkd.pdf`), respectively.
Because the benchmark checkpoints can be different from the ones in our evaluation, the collected plots may not exactly match their corresponding figures in our paper, but they should be similar.