- Background and Motivation
- Description of Experiments
- Hello World Example
- Reproducing the Results
- How 🧛‍♀️ is Implemented
Count Buffy (🧛‍♀️) is a tool for performance verification of contention points (CPs) in the network.
Terminology:
- Buffer: A FIFO queue of packets with finite capacity
- Packet Stream: A sequence of packets
- Contention Point: A processing unit that reads packets from a set of input Buffers and produces a set of output packet streams
- Workload: A predicate over free input buffers of the contention point (those that are not connected to another contention point)
- Query: A predicate over the buffers of the contention point
At a high level, the input and output of a contention point are sets of packet streams. Input packet streams are fed into Buffers; the CP reads the buffered packets and does not have direct access to the input streams. The CP then processes the packets and produces output streams accordingly. A Workload/Query is an abstract representation of a set of inputs/outputs to/from the CP.
Queries are performance properties, i.e., they specify performance-related properties of the buffers, such as throughput.
The goal of the verification task is to verify that whenever we feed traffic consistent with the workload into the CP, the query (a performance property defined over the buffers) is satisfied.
For instance, assume we have a rate limiter that limits the output traffic rate to one packet per two timesteps, and that we analyze the rate limiter for 10 timesteps. Our goal is to verify that for any traffic that sends at least 5 packets (within the 10 timesteps) into the rate limiter, we see at most 5 packets in the output traffic, i.e., at most one packet every two timesteps.
Without verification, we would need to manually construct input traffic matching this specification, feed it into the rate limiter, and check whether the output is rate-limited.
With verification, we represent all possible traffic with a few lines of specification (the workload) and use the verification tool to verify that the CP satisfies the performance query.
Efficient verification of CPs with arbitrarily large buffer sizes.
The main novelty of 🧛‍♀️'s modeling is that it allows scaling the buffer size and efficiently analyzing CPs with large buffer sizes. The previous tool for performance verification, FPerf, could only handle very small and unrealistic buffer sizes (≤10).
FPerf uses a grammar to specify workloads. Using a CEGIS-style search through the space of workloads allowed by the grammar, FPerf finds a workload that always satisfies the query.
FPerf works using a search loop. Inside the loop, FPerf generates a new random workload and then verifies whether the workload always satisfies the query. To verify most of the workloads generated during the search, FPerf calls a verification engine.
We show that 🧛‍♀️ gives significantly better verification times than the formal model of the CP used in FPerf as the buffer size increases. In fact, we show that the buffer size has a negligible effect on 🧛‍♀️'s verification time, while FPerf's verification time increases significantly as the buffer size grows, becoming intractable for larger, realistic buffer sizes.
In each test case, FPerf generates a set of workloads and verifies each using Z3. We record those workloads and verify them using 🧛‍♀️, simulating a setup that keeps FPerf's search algorithm while replacing the formal model of the CP with the one designed in 🧛‍♀️.
We compare the average and 95th-percentile (P95) verification times of individual calls to the verification engine for increasing buffer sizes.
Install Docker, then set up the repository:

```bash
git clone https://github.com/all-things-networking/count-buffy.git
cd count-buffy
git submodule update --init --recursive
cp .env.example .env
docker compose pull
```

The examples directory includes a simple example of using 🧛‍♀️ to model a rate-limiter CP and verify a workload and query.
```bash
docker compose run --rm buffy hello-world
```

The output shows a sequence of input and output traffic. Each sequence shows the number of packets at each timestep. In this example, we verify that if we have at least five input packets within 10 timesteps, we always have at most 5 packets in the output.
In examples/main.cpp, we first check the satisfiability of the workload and query together, to see an example of input and output traffic that satisfies both:

```cpp
auto m = sts->check_wl_and_query_sat();
sts->print(m);
```

In the output traffic, there should be at most one packet within each two subsequent timesteps.
Next, we verify that this workload always satisfies the query. To do so, we check the unsatisfiability of the workload together with the negation of the query, i.e., we verify that it is impossible to feed at least five packets into the rate limiter and see more than five packets in the output.
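To make this check concrete, the following is a minimal standalone Z3 sketch of the same idea (ours, not the 🧛‍♀️ API), eliding the buffer and input model for brevity: we encode per-timestep output counts, assert the rate limit of at most one packet per two subsequent timesteps, and ask the solver whether more than five output packets are possible within 10 timesteps.

```cpp
#include "z3++.h"

#include <iostream>
#include <string>
#include <vector>

int main() {
    z3::context ctx;
    z3::solver slv(ctx);

    // o[t]: number of packets the rate limiter outputs at timestep t.
    std::vector<z3::expr> o;
    for (int t = 0; t < 10; ++t) {
        o.push_back(ctx.int_const(("o_" + std::to_string(t)).c_str()));
        slv.add(o[t] >= 0);
    }

    // Rate limit: at most one packet within each two subsequent timesteps.
    for (int t = 0; t + 1 < 10; ++t)
        slv.add(o[t] + o[t + 1] <= 1);

    // Negated query: strictly more than 5 output packets in total.
    z3::expr total = ctx.int_val(0);
    for (int t = 0; t < 10; ++t)
        total = total + o[t];
    slv.add(total > 5);

    // "unsat" means no output trace of the rate limiter can violate the query.
    std::cout << slv.check() << std::endl;
    return 0;
}
```

Z3 reports unsat here, which is exactly the shape of check 🧛‍♀️ performs against the full CP model.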
| Experiment id | Description | Entry Point File |
|---|---|---|
| rr | Round-Robin Scheduler | rr.cpp |
| prio | Strict-Priority Scheduler | prio.cpp |
| fq | FQ-CoDel Scheduler | fq.cpp |
| loom | Compositional | loom.cpp |
For the performance evaluation of 🧛‍♀️, we use the four case studies from FPerf. For each case study, we ran the FPerf search for various buffer sizes and recorded all workloads generated during the search. We then checked these workloads in 🧛‍♀️ to ensure we get the same SAT/UNSAT results, and compared the time of each call to the verification engine. This way, we can compare the efficiency of 🧛‍♀️ against FPerf.
Data for our experimental results include:
- Workloads generated by FPerf
- Output logs of 🧛‍♀️ that include the verification time of each workload
- Plots comparing the average verification time of workloads in 🧛‍♀️ and FPerf
To reproduce the results claimed in the paper, we provide two steps:
1. Using a pre-existing set of workloads, verify them in both FPerf and 🧛‍♀️, and draw plots comparing the times.
2. Generate fresh workloads, then proceed to step 1.

We break the reproduction into these steps because generating workloads with FPerf is much more time-consuming. As a result, we provide a set of pre-recorded workloads as part of this repository. However, we also provide the instructions and scripts for step 2, so the workloads can be regenerated.
For this experiment, we use a set of pre-recorded workloads generated during the FPerf search, stored in `data/sub_wls`. The directory includes a sub-directory for each experiment, and for each case study we have separate workload files for different buffer sizes.
We use the queries as defined in FPerf, implemented as the query method in the sub-classes of STSChecker. These workload files include an entry for each workload that FPerf generates and verifies with Z3.
The following is an example of a workload entry. The header line (starting with ###) includes FPerf's verification time (in ms) and the verification result (SAT/UNSAT). The following lines specify the workload's constraints.

```
### - Time: 45 Res: SAT
[1, 6]: cenq(3, t) < 2t
[7, 13]: SUM_[q in {0, 1, 2, 3, 4, }] cenq(q, t) < 7t
```
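Here cenq(q, t) refers to the cumulative number of packets enqueued into buffer q up to timestep t in FPerf's grammar. As a hedged illustration of how such an entry becomes solver constraints, the following sketch (the helper and the enq variables are ours, not the 🧛‍♀️ API) translates the first line above:

```cpp
#include "z3++.h"

#include <vector>

// Hypothetical helper (names are ours, not the 🧛‍♀️ API): assert the workload
// line "[1, 6]: cenq(3, t) < 2t", where enq[q][t] is an integer expression for
// the number of packets enqueued into buffer q at timestep t.
void add_cenq_constraint(z3::solver &slv, z3::context &ctx,
                         std::vector<std::vector<z3::expr>> const &enq) {
    z3::expr cenq = ctx.int_val(0);
    for (int t = 1; t <= 6; ++t) {
        cenq = cenq + enq[3][t]; // running total of enqueues into buffer 3
        slv.add(cenq < 2 * t);   // cenq(3, t) < 2t must hold for t in [1, 6]
    }
}
```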
The steps are as follows:
1. Verify each workload file in FPerf.
2. Verify each workload file in 🧛‍♀️ and record the verification times.
3. Draw plots comparing the average verification time in FPerf vs. 🧛‍♀️.
To make sure the setup is correct and everything is working, use the following command so that all subsequent steps run only for the prio case study and only for buffer sizes 10, 25, and 50:

```bash
cp .env.example.small .env
```

This way, you can quickly run all the following commands and make sure everything is working. Next, to run all experiments, use:

```bash
cp .env.example .env
```

The following script verifies all workload files for all case studies in FPerf:

```bash
docker compose run --rm buffy fperf_verify_all_workloads.sh
```

Now, we verify all workloads in 🧛‍♀️. We feed a workload file into 🧛‍♀️, and the output is a log file with one line per workload, specifying the verification time and the SAT/UNSAT result:
```
CP, buf_size, wl_idx, time_millis, solver_res
prio, 10, 0, 27, SAT
prio, 10, 1, 33, SAT
prio, 10, 2, 9, SAT
```
The log files of 🧛‍♀️ are saved into the `data/logs` directory. To start the experiment, first clear any existing 🧛‍♀️ log files:

```bash
rm -rf data/logs data/plots
```

Next, we verify all workloads in `data/sub_wls` with 🧛‍♀️:

```bash
docker compose run --rm buffy buffy_verify_all_workloads.sh
```

After a successful execution, the 🧛‍♀️ log files are saved into the `data/logs` directory. At this point, we have all the data required to compare the verification times of FPerf and 🧛‍♀️:

- `data/logs` includes the output of 🧛‍♀️
- `data/sub_wls` includes the output of FPerf (pre-recorded)

Now, we can use the following command to draw plots for all case studies:

```bash
docker compose run --rm buffy draw_all_plots.sh
```

Plots are saved in the `data/plots` directory.
We previously executed the FPerf search for each of the case studies and included the recorded workload files in `data/sub_wls`. To generate a fresh version of these files, we need to execute the FPerf search. Use the following command to generate the workloads:

```bash
docker compose run --rm buffy fperf_search_all.sh
```

After a successful execution, the workloads generated during the search are saved into `data/new_wls`. Now we can execute the previous steps again to compare the verification times using the newly generated workloads. We set `BUFFY_WLS_DIR=data/new_wls` to point to the directory containing the new version of the generated workloads.

Verify the newly generated workloads in FPerf (we can skip this step, since we just generated them using FPerf itself):

```bash
docker compose run --rm buffy fperf_verify_all_workloads.sh
```

Verify the newly generated workloads in 🧛‍♀️:

```bash
docker compose run --rm -e BUFFY_WLS_DIR=data/new_wls buffy buffy_verify_all_workloads.sh
```

Draw the plots again:

```bash
docker compose run --rm -e BUFFY_WLS_DIR=data/new_wls buffy draw_all_plots.sh
```

To explain how the implementation works, we walk through the implementation of the prio case study.
The entry point is the prio.cpp file.
The program accepts two arguments: the buffer size and a flag specifying whether to use window constraints.
To implement the strict-priority scheduler, we derive a sub-class from STSChecker.
An STSChecker includes the model of the CP as well as methods for verifying
combinations of workloads and queries.
We then use StsRunner to read workloads of the case study for a single buffer size,
verify each workload, and save the verification times in a log file.
This class includes the CP's model, plus the base workload and query for a specific case study. Each CP is implemented as a transition system, described below.
To implement the CP's behavior, we need to override three methods of STSChecker: init, trs, and out. Each of these methods returns a set of constraints to be added to the solver; a sketch of the resulting subclass shape follows.
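Concretely, a CP subclass has roughly the following shape. The signatures are taken from the walkthrough below; the sketch assumes the project's headers for STSChecker, NamedExp, ev, and ev2, and is not a verbatim copy of prio.cpp:

```cpp
// Sketch of a CP modeled as a transition system by overriding the three methods.
class PrioSTS : public STSChecker {
public:
    // Constraints on the initial backlogs b0 and state variables s0.
    vector<NamedExp> init(ev const &b0, ev const &s0);
    // Constraints relating timestep t (b, s) to timestep t+1 (bp, sp).
    vector<NamedExp> trs(ev const &b, ev const &s, ev const &bp, ev const &sp, int tp);
    // Constraints relating the current state (bv, sv) to the outputs ov at timestep t.
    vector<NamedExp> out(ev const &bv, ev const &sv, ev2 const &ov, int t);
};
```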
```cpp
vector<NamedExp> init(ev const &b0, ev const &s0)
```

This method specifies the initial state of the CP. b0 is a vector of boolean expressions, one for each input buffer of the CP. Each b0[i] specifies whether the i-th input buffer is backlogged at time zero. s0 is a vector of numeric expressions, representing arbitrary state variables of the CP at time zero.
For instance, in the round-robin scheduler, we use the state variable to record
the index of the last dequeued buffer, so we can find the buffer that should
dequeue next.
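For example, an init for such a stateful CP might pin the state variable to a valid buffer index at time zero. The following is a hedged sketch loosely modeled on that description (RRSTS and the constraint are ours, not a copy of rr.cpp):

```cpp
// Hypothetical sketch (not the actual rr.cpp): s0[0] records the index of the
// last dequeued buffer, so at time zero we only require it to be a valid index.
vector<NamedExp> RRSTS::init(const ev &b0, const ev &s0) {
    return {s0[0] >= 0 && s0[0] < num_bufs};
}
```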
For the strict priority, we don't need state variables.
The following shows the init method of the priority scheduler. Here we allow any initial state, so we return a single true constraint.

```cpp
vector<NamedExp> PrioSTS::init(const ev &b0, const ev &s0) {
    return {slv.ctx.bool_val(true)};
}
```

The trs method defines the transition relation, i.e., the relation between each pair of subsequent states.
Like init, it receives the backlog and state-variable vectors. However, it receives two pairs of such vectors: one for timestep t and another for timestep t+1. The method should return constraints that relate these variables between timesteps t and t+1.
The following method shows the transition relation of the prio scheduler:
```cpp
vector<NamedExp> PrioSTS::trs(ev const &b, ev const &s, ev const &bp, ev const &sp, int tp) {
    vector<NamedExp> rv;
    expr res = slv.ctx.bool_val(true);
    for (int i = 0; i < num_bufs; ++i) {
        for (int l = i + 1; l < num_bufs; ++l) {
            res = res && (implies(b[i], implies(b[l], bp[l])));
            rv.emplace_back(res);
        }
    }
    return rv;
}
```

Here we enforce the constraint b[i] => (b[l] => bp[l]) for all l > i.
Assuming that lower indices have higher priority, this means that whenever a higher-priority buffer is backlogged, any backlogged lower-priority buffer cannot dequeue a packet in the current timestep, i.e., it remains backlogged (b[l] => bp[l]).
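For instance, with num_bufs = 3 the nested loops generate implications for the buffer pairs (0,1), (0,2), and (1,2):

```
b[0] => (b[1] => bp[1])
b[0] => (b[2] => bp[2])
b[1] => (b[2] => bp[2])
```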
The final method for a complete implementation of the priority scheduler is the out method:

```cpp
vector<NamedExp> PrioSTS::out(const ev &bv, const ev &sv, const ev2 &ov, int t)
```

The out method includes constraints specifying how the CP's output (the packets to be dequeued) is calculated from the current state of the CP. It receives the vector of backlogs bv, the vector of state variables sv, and the vector of output packets ov for timestep t. out should return constraints that relate bv and sv with ov.
The following method shows the out method of the priority scheduler:
```cpp
vector<NamedExp> PrioSTS::out(const ev &bv, const ev &sv, const ev2 &ov, int t) {
    vector<NamedExp> rv;
    expr res = slv.ctx.bool_val(true);
    expr not_until = slv.ctx.bool_val(true);
    for (int i = 0; i < num_bufs; ++i) {
        res = res && ite(not_until && bv[i], ov[i] == 1, ov[i] == 0);
        not_until = not_until && (!bv[i]);
        rv.emplace_back(res);
    }
    return rv;
}
```

Simply, we start from the first buffer and dequeue a packet from the first non-empty buffer.
These methods together define the logic of the CP, independent of how the underlying buffers are modeled. The constraints for modeling the buffer behavior are implemented in STSChecker and shared by all derived CPs.
Finally, to complete the case study, we need to implement the base workload and query methods.
The base workload returns a set of constraints over the inputs to the buffers, and the query returns a set of constraints over the outputs from the buffers.
The case study for the priority scheduler doesn't include a base workload, so here we only explain the query.
The following method shows the query method of the priority scheduler.
The query in this case study asks whether there exists a timestep t at which the third input buffer is blocked for more than five timesteps.
```cpp
vector<NamedExp> PrioSTS::query() {
    expr res = slv.ctx.bool_val(false);
    for (int i = 0; i < timesteps - QUERY_TRESH + 1; ++i) {
        expr part = slv.ctx.bool_val(true);
        for (int j = 0; j < QUERY_TRESH; ++j) {
            part = part && B[2][i + j];
            part = part && (O[2][i + j] == 0);
        }
        res = res || part;
    }
    return {res};
}
```

To determine whether a buffer is blocked, we check that it is backlogged and does not output any packets.
The StsRunner class is responsible for reading individual workloads from a file and translating them into constraints over the input buffers. It then uses the Z3 solver to verify the model of the CP alongside the base workload, the query, and the workload generated by FPerf. We also read the expected verification result from FPerf's output, and raise an exception if our result differs from FPerf's.
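As a hedged sketch of that final cross-check (the helper and its signature are ours, not the actual StsRunner code):

```cpp
#include "z3++.h"

#include <stdexcept>
#include <string>

// Hypothetical helper (not the actual StsRunner code): run the solver and raise
// if the SAT/UNSAT result differs from the one FPerf recorded in the workload
// file header ("### - Time: 45 Res: SAT").
void check_against_fperf(z3::solver &slv, const std::string &expected) {
    z3::check_result res = slv.check();
    std::string got = (res == z3::sat)     ? "SAT"
                      : (res == z3::unsat) ? "UNSAT"
                                           : "UNKNOWN";
    if (got != expected)
        throw std::runtime_error("result " + got + " differs from FPerf's " + expected);
}
```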