-
Notifications
You must be signed in to change notification settings - Fork 6
Documentation for technical users
This is the technical documentation for the Texada project. It is intended for developers who wish to customize or extend the Texada code base. Therefore, it assumes familiarity with the usage of Texada as well as basic Linear Temporal Logic as described in the User's Page.
Texada is made up of three major components:
-
The instantiation tools are responsible for (TODO: expand) creating the set of instantiations from a set of propositions and a property type.
-
The parser module is responsible for translating a given log into a representation which can be processed by the checkers. Currently, Texada supports three different representations: vector-form, map-form, and prefix-tree-form.
-
The checker module is responsible for checking a set of instantiations over a trace-set, computing statistics for valid findings (support and support_potential) in the process.
Each of these modules are fairly self contained, and in the following sections we describe each in greater detail.
TODO: A high level description of what the Instantiation tools do, how they do it, and how the various tools relate to each other.
The parser class takes an input stream from a log file, and translates its contents into a representation which can be processed by the checker classes. The parser class hierarchy is set up so that the top-level, superclass---parser---parses events one by one from the log file, then passes them onto methods which build up the data structure. In order to support multiple forms of data representation, the methods for building up the data structures are implemented inside the subclasses. In this way, the majority of the heavy lifting involved with parsing the log file is abstracted away in the superclass, and the subclasses are responsible only for organizing the parsed elements into a checker-processable form.
In order to support both single-propositional and multi-propositional traces, Texada uses an event struct to abstract away the specific details of events.
The checker class takes a set of instantiations (produced by the instantiation tools), a set of traces (produced by the parser module), along with configuration information, and produces statistics of valid instantiations (the definition of validity will depend on the configuration of the checker).
TODO: Describe the checker class hierarchy
Within the linear checker and the boolean checker (which is used by the linear checker), there are two forms of short circuit optimizations which occur.
The first occur inside the method linear_checker::valid_instants_on_traces(...):
#!c++
vector<std::pair<map<string, string>, statistic>> valid_instants_on_traces(
...
// the meaning of validity will depend on the user inputed settings
bool valid = true;
statistic global_stat = statistic(true, 0, 0);
for (auto each = traces->begin(); each != traces->end(); each=traces->upper_bound(*each)) {
statistic trace_stat = checker.check_on_trace(instantiated_prop_type,&(each->at(0)));
// when there are multiple equivalent traces, the instantiation is only checked over one
// of them. To compute the correct statistics, multiply the statistics for each trace by
// the number of "occurrences" of said trace in the log.
int trace_count = traces->count(*each);
trace_stat.support = trace_stat.support * trace_count;
trace_stat.support_potential = trace_stat.support_potential * trace_count;
global_stat = statistic(global_stat, trace_stat);
// if confidence threshold is 1.0, an instantiation becomes invalid on the first unsatisfied trace,
// so we short-circuit after flagging the current instantiation as invalid.
if (c_settings.conf_t == 1.0 && !trace_stat.is_satisfied) {
valid = false;
break;
}
// in non-global setting, an instantiation becomes invalid on the first trace where a threshold is
// unsatisfied, so we short-circuit after flagging the current instantiation as invalid.
if (!c_settings.use_global_t && !c_settings.satisfies_thresholds(trace_stat)) {
valid = false;
break;
}
// in global 0-conf threshold setting, an instantiation becomes valid on the full trace set once
// every threshold have been satisfied, so we short-circuit after flagging the current instantiation as valid.
if (c_settings.use_global_t && c_settings.conf_t == 0.0) {
if (!c_settings.compute_full_stats && c_settings.satisfies_thresholds(global_stat)) {
valid = true;
break;
}
}
}
...
}
These short circuits occur at the level of the trace-set, and determine whether the final "validity" of the instantiation is fixed without needing to look at the remaining traces in the set.
For example, in the vanilla configuration*, the checker can short circuit on the first unsatisfied trace and know that no matter what occurs in the remaining traces, the instantiation will in the end be invalid.
The second form of short-circuit optimizations occur at the level of the individual traces, and determine when the checker can stop and short-circuit in the middle of a trace because it knows the final validity of the instantiation.
For example, suppose that we configure our checker with a confidence threshold of 0 and a support threshold of 5. The checker can then short circuit on a trace as soon as the accumulated support becomes equal to 5 and know that no matter what occurs in the remaining trace, the instantiation will in the end be valid.
The use of short-circuit optimizations in our checker while improving performance, makes certain functionalities unable to be implemented. For example, it would be nice when using a support or support-potential filter, to output along with our findings, the number of satisfied instantiations which had been filtered out. However, this would require us to know which instantiations were satisfied, which short circuiting does not allow us to do.
- We refer to the default setting of the checker as the vanilla configuration, which consists of a confidence threshold of 1.0, support and support-potential thresholds of 0, and all flags (i.e. use_global_t and compute_full_stats) turned off.