Rework result processing (#219)
nck-mlcnv authored Aug 7, 2023
1 parent a64b163 commit f44b101
Showing 109 changed files with 2,147 additions and 2,804 deletions.
12 changes: 4 additions & 8 deletions README.md
@@ -21,23 +21,19 @@ For further information visit:
- [iguana-benchmark.eu](http://iguana-benchmark.eu)
- [Documentation](http://iguana-benchmark.eu/docs/3.3/)

## Iguana Modules

Iguana consists of two modules
- **corecontroller** - this will benchmark the systems
- **resultprocessor** - this will calculate the metrics and save the raw benchmark results

### Available metrics

Per run metrics:
* Query Mixes Per Hour (QMPH)
* Number of Queries Per Hour (NoQPH)
* Number of Queries (NoQ)
* Average Queries Per Second (AvgQPS)
* Penalized Average Queries Per Second (PAvgQPS)

Per query metrics:
* Queries Per Second (QPS)
* number of successful and failed queries
* Penalized Queries Per Second (PQPS)
* Number of successful and failed queries
* result size
* queries per second
* sum of execution times
@@ -46,7 +42,7 @@ Per query metrics:

### Prerequisites

In order to run Iguana, you need to have `Java 11`, or greater, installed on your system.
In order to run Iguana, you need to have `Java 17`, or greater, installed on your system.

### Download
Download the newest release of Iguana [here](https://github.com/dice-group/IGUANA/releases/latest), or run the following in a Unix shell:
2 changes: 2 additions & 0 deletions docs/architecture.md
@@ -51,9 +51,11 @@ Per run metrics:
* Number of Queries Per Hour (NoQPH)
* Number of Queries (NoQ)
* Average Queries Per Second (AvgQPS)
* Penalized Average Queries Per Second (PAvgQPS)

Per query metrics:
* Queries Per Second (QPS)
* Penalized Queries Per Second (PQPS)
* Number of successful and failed queries
* result size
* queries per second
126 changes: 38 additions & 88 deletions docs/develop/extend-metrics.md
@@ -1,107 +1,57 @@
# Extend Metrics

To implement a new metric, create a new class that extends the abstract class `AbstractMetric`:
To implement a new metric, create a new class that extends the abstract class `Metric`:

```java
package org.benchmark.metric;

@Shorthand("MyMetric")
public class MyMetric extends AbstractMetric{
public class MyMetric extends Metric {

@Override
public void receiveData(Properties p) {
// ...
}

@Override
public void close() {
callbackClose();
super.close();

}

protected void callbackClose() {
// your close method
}
public MyMetric() {
super("name", "abbreviation", "description");
}
}
```

## Receive Data

This method will receive all the results during the benchmark.

You'll receive a few values regarding each query execution. Those values include the amount of time the execution took, if it succeeded, and if not, the reason why it failed, which can be either a timeout, a wrong HTTP Code or an unknown error.
Further on you also receive the result size of the query.

If your metric is a single value metric, you can use the `processData` method, which will automatically add each value together.
However, if your metric is query specific, you can use the `addDataToContainer` method. (Look at the [QPSMetric](https://github.com/dice-group/IGUANA/blob/master/iguana.resultprocessor/src/main/java/org/aksw/iguana/rp/metrics/impl/QPSMetric.java))
You can then choose whether the metric should be calculated for each query, worker,
or task by implementing the appropriate interface: `QueryMetric`, `WorkerMetric`, or `TaskMetric`.

Be aware that both methods will save the results for each used worker. This allows the calculation of the overall metric, as well as the metric for each worker itself.
You can also implement the `ModelWritingMetric` interface if you want your
metric to create a special RDF model that should be added to the result model.

We will stick to the single-value metric for now.


The following shows an example, that retrieves every possible value and saves the time and success:
The following gives you an example of how to work with the `data` parameter:

```java
@Override
public void receiveData(Properties p) {

double time = Double.parseDouble(p.get(COMMON.RECEIVE_DATA_TIME).toString());
long tmpSuccess = Long.parseLong(p.get(COMMON.RECEIVE_DATA_SUCCESS).toString());
long success = (tmpSuccess > 0) ? 1 : 0;
long failure = (success == 1) ? 0 : 1;
long timeout = (tmpSuccess == COMMON.QUERY_SOCKET_TIMEOUT) ? 1 : 0;
long unknown = (tmpSuccess == COMMON.QUERY_UNKNOWN_EXCEPTION) ? 1 : 0;
long wrongCode = (tmpSuccess == COMMON.QUERY_HTTP_FAILURE) ? 1 : 0;

if(p.containsKey(COMMON.RECEIVE_DATA_SIZE)) {
size = Long.parseLong(p.get(COMMON.RECEIVE_DATA_SIZE).toString());
@Override
public Number calculateTaskMetric(StresstestMetadata task, List<QueryExecutionStats>[][] data) {
for (WorkerMetadata worker : task.workers()) {
for (int i = 0; i < worker.noOfQueries(); i++) {
// This list contains every execution statistic of one query
// from the current worker
List<QueryExecutionStats> execs = data[worker.workerID()][i];
}
}
return BigInteger.ZERO;
}

Properties results = new Properties();
results.put(TOTAL_TIME, time);
results.put(TOTAL_SUCCESS, success);

Properties extra = getExtraMeta(p);
processData(extra, results);
}
```

## Close

In this method you should calculate your metric and send the results.
An example:

```java
protected void callbackClose() {
// create a model that contains the results
Model m = ModelFactory.createDefaultModel();

Property property = getMetricProperty();
Double sum = 0.0;

// Go over each worker and add metric results to model
for(Properties key : dataContainer.keySet()){
Double totalTime = (Double) dataContainer.get(key).get(TOTAL_TIME);
Integer success = (Integer) dataContainer.get(key).get(TOTAL_SUCCESS);

Double noOfQueriesPerHour = hourInMS * success * 1.0 / totalTime;
sum += noOfQueriesPerHour;
Resource subject = getSubject(key);

m.add(getConnectingStatement(subject));
m.add(subject, property, ResourceFactory.createTypedLiteral(noOfQueriesPerHour));
@Override
public Number calculateWorkerMetric(WorkerMetadata worker, List<QueryExecutionStats>[] data) {
for (int i = 0; i < worker.noOfQueries(); i++) {
// This list contains every execution statistic of one query
// from the given worker
List<QueryExecutionStats> execs = data[i];
}
return BigInteger.ZERO;
}

// Add overall metric to model
m.add(getTaskResource(), property, ResourceFactory.createTypedLiteral(sum));

// Send data to storage
sendData(m);
}
```

## Constructor

The constructor parameters are provided the same way as for the tasks. Thus, simply look at the [Extend Task](../extend-task) page.
```java
@Override
@Nonnull
public Model createMetricModel(StresstestMetadata task, Map<String, List<QueryExecutionStats>> data) {
for (String queryID : task.queryIDS()) {
// This list contains every execution statistic of one query from
// every worker that executed this query
List<QueryExecutionStats> execs = data.get(queryID);
}
}
```
40 changes: 8 additions & 32 deletions docs/develop/extend-result-storages.md
@@ -1,47 +1,23 @@
# Extend Result Storages

If you want to use a storage other than RDF, you can implement your own storage solution.

The current implementation of Iguana is highly optimized for RDF, thus we recommend working on top of the `TripleBasedStorage` class:
If you want to use a storage other than RDF, you can implement your own storage solution.

```java
package org.benchmark.storage;

@Shorthand("MyStorage")
public class MyStorage extends TripleBasedStorage {

@Override
public void commit() {

}

@Override
public String toString(){
return this.getClass().getSimpleName();
}
}
```

## Commit
public class MyStorage implements Storage {

This method should take all the current results, store them, and remove them from memory.

You can access the results at the Jena Model `this.metricResults`.

For example:

```java
@Override
public void commit() {
try (OutputStream os = new FileOutputStream(file.toString(), true)) {
RDFDataMgr.write(os, metricResults, RDFFormat.NTRIPLES);
metricResults.removeAll();
} catch (IOException e) {
LOGGER.error("Could not commit to NTFileStorage.", e);
@Override
public void storeResults(Model m) {
// method for storing model
}
}
```

The method `storeResults` will be called at the end of the task. The model passed as
a parameter contains the final results for that task.
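The write-once contract described above can be sketched without Jena: a hypothetical storage receives the final, already serialized results at the end of the task and persists them in one go. The class, the method signature, and the plain-string payload standing in for the RDF model are all illustrative, not Iguana's actual API:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileStorageSketch {

    private final Path file;

    public FileStorageSketch(Path file) {
        this.file = file;
    }

    /** Called once at the end of a task with the final serialized results. */
    public void storeResults(String serializedModel) {
        try {
            // Write everything in one go; nothing is buffered between calls.
            Files.writeString(file, serializedModel);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real implementation would serialize the Jena `Model` instead, e.g. with `RDFDataMgr.write(...)` as shown in the `commit` example above.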

## Constructor

The constructor parameters are provided the same way as for the tasks. Thus, simply look at the [Extend Task](../extend-task) page.
2 changes: 1 addition & 1 deletion docs/download.md
@@ -2,7 +2,7 @@

## Prerequisites

You need to have Java 11 or higher installed.
You need to have Java 17 or higher installed.


In Ubuntu, you can install it by executing the following command:
23 changes: 13 additions & 10 deletions docs/shorthand-mapping.md
@@ -1,6 +1,6 @@
| Shorthand | Class Name |
|------------------------|-----------------------------------------------------------|
| Stresstest | `org.aksw.iguana.cc.tasks.impl.Stresstest` |
| Stresstest | `org.aksw.iguana.cc.tasks.stresstest.Stresstest` |
| ---------- | ------- |
| lang.RDF | `org.aksw.iguana.cc.lang.impl.RDFLanguageProcessor` |
| lang.SPARQL | `org.aksw.iguana.cc.lang.impl.SPARQLLanguageProcessor` |
@@ -15,13 +15,16 @@
| CLIInputPrefixWorker | `org.aksw.iguana.cc.worker.impl.CLIInputPrefixWorker` |
| MultipleCLIInputWorker | `org.aksw.iguana.cc.worker.impl.MultipleCLIInputWorker` |
| ---------- | ------- |
| NTFileStorage | `org.aksw.iguana.rp.storages.impl.NTFileStorage` |
| RDFFileStorage | `org.aksw.iguana.rp.storages.impl.RDFFileStorage` |
| TriplestoreStorage | `org.aksw.iguana.rp.storages.impl.TriplestoreStorage` |
| NTFileStorage | `org.aksw.iguana.cc.tasks.stresstest.storage.impl.NTFileStorage` |
| RDFFileStorage | `org.aksw.iguana.cc.tasks.stresstest.storage.impl.RDFFileStorage` |
| TriplestoreStorage | `org.aksw.iguana.cc.tasks.stresstest.storage.impl.TriplestoreStorage` |
| ---------- | ------- |
| QPS | `org.aksw.iguana.rp.metrics.impl.QPSMetric` |
| AvgQPS | `org.aksw.iguana.rp.metrics.impl.AvgQPSMetric` |
| NoQ | `org.aksw.iguana.rp.metrics.impl.NoQMetric` |
| NoQPH | `org.aksw.iguana.rp.metrics.impl.NoQPHMetric` |
| QMPH | `org.aksw.iguana.rp.metrics.impl.QMPHMetric` |
| EachQuery | `org.aksw.iguana.rp.metrics.impl.EQEMetric` |
| QPS | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.QPS` |
| PQPS | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.PQPS` |
| AvgQPS | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.AvgQPS` |
| PAvgQPS | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.PAvgQPS` |
| NoQ | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.NoQ` |
| NoQPH | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.NoQPH` |
| QMPH | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.QMPH` |
| AES | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.AggregatedExecutionStatistics` |
| EachQuery | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.EachExecutionStatistic` |
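These shorthands stand in for the fully qualified class names wherever a `className` is expected in the benchmark configuration. Assuming the mapping above, the two spellings in this illustrative snippet refer to the same metric:

```yaml
metrics:
  - className: "QPS"
  - className: "org.aksw.iguana.cc.tasks.stresstest.metrics.impl.QPS"
```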
16 changes: 8 additions & 8 deletions docs/usage/configuration.md
@@ -25,7 +25,7 @@ A connection has the following items:
* `updateEndpoint` - if your HTTP endpoint is an HTTP POST endpoint, you can set it with this item (optional)
* `user` - for authentication purposes (optional)
* `password` - for authentication purposes (optional)
* `version` - sets the version of the tested triplestore; if this is set, the resource URI will be ires:name-version (optional)
* `version` - sets the version of the tested triplestore (optional)

At first, it might be confusing to set up both an `endpoint` and an `updateEndpoint`, but this is useful when you want your test to perform read and write operations simultaneously, for example, to test the impact of updates on the read performance of your triple store.
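Put together, a connection entry might look like the following sketch (the name, URLs, and credentials are illustrative):

```yaml
connections:
  - name: "my-triplestore"
    endpoint: "http://localhost:3030/ds/sparql"
    updateEndpoint: "http://localhost:3030/ds/update"  # optional
    user: "admin"        # optional
    password: "secret"   # optional
    version: "1.2.3"     # optional
```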

@@ -190,17 +190,18 @@ The `metrics` setting lets Iguana know what metrics you want to include in the result file.
Iguana supports the following metrics:

* Queries Per Second (`QPS`)
* Penalized Queries Per Second (`PQPS`)
* Average Queries Per Second (`AvgQPS`)
* Penalized Average Queries Per Second (`PAvgQPS`)
* Query Mixes Per Hour (`QMPH`)
* Number of Queries successfully executed (`NoQ`)
* Number of Queries per Hour (`NoQPH`)
* Each query execution (`EachQuery`) - experimental
* Each Execution Statistic (`EachQuery`)
* Aggregated Execution Statistics (`AES`)

For more details on each of the metrics have a look at the [Metrics](../metrics) page.
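To make the relationship between these metrics concrete, here is a small, self-contained sketch of the formulas commonly behind `QPS` and `PQPS`. This is not Iguana's implementation; in particular, the penalty handling (substituting a fixed penalty time for each failed execution) and all names here are assumptions for illustration:

```java
import java.util.List;

public class MetricSketch {

    /** QPS: number of successful executions divided by their total time in seconds. */
    static double qps(long successfulExecutions, double totalTimeSeconds) {
        return successfulExecutions / totalTimeSeconds;
    }

    /**
     * Penalized QPS: every execution counts, but a failed one contributes a
     * fixed penalty duration instead of its measured time.
     */
    static double pqps(List<Double> timesSeconds, List<Boolean> succeeded, double penaltySeconds) {
        double totalTime = 0.0;
        for (int i = 0; i < timesSeconds.size(); i++) {
            totalTime += succeeded.get(i) ? timesSeconds.get(i) : penaltySeconds;
        }
        return timesSeconds.size() / totalTime;
    }

    public static void main(String[] args) {
        // 8 successful executions in 4 seconds -> 2.0 QPS
        System.out.println(qps(8, 4.0));
        // one 1 s success plus one failure penalized with 3 s -> 2 / 4 s = 0.5 PQPS
        System.out.println(pqps(List.of(1.0, 9.9), List.of(true, false), 3.0));
    }
}
```

`AvgQPS` and `PAvgQPS` then average such per-query values, while `NoQPH` and `QMPH` scale query and query-mix counts to an hour.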

The `metrics` setting is optional and the default is set to every available metric, except `EachQuery`.

Let's look at an example:
The `metrics` setting is optional; by default it is set to the following:

```yaml
metrics:
@@ -209,11 +210,10 @@ metrics:
- className: "QMPH"
- className: "NoQ"
- className: "NoQPH"
- className: "AES"
```

In this case we use every metric that Iguana has implemented. This is the default.

However, you can also just use a subset of these metrics:
You can also use a subset of these metrics:

```yaml
metrics:
```
2 changes: 1 addition & 1 deletion docs/usage/getting-started.md
Expand Up @@ -22,7 +22,7 @@ Iguana will then let every Worker execute these queries against the endpoint.

## Prerequisites

You need to have Java 11 or higher installed.
You need to have Java 17 or higher installed.

In Ubuntu you can install it by executing the following command:
```bash
```
