Rework query handling #184

Merged (47 commits, Apr 11, 2023)

Changes from all commits
3c5291c
QuerySource
frensing Oct 24, 2022
3134afb
QuerySet
frensing Oct 24, 2022
2b9266f
QuerySelector
frensing Oct 24, 2022
c534cd3
FileSeparatorQuerySource
frensing Oct 24, 2022
f829e41
FolderQuerySource
frensing Oct 24, 2022
f533fea
FileLineQuerySourceTest
frensing Oct 24, 2022
7ee76ec
remove getContent from QuerySet
frensing Oct 24, 2022
746c2ec
QuerySelector and LinearQuerySelectorTest
frensing Oct 24, 2022
315df06
QueryHandler
frensing Oct 24, 2022
884cd81
QueryHandler add folder test
frensing Oct 24, 2022
aecae1a
add hashcode and triplestats generation
frensing Oct 25, 2022
b0e23f2
gitignore
frensing Oct 25, 2022
f2a5699
cleanup
frensing Oct 25, 2022
e6de3d6
fix order evaluation
frensing Oct 27, 2022
75fece4
refactor httpworkers to use new query handling
frensing Oct 31, 2022
0d17a69
refactor cliworkers to use new query handling
frensing Oct 31, 2022
f8d53b0
refactor stresstest
frensing Oct 31, 2022
6b5a7be
PatternHandler and remove of old QueryHandler
frensing Nov 4, 2022
c23cff3
PatternHandler and remove of old QueryHandler
frensing Nov 4, 2022
2228f4f
update QueryHandler to init pattern
frensing Nov 4, 2022
708a3f7
add QueryHandlerTest with PatternHandler
frensing Nov 4, 2022
4f03a62
add requested changes
frensing Nov 7, 2022
39fa000
rm tripleStats todo
frensing Nov 7, 2022
3c1b9f0
rm unused method
frensing Nov 7, 2022
b145eb3
override hashCode method
frensing Nov 7, 2022
35a8d42
documentation and version update
frensing Nov 13, 2022
3be9745
javadoc
frensing Nov 13, 2022
e9b40b1
refactor TypedFactory create method
frensing Nov 13, 2022
f6281fb
reformat schema-file
nck-mlcnv Mar 8, 2023
b4ce0f1
update configuration file schema
nck-mlcnv Mar 8, 2023
e91b181
added endpoint as requirement for pattern key in the schema file
nck-mlcnv Mar 17, 2023
3d1400a
add missing dependency for PatternHandler
nck-mlcnv Mar 17, 2023
6a6e9e8
add javadocs for PatternHandler
nck-mlcnv Mar 17, 2023
1897018
fix the tutorial page in the documentation
nck-mlcnv Mar 18, 2023
09ad332
update README.md
nck-mlcnv Mar 22, 2023
7e79bb4
update docs
nck-mlcnv Mar 23, 2023
8950776
ignore cli tests
nck-mlcnv Mar 31, 2023
f7db189
update ci badge
nck-mlcnv Mar 31, 2023
06ba515
remove parenthesis
nck-mlcnv Apr 5, 2023
5f6f6ce
fix spelling in javadocs
nck-mlcnv Apr 5, 2023
b28283d
change .gitignore
nck-mlcnv Apr 5, 2023
6b07d04
remove SPARQLWorker
nck-mlcnv Apr 5, 2023
99d40de
fix javadocs and rename the method "initPattern"
nck-mlcnv Apr 5, 2023
3228e0c
remove unnecessary abstraction
nck-mlcnv Apr 5, 2023
470b938
rename abstract classes
nck-mlcnv Apr 5, 2023
9bf549d
rename QuerySet to QueryList and the method getQueryAtPos to getQuery
nck-mlcnv Apr 5, 2023
92498c8
update the query handling development doc page
nck-mlcnv Apr 7, 2023
47 changes: 10 additions & 37 deletions .gitignore
@@ -5,8 +5,8 @@ tmp_ser

**/queryInstances/*

# Created by https://www.toptal.com/developers/gitignore/api/java,maven,intellij,eclipse
# Edit at https://www.toptal.com/developers/gitignore?templates=java,maven,intellij,eclipse
# Created by https://www.toptal.com/developers/gitignore/api/java,maven,intellij+all,eclipse
# Edit at https://www.toptal.com/developers/gitignore?templates=java,maven,intellij+all,eclipse

### Eclipse ###
.metadata
@@ -74,7 +74,7 @@ local.properties
# Spring Boot Tooling
.sts4-cache/

### Intellij ###
### Intellij+all ###
# Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio, WebStorm and Rider
# Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839

@@ -153,39 +153,14 @@ fabric.properties
# Android studio 3.1+ serialized cache file
.idea/caches/build_file_checksums.ser

### Intellij Patch ###
# Comment Reason: https://github.com/joeblau/gitignore.io/issues/186#issuecomment-215987721
### Intellij+all Patch ###
# Ignore everything but code style settings and run configurations
# that are supposed to be shared within teams.

# *.iml
# modules.xml
# .idea/misc.xml
# *.ipr

# Sonarlint plugin
# https://plugins.jetbrains.com/plugin/7973-sonarlint
.idea/**/sonarlint/

# SonarQube Plugin
# https://plugins.jetbrains.com/plugin/7238-sonarqube-community-plugin
.idea/**/sonarIssues.xml

# Markdown Navigator plugin
# https://plugins.jetbrains.com/plugin/7896-markdown-navigator-enhanced
.idea/**/markdown-navigator.xml
.idea/**/markdown-navigator-enh.xml
.idea/**/markdown-navigator/
.idea/*

# Cache file creation bug
# See https://youtrack.jetbrains.com/issue/JBR-2257
.idea/$CACHE_FILE$

# CodeStream plugin
# https://plugins.jetbrains.com/plugin/12206-codestream
.idea/codestream.xml

# Azure Toolkit for IntelliJ plugin
# https://plugins.jetbrains.com/plugin/8053-azure-toolkit-for-intellij
.idea/**/azureSettings.xml
!.idea/codeStyles
!.idea/runConfigurations

### Java ###
# Compiled class file
@@ -232,6 +207,4 @@ buildNumber.properties
# JDT-specific (Eclipse Java Development Tools)
.classpath

# End of https://www.toptal.com/developers/gitignore/api/java,maven,intellij,eclipse


# End of https://www.toptal.com/developers/gitignore/api/java,maven,intellij+all,eclipse
260 changes: 112 additions & 148 deletions README.md
@@ -1,148 +1,112 @@
[![GitLicense](https://gitlicense.com/badge/dice-group/IGUANA)](https://gitlicense.com/license/dice-group/IGUANA)
![Java CI with Maven](https://github.com/dice-group/IGUANA/workflows/Java%20CI%20with%20Maven/badge.svg)[![BCH compliance](https://bettercodehub.com/edge/badge/AKSW/IGUANA?branch=master)](https://bettercodehub.com/)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/9668460dd04c411fab8bf5ee9c161124)](https://www.codacy.com/app/TortugaAttack/IGUANA?utm_source=github.com&utm_medium=referral&utm_content=AKSW/IGUANA&utm_campaign=Badge_Grade)
[![Project Stats](https://www.openhub.net/p/iguana-benchmark/widgets/project_thin_badge.gif)](https://www.openhub.net/p/iguana-benchmark)


# IGUANA

<img src = "https://github.com/dice-group/IGUANA/raw/develop/images/IGUANA_logo.png" alt = "IGUANA Logo" width = "400" align = "center">

## ABOUT


Semantic Web is becoming more important and it's data is growing each day. Triple stores are the backbone here, managing these data.
Hence it is very important that the triple store must scale on the data and can handle several users.
Current Benchmark approaches could not provide a realistic scenario on realistic data and could not be adjustet for your needs very easily.
Additionally Question Answering systems and Natural Language Processing systems are becoming more and more popular and thus needs to be stresstested as well.
Further on it was impossible to compare results for different benchmarks.

Iguana is an an Integerated suite for benchmarking read/write performance of HTTP endpoints and CLI Applications.</br> which solves all these issues.
It provides an enviroment which ...


+ ... is highly configurable
+ ... provides a realistic scneario benchmark
+ ... works on every dataset
+ ... works on SPARQL HTTP endpoints
+ ... works on HTTP Get & Post endpoints
+ ... works on CLI applications
+ and is easily extendable


For further Information visit

[iguana-benchmark.eu](http://iguana-benchmark.eu)

[Documentation](http://iguana-benchmark.eu/docs/3.3/)


# Getting Started

# Prerequisites

You need to install Java 11 or greater.
In Ubuntu you can install these using the following commands

```
sudo apt-get install java
```

# Iguana Modules

Iguana consists of two modules

1. **corecontroller**: This will benchmark the systems
2. **resultprocessor**: This will calculate the Metrics and save the raw benchmark results

## **corecontroller**

The **corecontroller** will benchmark your system. It should be started on the same machine the is started.

## **resultprocessor**

The **resultprocessor** will calculate the metrics.
By default it stores its result in a ntriple file. But you may configure it, to write the results directly to a Triple Store.
On the processing side, it calculates various metrics.

Per run metrics:
* Query Mixes Per Hour (QMPH)
* Number of Queries Per Hour (NoQPH)
* Number of Queries (NoQ)
* Average Queries Per Second (AvgQPS)

Per query metrics:
* Queries Per Second (QPS)
* Number of successful and failed queries
* result size
* queries per second
* sum of execution times

You can change these in the Iguana Benchmark suite config.

If you use the [basic configuration](https://github.com/dice-group/IGUANA/blob/master/example-suite.yml), it will save all mentioned metrics to a file called `results_{{DATE_RP_STARTED}}.nt`


# Setup Iguana

## Download
Please download the release zip **iguana-x.y.z.zip** from the newest release available [here](https://github.com/dice-group/IGUANA/releases/latest):

```
mkdir iguana
wget https://github.com/dice-group/IGUANA/releases/download/v3.3.2/iguana-3.3.2.zip
unzip iguana-3.3.2.zip
```


It contains the following files:

* iguana.corecontroller-X.Y.Z.jar
* start-iguana.sh
* example-suite.yml

# Run Your Benchmarks

## Create a Configuration

You can use the [basic configuration](https://github.com/dice-group/IGUANA/blob/master/example-suite.yml) we provide and modify it to your needs.
For further information please visit our [configuration](http://iguana-benchmark.eu/docs/3.2/usage/configuration/) and [Stresstest](http://iguana-benchmark.eu/docs/3.0/usage/stresstest/) wiki pages. For a detailed, step-by-step instruction please attend our [tutorial](http://iguana-benchmark.eu/docs/3.2/usage/tutorial/).



## Execute the Benchmark

Use the start script
```
./start-iguana.sh example-suite.yml
```
Now Iguana will execute the example benchmark suite configured in the example-suite.yml file


# How to Cite

```bibtex
@InProceedings{10.1007/978-3-319-68204-4_5,
author="Conrads, Lixi
and Lehmann, Jens
and Saleem, Muhammad
and Morsey, Mohamed
and Ngonga Ngomo, Axel-Cyrille",
editor="d'Amato, Claudia
and Fernandez, Miriam
and Tamma, Valentina
and Lecue, Freddy
and Cudr{\'e}-Mauroux, Philippe
and Sequeda, Juan
and Lange, Christoph
and Heflin, Jeff",
title="Iguana: A Generic Framework for Benchmarking the Read-Write Performance of Triple Stores",
booktitle="The Semantic Web -- ISWC 2017",
year="2017",
publisher="Springer International Publishing",
address="Cham",
pages="48--65",
abstract="The performance of triples stores is crucial for applications driven by RDF. Several benchmarks have been proposed that assess the performance of triple stores. However, no integrated benchmark-independent execution framework for these benchmarks has yet been provided. We propose a novel SPARQL benchmark execution framework called Iguana. Our framework complements benchmarks by providing an execution environment which can measure the performance of triple stores during data loading, data updates as well as under different loads and parallel requests. Moreover, it allows a uniform comparison of results on different benchmarks. We execute the FEASIBLE and DBPSB benchmarks using the Iguana framework and measure the performance of popular triple stores under updates and parallel user requests. We compare our results (See https://doi.org/10.6084/m9.figshare.c.3767501.v1) with state-of-the-art benchmarking results and show that our benchmark execution framework can unveil new insights pertaining to the performance of triple stores.",
isbn="978-3-319-68204-4"
}
```
# IGUANA

[![ci](https://github.com/dice-group/IGUANA/actions/workflows/ci.yml/badge.svg)](https://github.com/dice-group/IGUANA/actions/workflows/ci.yml)

<p align="center">
<img src="https://github.com/dice-group/IGUANA/raw/develop/images/IGUANA_logo.png" alt="IGUANA Logo" width="200">
</p>
Iguana is an integrated suite for benchmarking the read/write performance of HTTP endpoints and CLI Applications.

It provides an environment which ...

* is highly configurable
* provides a realistic scenario benchmark
* works on every dataset
* works on SPARQL HTTP endpoints
* works on HTTP Get & Post endpoints
* works on CLI applications
* and is easily extendable

For further information visit:
- [iguana-benchmark.eu](http://iguana-benchmark.eu)
- [Documentation](http://iguana-benchmark.eu/docs/3.3/)

## Iguana Modules

Iguana consists of two modules
- **corecontroller** - this will benchmark the systems
- **resultprocessor** - this will calculate the metrics and save the raw benchmark results

### Available metrics

Per run metrics:
* Query Mixes Per Hour (QMPH)
* Number of Queries Per Hour (NoQPH)
* Number of Queries (NoQ)
* Average Queries Per Second (AvgQPS)

Per query metrics:
* Queries Per Second (QPS)
* number of successful and failed queries
* result size
* queries per second
* sum of execution times
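The rate metrics above relate to each other in a simple way. The exact formulas Iguana applies are not reproduced here, so treat the following as a sketch under that assumption: a query's QPS is its number of successful executions divided by its summed execution time, and AvgQPS is the mean of the per-query QPS values. With made-up numbers:

```sh
# Hypothetical per-query stats, one query per line:
# "<successful executions> <summed execution time in seconds>"
# QPS = successes / time; AvgQPS = mean of the per-query QPS values.
printf '%s\n' "120 60" "30 60" "90 45" |
awk '{ qps = $1 / $2; sum += qps; n++; printf "QPS: %.2f\n", qps }
     END { printf "AvgQPS: %.2f\n", sum / n }'
```

In a real run these numbers come out of the **resultprocessor**; the snippet only illustrates how the per-query figures roll up into a per-run average.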

## Setup Iguana

### Prerequisites

To run Iguana, you need `Java 11` or greater installed on your system.

### Download
Download the newest release of Iguana [here](https://github.com/dice-group/IGUANA/releases/latest), or run the following in a Unix shell:

```sh
wget https://github.com/dice-group/IGUANA/releases/download/v4.0.0/iguana-4.0.0.zip
unzip iguana-4.0.0.zip
```

The zip file contains the following files:

* `iguana-X.Y.Z.jar`
* `start-iguana.sh`
* `example-suite.yml`

### Create a Configuration

You can use the provided example configuration and modify it to your needs.
For further information please visit our [configuration](http://iguana-benchmark.eu/docs/3.2/usage/configuration/) and [Stresstest](http://iguana-benchmark.eu/docs/3.0/usage/stresstest/) wiki pages.

For a detailed, step-by-step instruction through a benchmarking example, please visit our [tutorial](http://iguana-benchmark.eu/docs/3.2/usage/tutorial/).

### Execute the Benchmark

Start Iguana with a benchmark suite (e.g. the example-suite.yml) either by using the start script:

```sh
./start-iguana.sh example-suite.yml
```

or by directly executing the jar-file:

```sh
java -jar iguana-x-y-z.jar example-suite.yml
```

# How to Cite

```bibtex
@InProceedings{10.1007/978-3-319-68204-4_5,
author="Conrads, Lixi
and Lehmann, Jens
and Saleem, Muhammad
and Morsey, Mohamed
and Ngonga Ngomo, Axel-Cyrille",
editor="d'Amato, Claudia
and Fernandez, Miriam
and Tamma, Valentina
and Lecue, Freddy
and Cudr{\'e}-Mauroux, Philippe
and Sequeda, Juan
and Lange, Christoph
and Heflin, Jeff",
title="Iguana: A Generic Framework for Benchmarking the Read-Write Performance of Triple Stores",
booktitle="The Semantic Web -- ISWC 2017",
year="2017",
publisher="Springer International Publishing",
address="Cham",
pages="48--65",
abstract="The performance of triples stores is crucial for applications driven by RDF. Several benchmarks have been proposed that assess the performance of triple stores. However, no integrated benchmark-independent execution framework for these benchmarks has yet been provided. We propose a novel SPARQL benchmark execution framework called Iguana. Our framework complements benchmarks by providing an execution environment which can measure the performance of triple stores during data loading, data updates as well as under different loads and parallel requests. Moreover, it allows a uniform comparison of results on different benchmarks. We execute the FEASIBLE and DBPSB benchmarks using the Iguana framework and measure the performance of popular triple stores under updates and parallel user requests. We compare our results (See https://doi.org/10.6084/m9.figshare.c.3767501.v1) with state-of-the-art benchmarking results and show that our benchmark execution framework can unveil new insights pertaining to the performance of triple stores.",
isbn="978-3-319-68204-4"
}
```
28 changes: 15 additions & 13 deletions docs/about.md
@@ -1,11 +1,14 @@
# Iguana
Iguana is an an Integerated suite for benchmarking read/write performance of HTTP endpoints and CLI Applications.
Semantic Web is becoming more important and it's data is growing each day. Triple stores are the backbone here, managing these data. Hence it is very important that the triple store must scale on the data and can handle several users. Current Benchmark approaches could not provide a realistic scenario on realistic data and could not be adjustet for your needs very easily. Additionally Question Answering systems and Natural Language Processing systems are becoming more and more popular and thus needs to be stresstested as well. Further on it was impossible to compare results for different benchmarks.
Iguana is an integrated suite for benchmarking the read/write performance of HTTP endpoints and CLI Applications.

Iguana tries to solve all these issues. It provides an enviroment which ...
Semantic Web is becoming more important and its data is growing each day. Triple stores are the backbone here, managing these data. Hence, it is very important that the triple store scales with the data and can handle several users. Current benchmark approaches could not provide a realistic scenario on realistic data and could not be adjusted to your needs very easily.

Additionally, Question Answering systems and Natural Language Processing systems are becoming more and more popular and thus need to be stresstested as well. Furthermore, it was previously impossible to compare results across different benchmarks.

Iguana tries to solve all these issues. It provides an environment which ...

* is highly configurable
* provides a realistic scneario benchmark
* provides a realistic scenario benchmark
* works on every dataset
* works on SPARQL HTTP endpoints
* works on HTTP Get & Post endpoints
@@ -14,22 +17,21 @@ Iguana tries to solve all these issues. It provides an enviroment which ...

## What is Iguana

Iguana is a HTTP and CLI read/write performance benchmark framework suite.
It can stresstest HTTP get and post endpoints as well as CLI applications using a bunch of simulated users which will bombard the endpoint using queries.
Queries can be anything. SPARQL, SQL, Text and anything else you can fit in one line.
Iguana is an HTTP and CLI read/write performance benchmark framework suite.
It can stresstest HTTP GET and POST endpoints as well as CLI applications, using multiple simulated users that flood the endpoint with queries.
Queries can be anything: SPARQL, SQL, text, etc.

## What can be benchmarked

Iguana is capable of benchmarking and stresstesting the following applications
Iguana is capable of benchmarking and stresstesting the following applications:

* HTTP GET and POST endpoint (e.g. Triple Stores, REST Services, Question Answering endpoints)
* HTTP GET and POST endpoints (e.g. Triple Stores, REST Services, Question Answering endpoints)
* CLI Applications which either
* exit after every query
* or awaiting input after each query
* await input after each query

## What Benchmarks are possible

Every simulated User (named Worker in the following) gets a set of queries.
These queries have to be saved in one file, whereas each query is one line.
Hence everything you can fit in one line (e.g a SPARQL query, a text question, an RDF document) can be used as a query and a set of these queries represent the benchmark.
Every simulated user (named worker in the following) gets a set of queries.
These queries (e.g. SPARQL queries, text questions, RDF documents) can be saved in a single file or in a folder with multiple files. A set of these queries represents the benchmark.
Iguana will then let every worker execute these queries against the endpoint.
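To make this concrete (the queries below are made up, and the one-query-per-line layout is only one of the supported arrangements), such a query file for a SPARQL endpoint could be created like this:

```sh
# Write a made-up query file with one SPARQL query per line.
cat > queries.txt <<'EOF'
SELECT * WHERE { ?s ?p ?o } LIMIT 10
SELECT (COUNT(*) AS ?count) WHERE { ?s ?p ?o }
EOF
wc -l queries.txt
```

Each line is then dispatched by the workers as a single query against the endpoint.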