KEYCLOAK-7599 Improve handling of test datasets
tkyjovsk authored and hmlnarik committed Jul 24, 2018
1 parent a6e4f4f commit ea8eaaf
Showing 143 changed files with 6,544 additions and 318 deletions.
166 changes: 136 additions & 30 deletions testsuite/performance/README.datasets.md
@@ -1,46 +1,152 @@
# Keycloak Performance Testsuite - Generating datasets
# Keycloak Datasets

## Provision Keycloak Server

## Generating a set of datasets for multiple realms
Before generating data it is necessary to provision/start Keycloak server. This can
be done automatically by running:

The first dataset is small and is created quickly. Building of each subsequent dataset continues on top
of the previous dataset.
```
cd testsuite/performance
mvn clean install
mvn verify -P provision
```
To tear down the system after testing run:
```
mvn verify -P teardown
```
The teardown step will delete the database as well so it is possible to use it between generating different datasets.

Datasets are created with a specific released server version (rather than a snapshot) in order to be
usable with later releases - newer server version should be able to migrate schema from any previous release.
It is also possible to start the server externally (manually). In that case it is necessary
to provide information in file `tests/target/provisioned-system.properties`.
See the main README for details.

We use 10 concurrent threads, which is enough to saturate a
dual core machine. For quad-core you can try to double the number of workers.
## Generate Data

To generate the *default dataset* run:
```
cd testsuite/performance
mvn verify -P generate-data
```

mvn clean install -Dserver.version=4.0.0.Beta1
To generate a *specific dataset* from within the project run:
```
mvn verify -P generate-data -Ddataset=<NAMED_DATASET>
```
This will load dataset properties from `tests/src/test/resources/dataset/${dataset}.properties`.

mvn verify -Pteardown
mvn verify -Pprovision
mvn verify -Pgenerate-data -Ddataset=10r100u1c -DnumOfWorkers=10
mvn verify -Pexport-dump -Ddataset=10r100u1c
To generate a specific dataset from a *custom properties file* run:
```
mvn verify -P generate-data -Ddataset.properties.file=<FULL_PATH_TO_PROPERTIES_FILE>
```

mvn verify -Pgenerate-data -Ddataset=20r100u1c -DstartAtRealmIdx=10 -DnumOfWorkers=10
mvn verify -Pexport-dump -Ddataset=20r100u1c
To delete a dataset run:
```
mvn verify -P generate-data -Ddataset=… -Ddelete=true
```
This will delete all realms specified by the dataset.

mvn verify -Pgenerate-data -Ddataset=50r100u1c -DstartAtRealmIdx=20 -DnumOfWorkers=10
mvn verify -Pexport-dump -Ddataset=50r100u1c

mvn verify -Pgenerate-data -Ddataset=200r100u1c -DstartAtRealmIdx=50 -DnumOfWorkers=10
mvn verify -Pexport-dump -Ddataset=200r100u1c
## Indexed Model

mvn verify -Pgenerate-data -Ddataset=500r100u1c -DstartAtRealmIdx=200 -DnumOfWorkers=10
mvn verify -Pexport-dump -Ddataset=500r100u1c
```
The model is hierarchical with the parent-child relationships determined by primary foreign keys of entities.

If the dataset dump file is not available locally but it's known that the dataset for specific version exists on the server
it can be retrieved by specifying a proper server version again. For example:
```
mvn verify -Pteardown
mvn clean install
mvn verify -Pprovision
mvn verify -Pimport-dump -Ddataset=20r100u1c -Dserver.version=4.0.0.Beta1
Size of the dataset is determined by specifying a "count per parent" parameter for each entity.

The number of mappings between entities created by the primary "count per parent" parameters
can be specified by "count per other entity" parameters.

Each nested entity has a unique index which identifies it inside its parent entity.

For example:
- Realm X --> Client Y --> Client Role Z
- Realm X --> Client Y --> Resource Server --> Resource Z
- Realm X --> User Y
- etc.

Hash code of each entity is computed based on its index coordinates within the model and its class name.

Each entity holds entity representation, and a list of mappings to other entities in the indexed model.
The attributes and mappings are initialized by a related *entity template* class.
Each entity class also acts as a wrapper around a Keycloak Admin Client using it
to provide CRUD operations for its entity.

The `id` attribute in the entity representation is set when the entity is created. If an
already initialized entity has been evicted from the LRU cache, its `id` is reloaded from the server.
This can happen when the number of entities exceeds the entity cache size (see `entity.cache.size` below).
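
The reload-on-eviction behaviour can be illustrated with a small LRU cache. This is a minimal Python sketch, not the testsuite's Java implementation: `EntityCache`, `load_id_from_server` and the capacity of 2 are hypothetical names and values chosen for illustration.

```python
from collections import OrderedDict


class EntityCache:
    """Tiny LRU cache: evicts the least recently used entry when full."""

    def __init__(self, capacity, load_id_from_server):
        self.capacity = capacity
        self.load = load_id_from_server  # hypothetical stand-in for a server lookup
        self.entries = OrderedDict()
        self.reloads = 0

    def get_id(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        # Cache miss: the entity was never cached or has been evicted.
        self.reloads += 1
        value = self.load(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used entry
        return value


cache = EntityCache(capacity=2, load_id_from_server=lambda key: f"id-of-{key}")
for key in ["user-0", "user-1", "user-2", "user-0"]:
    cache.get_id(key)
print(cache.reloads)
```

With more entities than cache slots, revisiting an evicted entity triggers another lookup, which is why a cache that is too small for the dataset costs extra round trips to the server.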

### Attribute Templating

Attributes of each supported entity representation can be set via FreeMarker templates.
The process is based on templates defined in a properties configuration file.

The first template in the list can use the `index` of the entity and any attributes of its parent entity.
Each subsequent attribute template can use any previously set attribute values.

Note: The output of the FreeMarker engine is always a String. Conversion to the actual type
of the attribute is done with the Jackson 2.9+ parser using the `ObjectMapper.updateValue()` method,
which allows gradual updates of an existing Java object.
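
The ordering rule for attribute templates can be sketched as follows. This is only an analogy in Python: `string.Template` stands in for FreeMarker, `json.loads` stands in for the Jackson type conversion, and the template names and values are hypothetical.

```python
import json
from string import Template

# Hypothetical attribute templates, evaluated in the order listed.
templates = {
    "username": "user_${index}",
    "email": "${username}@example.test",
    "enabled": "true",
}


def render_attributes(index, parent_attributes):
    # The first template sees the index and the parent's attributes.
    context = {"index": index, **parent_attributes}
    attributes = {}
    for name, template in templates.items():
        text = Template(template).substitute(context)  # output is always a string
        try:
            value = json.loads(text)  # e.g. "true" becomes a boolean
        except json.JSONDecodeError:
            value = text              # not JSON: keep it as a plain string
        attributes[name] = value
        context[name] = value         # later templates may use earlier attributes
    return attributes


print(render_attributes(7, {"realm": "realm_0"}))
```

Because each rendered value is added to the context, a later template such as `email` can refer to `username`, but not the other way round.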

### Randomness

Randomness in the indexed model is deterministic (pseudorandom) because the
random seeds are based on deterministic hash codes.

There are 2 types of seeds: one is for using randoms in the FreeMarker templates
via methods `indexBasedRandomInt(int bound)` and `indexBasedRandomBool(int percentage)`.
It is based on class of the current entity + hash code of its parent entity.

The other seed is for generating mappings to other entities which are just
random sequences of integer indexes. This is based on hash code of the current entity.
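
The two kinds of seeds can be sketched like this. It is a minimal Python illustration under stated assumptions, not the generator's Java code: `stable_hash`, `entity_hash`, `template_random` and `mapping_indexes` are hypothetical helpers, and SHA-256 is used only because Python's built-in `hash()` of strings varies between runs.

```python
import hashlib
import random


def stable_hash(*parts):
    """Deterministic hash of the given parts, identical on every run."""
    digest = hashlib.sha256("/".join(map(str, parts)).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def entity_hash(class_name, coordinates):
    # Class name plus index coordinates, e.g. realm 0 -> user 5 is (0, 5).
    return stable_hash(class_name, *coordinates)


def template_random(class_name, parent_hash):
    # Seed type 1: class of the current entity + hash code of its parent.
    return random.Random(stable_hash(class_name, parent_hash))


def mapping_indexes(own_hash, count, bound):
    # Seed type 2: the entity's own hash code drives a sequence of target indexes.
    rng = random.Random(own_hash)
    return [rng.randrange(bound) for _ in range(count)]


realm = entity_hash("Realm", (0,))
user = entity_hash("User", (0, 5))
print(template_random("User", realm).randrange(100))
print(mapping_indexes(user, count=5, bound=10))
```

Running the script twice prints the same numbers, which is the property that makes a generated dataset reproducible.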

### Generator Settings

#### Timeouts
- `queue.timeout`: How long to wait for an entity to be processed by a thread-pool executor. Default is `60` seconds.
You might want to increase this setting when deleting many realms with many nested entities using a low number of workers.
- `shutdown.timeout`: How long to wait for the executor thread-pool to shut down. Default is `60` seconds.

#### Caching and Memory
- `template.cache.size`: Size of cache of FreeMarker template models. Default is `10000`.
- `randoms.cache.size`: Size of cache of random integer sequences which are used for mappings between entities. Default is `10000`.
- `entity.cache.size`: Size of cache of initialized entities. Default is `100000`.
- `max.heap`: Max heap size of the data generator JVM.


## Notes:

- Mappings are random so it can sometimes happen that the same mappings are generated multiple times.
Only distinct mappings are created.
This means, for example, that if you specify `realmRolesPerUser=5` it can happen
that only 4 or fewer roles are actually mapped.

There is an option to use unique random sequences but it is disabled right now
because checking for uniqueness is CPU-intensive.
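
The effect of duplicate draws can be reproduced with a few lines of Python. This is a sketch under assumptions: `distinct_role_indexes` is a hypothetical helper, and the realm size of 10 roles is an arbitrary illustrative value.

```python
import random


def distinct_role_indexes(seed, roles_per_user, realm_roles):
    rng = random.Random(seed)
    # Indexes are drawn independently, so the same role can come up twice.
    drawn = [rng.randrange(realm_roles) for _ in range(roles_per_user)]
    return set(drawn)  # duplicates collapse into a single mapping


counts = [
    len(distinct_role_indexes(seed, roles_per_user=5, realm_roles=10))
    for seed in range(1000)
]
print(min(counts), max(counts))
```

The smaller the pool of roles relative to `realmRolesPerUser`, the more often a user ends up with fewer mappings than requested.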

- Mapping of client roles to a user right now is determined by a single parameter: `clientRolesPerUser`.

The mappings actually created -- each of which contains a specific client plus a set of its roles -- are
based on the list of randomly selected client roles of all clients in the realm.
This means the count of the actual client mappings isn't predictable.

Making it predictable would require specifying 2 parameters: `clientsPerUser` and `clientRolesPerClientPerUser`,
which would say how many clients a user has roles assigned from, and the number of roles for each of these clients.
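
Why the number of client mappings is unpredictable can be seen in a small simulation. This Python sketch assumes roles are picked uniformly from all client roles in the realm; `client_role_mappings` and its arguments are hypothetical names, not generator parameters.

```python
import random
from collections import defaultdict


def client_role_mappings(seed, client_roles_per_user, clients, roles_per_client):
    rng = random.Random(seed)
    # Every (client, role) pair in the realm is a candidate.
    all_roles = [(c, r) for c in range(clients) for r in range(roles_per_client)]
    picked = {rng.choice(all_roles) for _ in range(client_roles_per_user)}
    # Group the picked roles by the client that owns them.
    by_client = defaultdict(set)
    for client, role in picked:
        by_client[client].add(role)
    return dict(by_client)


mappings = client_role_mappings(seed=1, client_roles_per_user=4, clients=10, roles_per_client=2)
print(len(mappings), "clients:", mappings)
```

Four selected roles may belong to four different clients or to as few as two, so the number of per-client mappings follows from the draw rather than from a parameter.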

- The number of resource servers depends on how the attribute `authorizationServicesEnabled`
is set for each client. This means the number isn't specified by any "perRealm" parameter.
If this is needed, it can be implemented via a random mapping from a resource server entity
to a set of existing clients, in a similar fashion to how a resource is selected for each resource permission.

- The "resource type" attribute for each resource and resource-based permission defaults to
the default type of the parent resource server.
If needed, a separate abstract/non-persistable entity `ResourceType` can be created in the model
to represent a set of resource types. The "resource type" attributes can then be set based on random mappings into this set.

- Generating a large number of users can take a long time with the default realm settings,
which have the password hashing iterations set to a default value of 27500.
If you wish to speed this process up, decrease the value of `hashIterations()` in attribute `realm.passwordPolicy`.

Note that this will also significantly affect the performance results of the tests because
password hashing takes a major part of the server's compute resources. The results may
improve by a factor of 10 or more when the hashing is set to the minimum value of 1 iteration.
However, this comes at the expense of security.
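
The cost of the iteration count can be felt with a short timing script. This is a rough sketch that assumes a PBKDF2-HMAC-SHA256 style hash; `hash_password` is a hypothetical helper, and the timings depend entirely on the machine.

```python
import hashlib
import os
import time


def hash_password(password, salt, iterations):
    # PBKDF2-HMAC-SHA256: the work grows linearly with the iteration count.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


salt = os.urandom(16)
for iterations in (27500, 1):
    start = time.perf_counter()
    hash_password("password", salt, iterations)
    elapsed = time.perf_counter() - start
    print(f"{iterations} iterations: {elapsed:.6f} s")
```

Every created user and every login pays this cost once, which is why the setting influences both data generation time and the measured test results.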

```
55 changes: 29 additions & 26 deletions testsuite/performance/README.md
@@ -26,8 +26,8 @@ mvn clean install
# Make sure your Docker daemon is running THEN
mvn verify -Pprovision
mvn verify -Pgenerate-data -Ddataset=100u2c -DnumOfWorkers=10 -DhashIterations=100
mvn verify -Ptest -Ddataset=100u2c -DusersPerSec=2 -DrampUpPeriod=10 -DuserThinkTime=0 -DbadLoginAttempts=1 -DrefreshTokenCount=1 -DmeasurementPeriod=60 -DfilterResults=true
mvn verify -Pgenerate-data -Ddataset=1r_10c_100u -DnumOfWorkers=10
mvn verify -Ptest -Ddataset=1r_10c_100u -DusersPerSec=2 -DrampUpPeriod=10 -DuserThinkTime=0 -DbadLoginAttempts=1 -DrefreshTokenCount=1 -DmeasurementPeriod=60 -DfilterResults=true
```

Now open the generated report in a browser - the link to .html file is displayed at the end of the test.
@@ -39,7 +39,7 @@ mvn verify -Pteardown

You can perform all phases in a single run:
```
mvn verify -Pprovision,generate-data,test,teardown -Ddataset=100u2c -DnumOfWorkers=10 -DhashIterations=100 -DusersPerSec=4 -DrampUpPeriod=10
mvn verify -Pprovision,generate-data,test,teardown -Ddataset=1r_10c_100u -DnumOfWorkers=10 -DusersPerSec=4 -DrampUpPeriod=10
```
Note: The order in which maven profiles are listed does not determine the order in which profile related plugins are executed. `teardown` profile always executes last.

@@ -103,6 +103,23 @@ it is necessary to update the generated Keycloak server configuration (inside `k
adding a `clean` goal to the provisioning command like so: `mvn clean verify -Pprovision …`. It is *not* necessary to update this configuration
when switching between `singlenode` and `cluster` deployments.

#### Manual Provisioning

If you want to generate data or run the test against an already running instance of Keycloak server
you need to provide information about the system in a properties file.

Create file: `tests/target/provisioned-system.properties` with the following properties:
```
keycloak.frontend.servers=http://localhost:8080/auth
keycloak.admin.user=admin
keycloak.admin.password=admin
```
and replace the values with your actual information. Then it will be possible to run tasks: `generate-data` and `test`.

The tasks: `export-dump`, `import-dump` and `collect` (see below) are only available with the automated provisioning
because they require direct access to the provisioned services.
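
Before running `generate-data` against a manually provisioned server, the properties file can be sanity-checked. This is a minimal Python sketch that assumes only the three keys listed above are needed; `read_properties` and `missing_keys` are hypothetical helpers that are not part of the testsuite.

```python
from pathlib import Path

REQUIRED = (
    "keycloak.frontend.servers",
    "keycloak.admin.user",
    "keycloak.admin.password",
)


def read_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        properties[key.strip()] = value.strip()
    return properties


def missing_keys(text):
    properties = read_properties(text)
    return [key for key in REQUIRED if not properties.get(key)]


path = Path("tests/target/provisioned-system.properties")
if path.exists():
    print("missing:", missing_keys(path.read_text()))
```

An empty list means all three keys are present with non-empty values.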


### Collect Artifacts

Usage: `mvn verify -Pcollect`
@@ -122,45 +139,31 @@ because it contains the `provisioned-system.properties` with information about t

### Generate Test Data

Usage: `mvn verify -P generate-data [-Ddataset=NAMED_PROPERTY_SET] [-DnumOfWorkers=N]`. The default dataset is `2u2c`. Workers default to `1`.
Usage: `mvn verify -P generate-data [-Ddataset=NAMED_PROPERTY_SET] [-DnumOfWorkers=N]`. Workers default to `1`.

The parameters are loaded from `tests/parameters/datasets/${dataset}.properties` file.
Individual properties can be overridden from command line via `-D` params.
The parameters are loaded from `tests/src/test/resources/dataset/${dataset}.properties` file with `${dataset}` defaulting to `default`.

To use a custom properties file specify `-Ddataset.properties.file=ABSOLUTE_PATH_TO_FILE` instead of `-Ddataset`.

To generate data using a different version of Keycloak Admin Client set property `-Dserver.version=SERVER_VERSION` to match the version of the provisioned server.

#### Dataset Parameters

| Property | Description | Value in the Default Dataset |
| --- | --- | --- |
| `numOfRealms` | Number of realms to be created. | `1` |
| `usersPerRealm` | Number of users per realm. | `2` |
| `clientsPerRealm` | Number of clients per realm. | `2` |
| `realmRoles` | Number of realm-roles per realm. | `2` |
| `realmRolesPerUser` | Number of realm-roles assigned to a created user. Has to be less than or equal to `realmRoles`. | `2` |
| `clientRolesPerUser` | Number of client-roles assigned to a created user. Has to be less than or equal to `clientsPerRealm * clientRolesPerClient`. | `2` |
| `clientRolesPerClient` | Number of client-roles per created client. | `2` |
| `hashIterations` | Number of password hashing iterations. | `27500` |

To delete the generated dataset add `-Ddelete=true` to the above command. Dataset is deleted by deleting individual realms.

#### Examples:
- Generate the default dataset. `mvn verify -P generate-data`
- Generate the `100u2c` dataset. `mvn verify -P generate-data -Ddataset=100u2c`
- Generate the `100u2c` dataset but override some parameters. `mvn verify -P generate-data -Ddataset=100u2c -DclientRolesPerUser=5 -DclientRolesPerClient=5`
- Generate the `1r_10c_100u` dataset. `mvn verify -P generate-data -Ddataset=1r_10c_100u`

#### Export Database

To export the generated data to a data-dump file enable profile `-P export-dump`. This will create a `${DATASET}.sql.gz` file next to the dataset properties file.

Example: `mvn verify -P generate-data,export-dump -Ddataset=100u2c`
Example: `mvn verify -P generate-data,export-dump -Ddataset=1r_10c_100u`

#### Import Database

To import data from an existing data-dump file use profile `-P import-dump`.

Example: `mvn verify -P import-dump -Ddataset=100u2c`
Example: `mvn verify -P import-dump -Ddataset=1r_10c_100u`

If the dump file doesn't exist locally the script will attempt to download it from `${db.dump.download.site}` which defaults to `https://downloads.jboss.org/keycloak-qe/${server.version}`
with `server.version` defaulting to `${project.version}` from `pom.xml`.
@@ -221,11 +224,11 @@ When running the tests it is necessary to define the dataset to be used.

- Run test with specific test and dataset parameters:

`mvn verify -P test -Dtest.properties=oidc-login-logout -Ddataset=100u2c`
`mvn verify -P test -Dtest.properties=oidc-login-logout -Ddataset=1r_10c_100u`

- Run test with specific test and dataset parameters, overriding some from command line:

`mvn verify -P test -Dtest.properties=admin-console -Ddataset=100u2c -DrampUpPeriod=30 -DwarmUpPeriod=60 -DusersPerSec=0.3`
`mvn verify -P test -Dtest.properties=admin-console -Ddataset=1r_10c_100u -DrampUpPeriod=30 -DwarmUpPeriod=60 -DusersPerSec=0.3`

#### Running `OIDCRegisterAndLogoutSimulation`

@@ -240,7 +243,7 @@ Running the user registration simulation requires a different approach to datase
`mvn verify -P test -D test.properties=oidc-register-logout -DsequentialUsersFrom=0 -DusersPerRealm=<MAX_EXPECTED_REGISTRATIONS>`

##### Example B:
1. Generate or import dataset with 100 users: `mvn verify -P generate-data -Ddataset=100u2c`. This will create 1 realm and users 0-99.
1. Generate or import dataset with 100 users: `mvn verify -P generate-data -Ddataset=1r_10c_100u`. This will create 1 realm and users 0-99.
2. Run the registration test starting from user 100:

`mvn verify -P test -D test.properties=oidc-register-logout -DsequentialUsersFrom=100 -DusersPerRealm=<MAX_EXPECTED_REGISTRATIONS>`

This file was deleted.
