Improve documentation #135

Merged
merged 3 commits into from
Feb 6, 2024
Changes from 1 commit
300 changes: 175 additions & 125 deletions README.md
@@ -2,182 +2,232 @@

This library allows you to safely test your migration from one back-end to another in production!

The Shadow Tool can be easily integrated in your Java/Kotlin project and allows you to compare the current back-end
The Shadow Tool can be easily integrated into your Java/Kotlin project and allows you to compare the current back-end
service your application is using against the new back-end you plan on using.
Since it actually runs on production (in the background), it gives you trust in:
Since it actually runs in the production environment (in the background), it helps ensure that:

1. the connection towards your new back-end,
2. the data quality coming from the new back-end,
3. whether your code correctly maps the data of the new back-end to your existing domain.

The tool is designed to be a plug-and-play solution which runs without functional impact in your current production app.
When activated, when your app fetches data from your current back-end it will additionally call the new back-end and
compare the data in parallel.
This will be sampled based on a configured percentage as to not overload your application.
The tool is designed to be a plug-and-play solution that runs without impacting the functionality of your current production app.
When activated, as your app fetches data from your current back-end, it will also call the new back-end and compare
the data in parallel.
This will be sampled based on a configured percentage to prevent overloading your application.
The findings are reported using log statements.
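
The sampling behaviour described above can be illustrated with a small, library-independent sketch. It is not the Shadow Tool's implementation: the class `SampledShadowSketch`, its constructor parameters, and the report messages are hypothetical names chosen for this example. It only shows the idea that the caller always gets the current back-end's response, while a configured percentage of calls also trigger a background comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical illustration only; this is not the Shadow Tool's source code.
final class SampledShadowSketch<T> {
    private final int percentage;            // share of calls that also hit the new back-end (0-100)
    private final Executor executor;         // runs the shadow call off the caller's path
    private final Consumer<String> reporter; // stands in for a logger

    SampledShadowSketch(int percentage, Executor executor, Consumer<String> reporter) {
        if (percentage < 0 || percentage > 100) {
            throw new IllegalArgumentException("percentage must be between 0 and 100");
        }
        this.percentage = percentage;
        this.executor = executor;
        this.reporter = reporter;
    }

    T compare(Supplier<T> currentBackend, Supplier<T> newBackend) {
        T currentResult = currentBackend.get();
        // nextInt(100) is uniform over 0..99, so this branch is taken for roughly `percentage`% of calls.
        if (ThreadLocalRandom.current().nextInt(100) < percentage) {
            CompletableFuture.runAsync(() -> {
                try {
                    T newResult = newBackend.get();
                    if (!Objects.equals(currentResult, newResult)) {
                        reporter.accept("Difference found between current and new back-end");
                    }
                } catch (RuntimeException e) {
                    // A failing new back-end must never affect the main flow.
                    reporter.accept("New back-end call failed: " + e.getMessage());
                }
            }, executor);
        }
        // The caller always receives the current back-end's response.
        return currentResult;
    }

    public static void main(String[] args) {
        List<String> reports = new ArrayList<>();
        // A same-thread executor keeps the demo deterministic; a real application would pass a thread pool.
        SampledShadowSketch<String> flow = new SampledShadowSketch<>(100, Runnable::run, reports::add);
        String result = flow.compare(() -> "Utrecht", () -> "Amsterdam");
        System.out.println(result + " / " + reports);
    }
}
```

Passing the executor in from outside is a design choice of this sketch: it makes the background work easy to test and keeps thread management with the application.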

## Getting started

1. To see the differences, the library expects the `slf4j-api` library to be provided by the application that uses it.
2. Optional: To inspect the values of the differences, you need to set up encryption. Without encryption, only the
keys that differ are logged, not their values.
To begin, an RSA public and private key of at least 2048 bits are required. Generate both keys as follows:
```bash
openssl genrsa -out pair.pem 2048 && openssl rsa -in pair.pem -pubout -out public.key && openssl pkcs8 -topk8 -inform PEM -outform PEM -nocrypt -in pair.pem -out private.key && rm -rf pair.pem
```
Keep the private key secret. When the data is sensitive, nobody other than you should be able to inspect these
values.
To create a `java.security.PublicKey`, you can use the code below (add the dependency `org.bouncycastle:bcprov-jdk15on`):
```java
import java.io.File;
import java.io.StringReader;
import java.nio.file.Files;
import java.security.KeyFactory;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;
import java.util.Objects;
import org.bouncycastle.util.io.pem.PemReader;

private static PublicKey publicKey() throws Exception {
final var publicKeyFile = new File(Objects.requireNonNull(EncryptionServiceTest.class.getClassLoader().getResource("public.key")).getFile());
final var reader = new StringReader(Files.readString(publicKeyFile.toPath()));
final var pemReader = new PemReader(reader);
final var factory = KeyFactory.getInstance("RSA");
final var pemObject = pemReader.readPemObject();
final var keyContentAsBytesFromBC = pemObject.getContent();
final var pubKeySpec = new X509EncodedKeySpec(keyContentAsBytesFromBC);
return factory.generatePublic(pubKeySpec);
}
```

## Installation

The Shadow Tool is released to Maven Central, where you can find its latest version.

### Maven

[![Maven Central](https://maven-badges.herokuapp.com/maven-central/io.github.rabobank/shadow-tool/badge.svg)](https://maven-badges.herokuapp.com/maven-central/io.github.rabobank/shadow-tool)

```xml
<dependency>
<groupId>io.github.rabobank</groupId>
<artifactId>shadow-tool</artifactId>
<version>${shadow-tool.version}</version>
<version>1.4.5</version>
</dependency>
```

### Gradle

```kotlin
implementation("io.github.rabobank:shadow-tool:$version")
implementation("io.github.rabobank:shadow-tool:1.4.5")
```

The Shadow Tool can be easily integrated in your Java/Kotlin project and allows you to compare the current back-end service your application is using against the new back-end you plan on using.
Since it actually runs on production (in the background), it gives you trust in:
1. the connection towards your new back-end,
2. the data quality coming from the new back-end,
3. whether your code correctly maps the data of the new back-end to your existing domain.

The tool is designed to be a plug-and-play solution which runs without functional impact in your current production app.
When activated, when your app fetches data from your current back-end it will additionally call the new back-end and compare the data in parallel.
This will be sampled based on a configured percentage as to not overload your application.
The findings are reported using log statements.

## Getting started
1. Build the library locally and add it as a dependency to your project (**We are still working on deploying this to Maven Central**)
2. In order to see the differences, the library expects the `slf4j-api` library to be provided by the using application.
3. Optional: To be able to inspect the values of the differences, it is required to set up encryption. Not setting up encryption allows you to see the different keys only, so no values.
To begin, an RSA 2048 bit public and private key are required. Generate as follows (for both the public and private key):

1. **Important:** To see the differences in your logs, you have to add `slf4j-api` to your dependencies. By
default, only field names (keys) are logged when the values differ.
To see exactly what is different, encryption is required. Proceed to step 2 to set up encryption.
2. You have 3 encryption options:
1. **Noop encryption**: By setting up a `NoopEncryptionService`, the differences are logged as `Base64` encoded
text. This is not recommended for sensitive data.
Example:
```java
import io.github.rabobank.shadow_tool.ShadowFlow;
import io.github.rabobank.shadow_tool.ShadowFlow.ShadowFlowBuilder;

import java.util.List;
import java.util.function.Supplier;

public class BackendService {

public Dummy callBackend() {
// Create a ShadowFlow instance with NoopEncryptionService
// The 10 means that for 10% of all requests, the `newBackend` is invoked as well and its response is compared against the `currentBackend` response.
ShadowFlowBuilder<Dummy> builder = new ShadowFlowBuilder<>(10);
ShadowFlow<Dummy> shadowFlow = builder.withEncryptionService(NoopEncryptionService.INSTANCE).build();

// Define your current backend and new backend suppliers
Supplier<Dummy> currentBackend = () -> {
// Your current backend logic here
return new Dummy("Bob", "Utrecht", List.of("Mirabel", "Bruno"));
};

Supplier<Dummy> newBackend = () -> {
// Your new backend logic here
return new Dummy("Bob", "Amsterdam", List.of("Bruno", "Mirabel", "Mirabel"));
};

// The result is always from the first supplier. So in this case, the return value always yields the response of the `currentBackend` service.
return shadowFlow.compare(currentBackend, newBackend);
}
}
```
2. **Cipher encryption**: The differences are logged as encrypted values. This is recommended for sensitive
data.
Example:
```java
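import io.github.rabobank.shadow_tool.ShadowFlow;
import io.github.rabobank.shadow_tool.ShadowFlow.ShadowFlowBuilder;
import org.bouncycastle.util.encoders.Hex;

import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.util.List;
import java.util.function.Supplier;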

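public class BackendService {

// Assumed example values: this snippet does not define these constants, so adjust them to your own setup.
private static final String ALGORITHM = "AES";
private static final String ALGORITHM_MODE = "AES/GCM/NoPadding";
private static final int GCM_SIV_IV_SIZE = 12; // IV length in bytes (96 bits)
private static final int MAC_SIZE_IN_BITS = 128;

public Dummy callBackend() {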
// Create a Cipher instance
Cipher cipher = null;
try {
// The AES key (16, 24, or 32 bytes)
final var keyBytes = Hex.decodeStrict("3d7e0c4f8fbbd8d8a79e76cabc8f4e24");
final var secretKey = new SecretKeySpec(keyBytes, ALGORITHM);

// Initialization Vector (IV) for GCM
final var iv = Hex.decodeStrict("3d7e0c4f8fbb"); // 96 bits IV
if (iv.length != GCM_SIV_IV_SIZE) {
throw new IllegalArgumentException("Initialization Vector should be 12 bytes / 96 bits");
}

// Create AEADParameterSpec
final var gcmParameterSpec = new GCMParameterSpec(MAC_SIZE_IN_BITS, iv);
// Create Cipher instance with the specified algorithm and provider
cipher = Cipher.getInstance(ALGORITHM_MODE);

// Initialize the Cipher for encryption or decryption
cipher.init(Cipher.ENCRYPT_MODE, secretKey, gcmParameterSpec);
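} catch (GeneralSecurityException e) {
// Fail fast: continuing would pass a null Cipher to the builder below
throw new IllegalStateException("Could not initialize the cipher", e);
}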

// Create a ShadowFlow instance with DefaultEncryptionService
// The 10 means that for 10% of all requests, the `newBackend` is invoked as well and its response is compared against the `currentBackend` response.
ShadowFlow<Dummy> shadowFlow = new ShadowFlowBuilder<Dummy>(10).withCipher(cipher).build();

// Define your current backend and new backend suppliers
Supplier<Dummy> currentBackend = () -> {
// Your current backend logic here
return new Dummy("Bob", "Utrecht", List.of("Mirabel", "Bruno"));
};

Supplier<Dummy> newBackend = () -> {
// Your new backend logic here
return new Dummy("Bob", "Amsterdam", List.of("Bruno", "Mirabel", "Mirabel"));
};

// The result is always from the first supplier. So in this case, the return value always yields the response of the `currentBackend` service.
return shadowFlow.compare(currentBackend, newBackend);
}
}
```
3. **PublicKey encryption**: The differences are logged as encrypted values. This is recommended for sensitive data.
Example:
```java
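import io.github.rabobank.shadow_tool.ShadowFlow;
import io.github.rabobank.shadow_tool.ShadowFlow.ShadowFlowBuilder;

import java.security.KeyFactory;
import java.security.NoSuchAlgorithmException;
import java.security.PublicKey;
import java.security.spec.InvalidKeySpecException;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;
import java.util.List;
import java.util.function.Supplier;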

public class BackendService {

public Dummy callBackend() {
final PublicKey publicKey;
try {
publicKey = KeyFactory.getInstance("RSA")
.generatePublic(new X509EncodedKeySpec(Base64.getDecoder().decode("MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArmkP2CgDn3OsuIj1GxM3")));
} catch (InvalidKeySpecException | NoSuchAlgorithmException e) {
throw new RuntimeException(e);
}

// Create a ShadowFlow instance with PublicKeyEncryptionService
// The 10 means that for 10% of all requests, the `newBackend` is invoked as well and its response is compared against the `currentBackend` response.
ShadowFlowBuilder<Dummy> builder = new ShadowFlowBuilder<>(10);
builder.withEncryption(publicKey);

ShadowFlow<Dummy> shadowFlow = builder.build();

// Define your current backend and new backend suppliers
Supplier<Dummy> currentBackend = () -> {
// Your current backend logic here
return new Dummy("Bob", "Utrecht", List.of("Mirabel", "Bruno"));
};

Supplier<Dummy> newBackend = () -> {
// Your new backend logic here
return new Dummy("Bob", "Amsterdam", List.of("Bruno", "Mirabel", "Mirabel"));
};
// The result is always from the first supplier. So in this case, the return value always yields the response of the `currentBackend` service.
return shadowFlow.compare(currentBackend, newBackend);
}
}
```
3. To create a public key and a private key (used for decryption), run the following command:
```bash
openssl genrsa -out pair.pem 2048 && openssl rsa -in pair.pem -pubout -out public.key && openssl pkcs8 -topk8 -inform PEM -outform PEM -nocrypt -in pair.pem -out private.key && rm -rf pair.pem
```
Keep the private key secret. When the data is sensitive, nobody other than you should be able to inspect these values.
To create a `java.security.PublicKey`, you can use the code below (add the dependency `org.bouncycastle:bcprov-jdk15on`):
```java
import java.io.File;
import java.io.StringReader;
import java.nio.file.Files;
import java.security.KeyFactory;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;
import java.util.Objects;
import org.bouncycastle.util.io.pem.PemReader;

private static PublicKey publicKey() throws Exception {
final var publicKeyFile = new File(Objects.requireNonNull(EncryptionServiceTest.class.getClassLoader().getResource("public.key")).getFile());
final var reader = new StringReader(Files.readString(publicKeyFile.toPath()));
final var pemReader = new PemReader(reader);
final var factory = KeyFactory.getInstance("RSA");
final var pemObject = pemReader.readPemObject();
final var keyContentAsBytesFromBC = pemObject.getContent();
final var pubKeySpec = new X509EncodedKeySpec(keyContentAsBytesFromBC);
return factory.generatePublic(pubKeySpec);
}
```
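
The `public.key` file written by the `openssl rsa -pubout` command above is a PEM-wrapped X.509 public key, so it can also be read with the JDK alone. The following is a minimal sketch under that assumption. The class `PemPublicKeySketch` and its methods are hypothetical helpers for this example, not part of the Shadow Tool.

```java
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;

// Hypothetical helper for illustration; not part of the Shadow Tool.
final class PemPublicKeySketch {

    // Parses a "-----BEGIN PUBLIC KEY-----" PEM string into an RSA public key.
    static PublicKey parse(String pem) {
        String base64 = pem
                .replace("-----BEGIN PUBLIC KEY-----", "")
                .replace("-----END PUBLIC KEY-----", "")
                .replaceAll("\\s", ""); // drop line breaks and spaces
        byte[] der = Base64.getDecoder().decode(base64);
        try {
            return KeyFactory.getInstance("RSA").generatePublic(new X509EncodedKeySpec(der));
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("Not a valid RSA public key", e);
        }
    }

    // Writes a public key in the same PEM layout, 64 characters per line.
    static String toPem(PublicKey key) {
        String body = Base64.getMimeEncoder(64, new byte[] {'\n'}).encodeToString(key.getEncoded());
        return "-----BEGIN PUBLIC KEY-----\n" + body + "\n-----END PUBLIC KEY-----\n";
    }

    // Generates a throwaway 2048-bit key so the sketch runs without any files.
    static PublicKey generate() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair().getPublic();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String pem = toPem(generate());
        PublicKey parsed = parse(pem);
        System.out.println(parsed.getAlgorithm() + " key in " + parsed.getFormat() + " format");
    }
}
```

In an application you would pass the contents of `public.key` to `parse` instead of generating a key.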
## How to use?

```java
ShadowFlow<AccountInfo> shadowFlow = new ShadowFlowBuilder<AccountInfo>(10)
.withInstanceName("account-service") // Optional. Default value is 'default'
.withEncryption(publicKey) // Optional. A java.security.PublicKey; see the configuration above for generating these secrets.
.build();

AccountInfo result = shadowFlow.compare(
() -> yourCurrentBackend.getAccountInfo(),
() -> yourNewBackend.getAccountInfo()
);
```

The result is always from the first supplier. So in this case, `result` always yields the response of
the `yourCurrentBackend` service.

The 10 means that for 10% of all requests, the `yourNewBackend` is invoked as well and its response is compared against
the `yourCurrentBackend` response.
This happens asynchronously, so it will not have impact on the main flow performance-wise.
The Shadow Tool invokes both services asynchronously, so it does not affect the performance of the main flow.
Be aware that the more often the Shadow Tool runs, the more resources your application uses and the more often the back-ends are called.
Take care to not set this number too high for high-traffic applications.
Be careful not to set this number too high for high-traffic applications.

To be able to compare apples with apples, both services are required to return the same domain classes.
In the example above, we called it `AccountInfo`.
Also, since the secondary call is already mapped to the correct domain, it is super easy to finish the migration: just
replace the first call with the secondary call and remove the Shadow Tool code.
For a fair comparison, both services are required to return the same domain classes.
In the example above, we called it `Dummy`.
Also, since the secondary call is already mapped to the correct domain, completing the migration is straightforward:
simply replace the first call with the secondary call and remove the Shadow Tool code.

You are able to distinguish the results of multiple shadow flows running in your application by setting an instance
name.
You can distinguish the results of multiple shadow flows running in your application by setting an instance name.
This will be part of the log messages.

#### Reactive

The Shadow Tool also provides a reactive API based on Project Reactor.

```java
class AccountInfoService {
class MyService {
// fields and constructor

public Mono<AccountInfo> getAccountInfo() {
public Mono<Dummy> getDummy() {
return shadowFlow.compare(
getAccountInfoFromCurrent(),
getAccountInfoFromNew()
getDummyFromCurrent(),
getDummyFromNew()
);
}

private Mono<AccountInfo> getAccountInfoFromCurrent() {
return yourCurrentBackend.getAccountInfoMono()
private Mono<Dummy> getDummyFromCurrent() {
return yourCurrentBackend.getDummyMono()
.map(...);
}

private Mono<AccountInfo> getAccountInfoFromNew() {
return yourNewBackend.getAccountInfoMono()
private Mono<Dummy> getDummyFromNew() {
return yourNewBackend.getDummyMono()
.map(...);
}
}
```

## Logs

The Shadow Tool logs whenever it finds differences between the two flows.
It will always log the field names of the objects containing the differences, and it can also log the values when
encryption is set up.
Something like the following can be expected:
The Shadow Tool logs any differences it finds between the two flows.
It always logs the field names of the objects containing the differences,
and it can also log the values when encryption is set up.
You can expect output similar to the following:

```
# Without Encryption enabled
@@ -188,10 +238,10 @@ The following differences were found: firstName, lastName. Encrypted values: 6U8
```
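
To make the idea of "field names of the objects containing the differences" concrete, here is a small reflection-based sketch. It is only an illustration and says nothing about how the Shadow Tool compares objects internally. The class `FieldDiffSketch`, the `Person` type, and the printed message are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration; the Shadow Tool's own comparison may work differently.
final class FieldDiffSketch {

    // Example domain class returned by both back-ends.
    static final class Person {
        private final String firstName;
        private final String lastName;
        private final String city;

        Person(String firstName, String lastName, String city) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.city = city;
        }
    }

    // Returns the names of the instance fields whose values differ, sorted alphabetically.
    static List<String> differingFields(Object left, Object right) {
        if (left.getClass() != right.getClass()) {
            throw new IllegalArgumentException("Both objects must be of the same class");
        }
        List<String> names = new ArrayList<>();
        for (Field field : left.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // only compare real instance state
            }
            field.setAccessible(true);
            try {
                if (!Objects.equals(field.get(left), field.get(right))) {
                    names.add(field.getName());
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        Collections.sort(names); // reflection does not guarantee field order
        return names;
    }

    public static void main(String[] args) {
        Person current = new Person("Bob", "Smith", "Utrecht");
        Person migrated = new Person("Rob", "Smyth", "Utrecht");
        System.out.println("Differing fields: " + String.join(", ", differingFields(current, migrated)));
    }
}
```

Only the field names are printed here, which mirrors the case where no encryption is configured and values must not appear in the logs.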

## Inspecting the values of differences
Values are encrypted using the public key which is set up during the configuration.
The algorithm used is RSA with Electronic Codebook (ECB) mode and `OAEPWITHSHA-256ANDMGF1PADDING` padding.
You can create a runnable jar with the following code to decrypt the values. Continuing the example above (explaining how to enable encrypting data):

Values are encrypted using the public key that is set up during the configuration.
The default algorithm for Public Key encryption is RSA with Electronic Codebook (ECB) mode and `OAEPWITHSHA-256ANDMGF1PADDING` padding.

### Example of decrypting values of differences

An example can be found in one of the tests: [EncryptionServiceTest](src/test/java/io/github/rabobank/shadow_tool/EncryptionServiceTest.java).
You can find an example in one of the tests: [EncryptionServiceTest](src/test/java/io/github/rabobank/shadow_tool/EncryptionServiceTest.java).
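
For orientation, the following is a minimal round-trip sketch of RSA encryption with OAEP padding using only the JDK. It rests on assumptions: the class `RsaOaepSketch` is hypothetical, and the ciphertext is Base64-encoded here for readability, which may not match the encoding the Shadow Tool uses in its log lines. The linked test remains the reference for decrypting real log values.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Base64;
import javax.crypto.Cipher;

// Hypothetical illustration of the RSA/OAEP round trip; not the Shadow Tool's code.
final class RsaOaepSketch {
    // Standard JCA transformation name for RSA with OAEP (SHA-256) padding.
    static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // Encrypts with the public key. A 2048-bit key fits at most 190 bytes of plaintext with this padding.
    static String encrypt(String plainText, PublicKey publicKey) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, publicKey);
            byte[] encrypted = cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(encrypted); // assumed text encoding
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    // Decrypts with the matching private key, which must be kept secret.
    static String decrypt(String base64CipherText, PrivateKey privateKey) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, privateKey);
            byte[] decrypted = cipher.doFinal(Base64.getDecoder().decode(base64CipherText));
            return new String(decrypted, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }

    // Generates a throwaway 2048-bit key pair so the sketch runs without key files.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair pair = newKeyPair();
        String encrypted = encrypt("'city' changed: 'Utrecht' replaced by 'Amsterdam'", pair.getPublic());
        System.out.println(decrypt(encrypted, pair.getPrivate()));
    }
}
```

Security providers can interpret this transformation name slightly differently (for example in the digest used for MGF1), so it is safest to decrypt with the same provider that performed the encryption.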