Skip to content

Commit

Permalink
encryption-at-rest: Update
Browse files Browse the repository at this point in the history
  • Loading branch information
dveeden committed Aug 13, 2021
1 parent aed94a2 commit 1818ca2
Showing 1 changed file with 34 additions and 11 deletions.
45 changes: 34 additions & 11 deletions encryption-at-rest.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,20 +4,28 @@ summary: Learn how to enable encryption at rest to protect sensitive data.
aliases: ['/docs/dev/encryption at rest/']
---

# Encryption at Rest <span class="version-mark">New in v4.0.0</span>
# Encryption at Rest

> **Warning:**
>
> Using encryption-at-rest for PD is an experimental feature.
Encryption at rest means that data is encrypted when it is stored. For databases, this feature is also referred to as TDE (transparent data encryption). This is opposed to encryption in flight (TLS) or encryption in use (rarely used). Different things could be doing encryption at rest (SSD drive, file system, cloud vendor, etc), but by having TiKV do the encryption before storage this helps ensure that attackers must authenticate with the database to gain access to data. For example, when an attacker gains access to the physical machine, data cannot be accessed by copying files on disk.

TiKV supports encryption at rest starting from v4.0.0. The feature allows TiKV to transparently encrypt data files using [AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) in [CTR](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation) mode. To enable encryption at rest, an encryption key must be provided by user and this key is called master key. The master key can be provided via AWS KMS (recommended), or specifying a key stored as plaintext in a file. TiKV automatically rotates data keys that it used to encrypt actual data files. Manually rotating the master key can be done occasionally. Note that encryption at rest only encrypts data at rest (namely, on disk) and not while data is transferred over network. It is advised to use TLS together with encryption at rest.
TiKV supports encryption at rest. The feature allows TiKV to transparently encrypt data files using [AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) in [CTR](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation) mode. To enable encryption at rest, an encryption key must be provided by user and this key is called master key. The master key can be provided via AWS KMS (recommended), or specifying a key stored as plaintext in a file. TiKV automatically rotates data keys that it used to encrypt actual data files. Manually rotating the master key can be done occasionally. Note that encryption at rest only encrypts data at rest (namely, on disk) and not while data is transferred over network. It is advised to use TLS together with encryption at rest.

TiFlash also supports encryption at rest and while it only stores some metadata PD also supports data at rest. Encryption at rest can be enabled per component.

Using AWS KMS is also possible for on-premise deployments.

Also from v4.0.0, BR supports S3 server-side encryption (SSE) when backing up to S3. A customer owned AWS KMS key can also be used together with S3 server-side encryption.
BR supports S3 server-side encryption (SSE) when backing up to S3. A customer owned AWS KMS key can also be used together with S3 server-side encryption.

## Warnings

The current version of TiKV encryption has the following drawbacks. Be aware of these drawbacks before you get started:

* When a TiDB cluster is deployed, the majority of user data is stored in TiKV nodes, and that data will be encrypted when encryption is enabled. However, a small amount of user data is stored in PD nodes as metadata (for example, secondary index keys used as TiKV region boundaries). As of v4.0.0, PD doesn't support encryption at rest. It is recommended to use storage-level encryption (for example, file system encryption) to help protect sensitive data stored in PD.
* TiFlash supports encryption at rest since v4.0.5. For details, refer to [Encryption at Rest for TiFlash](#encryption-at-rest-for-tiflash-new-in-v405). When deploying TiKV with TiFlash earlier than v4.0.5, data stored in TiFlash is not encrypted.
* When a TiDB cluster is deployed, the majority of user data is stored in TiKV and TiFlash nodes, and that data will be encrypted when encryption is enabled. However, a small amount of user data is stored in PD nodes as metadata (for example, secondary index keys used as TiKV region boundaries). PD also supports encryption at rest.
* TiFlash supports encryption at rest, for details, refer to [Encryption at Rest for TiFlash](#encryption-at-rest-for-tiflash).
* TiKV currently does not exclude encryption keys and user data from core dumps. It is advised to disable core dumps for the TiKV process when using encryption at rest. This is not currently handled by TiKV itself.
* TiKV tracks encrypted data files using the absolute path of the files. As a result, once encryption is turned on for a TiKV node, the user should not change data file paths configuration such as `storage.data-dir`, `raftstore.raftdb-path`, `rocksdb.wal-dir` and `raftdb.wal-dir`.
* TiKV info log contains user data for debugging purposes. The info log and this data in it are not encrypted.
Expand All @@ -39,14 +47,27 @@ Data keys are generated by TiKV and passed to the underlying storage engine (nam

Regardless of data encryption method, data keys are encrypted using AES256 in GCM mode for additional authentication. This required the master key to be 256 bits (32 bytes), when passing from file instead of KMS.

### Key creation

Go to the [AWS KMS](https://console.aws.amazon.com/kms) on the AWS console. Make sure the correct region is selected on the top right corner of your console. Make and click "Create a key". Select "Symmetric" as Key type. After this you can set an alias an description and set tags.

It is also possible to do this with the AWS Cli:

```
aws --region us-west-2 kms create-key
aws --region us-west-2 kms create-alias --alias-name "alias/tidb-tde" --target-key-id 0987dcba-09fe-87dc-65ba-ab0987654321
```

The `key-id` for the second step is giving in the output of the first command.

### Configure encryption

To enable encryption, you can add the encryption section in TiKV's configuration file:
To enable encryption, you can add the encryption section in the configuration files of TiKV and PD:

```
[security.encryption]
data-encryption-method = aes128-ctr
data-key-rotation-period = 7d
data-encryption-method = "aes128-ctr"
data-key-rotation-period = "168h" # 7 days
```

Possible values for `data-encryption-method` are "aes128-ctr", "aes192-ctr", "aes256-ctr" and "plaintext". The default value is "plaintext", which means encryption is not turned on. `data-key-rotation-period` defines how often TiKV rotates the data key. Encryption can be turned on for a fresh TiKV cluster, or an existing TiKV cluster, though only data written after encryption is enabled is guaranteed to be encrypted. To disable encryption, remove `data-encryption-method` in the configuration file, or reset it to "plaintext", and restart TiKV. To change encryption method, update `data-encryption-method` in the configuration file and restart TiKV.
Expand All @@ -61,7 +82,9 @@ region = "us-west-2"
endpoint = "https://kms.us-west-2.amazonaws.com"
```

The `key-id` specifies the key id for the KMS CMK. The `region` is the AWS region name for the KMS CMK. The `endpoint` is optional and doesn't need to be specified normally, unless you are using a AWS KMS compatible service from a non-AWS vendor.
The `key-id` specifies the key id for the KMS CMK. The `region` is the AWS region name for the KMS CMK. The `endpoint` is optional and doesn't need to be specified normally, unless you are using a AWS KMS compatible service from a non-AWS vendor or need to use a [VPC endpoint for KMS](https://docs.aws.amazon.com/kms/latest/developerguide/kms-vpc-endpoint.html).

It is possible to use [multi-Region keys](https://docs.aws.amazon.com/kms/latest/developerguide/multi-region-keys-overview.html). For this you need to setup a primary key in a specific region and add replica keys in the regions you require.

To specify a master key that's stored in a file, the master key configuration would look like the following:

Expand Down Expand Up @@ -142,8 +165,8 @@ When restoring the backup, both `--s3.sse` and `--s3.sse-kms-key-id` should NOT
./br restore full --pd <pd-address> --storage "s3://<bucket>/<prefix> --s3.region <region>"
```

## Encryption at rest for TiFlash <span class="version-mark">New in v4.0.5</span>
## Encryption at rest for TiFlash

TiFlash supports encryption at rest since v4.0.5. Data keys are generated by TiFlash. All files (including data files, schema files, and temporary files) written into TiFlash (including TiFlash Proxy) are encrypted using the current data key. The encryption algorithms, the encryption configuration (in the `tiflash-learner.toml` file) supported by TiFlash, and the meanings of monitoring metrics are consistent with those of TiKV.
TiFlash supports encryption at rest. Data keys are generated by TiFlash. All files (including data files, schema files, and temporary files) written into TiFlash (including TiFlash Proxy) are encrypted using the current data key. The encryption algorithms, the encryption configuration (in the `tiflash-learner.toml` file) supported by TiFlash, and the meanings of monitoring metrics are consistent with those of TiKV.

If you have deployed TiFlash with Grafana, you can check the **TiFlash-Proxy-Details** -> **Encryption** panel.

0 comments on commit 1818ca2

Please sign in to comment.