Add multiple links to index and fix typos (#3572)
* Add multiple links to index and fix typos

* toc: update capitalization

* Update a link
lilin90 authored Aug 6, 2020
1 parent 6f54f84 commit bc4cfb5
Showing 3 changed files with 22 additions and 21 deletions.
2 changes: 1 addition & 1 deletion TOC.md
@@ -219,7 +219,7 @@
+ [Enable TLS Between TiDB Clients and Servers](/enable-tls-between-clients-and-servers.md)
+ [Enable TLS Between TiDB Components](/enable-tls-between-components.md)
+ [Generate Self-signed Certificates](/generate-self-signed-certificates.md)
+ [Encryption-At-Rest](/encryption-at-rest.md)
+ [Encryption at Rest](/encryption-at-rest.md)
+ Privileges
+ [Security Compatibility with MySQL](/security-compatibility-with-mysql.md)
+ [Privilege Management](/privilege-management.md)
30 changes: 15 additions & 15 deletions encryption-at-rest.md
@@ -1,28 +1,28 @@
---
title: Encryption-At-Rest for TiKV
summary: Learn how to enable encryption-at-rest to protect sensitive data.
aliases: ['/docs/dev/encryption-at-rest/']
title: Encryption at Rest for TiKV
summary: Learn how to enable encryption at rest to protect sensitive data.
aliases: ['/docs/dev/encryption at rest/']
---

# Encryption-At-Rest for TiKV <span class="version-mark">New in v4.0.0</span>
# Encryption at Rest for TiKV <span class="version-mark">New in v4.0.0</span>

Encryption-at-rest means that data is encrypted when it is stored. For databases, this feature is also referred to as TDE (transparent data encryption). This is opposed to encryption in flight (TLS) or encryption in use (rarely used). Different things could be doing encryption-at-rest (SSD drive, file system, cloud vendor, etc), but by having TiKV do the encryption before storage this helps ensure that attackers must authenticate with the database to gain access to data. For example, when an attacker gains access to the physical machine, data cannot be accessed by copying files on disk.
Encryption at rest means that data is encrypted when it is stored. For databases, this feature is also referred to as TDE (transparent data encryption). This is opposed to encryption in flight (TLS) or encryption in use (rarely used). Different things could be doing encryption at rest (SSD drive, file system, cloud vendor, etc), but by having TiKV do the encryption before storage this helps ensure that attackers must authenticate with the database to gain access to data. For example, when an attacker gains access to the physical machine, data cannot be accessed by copying files on disk.

TiKV supports encryption-at-rest starting from v4.0.0. The feature allows TiKV to transparently encrypt data files using [AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) in [CTR](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation) mode. To enable encryption-at-rest, an encryption key must be provided by user and this key is called master key. The master key can be provided via AWS KMS (recommended), or specifying a key stored as plaintext in a file. TiKV automatically rotates data keys that it used to encrypt actual data files. Manually rotating the master key can be done occassionally. Note that encryption-at-rest only encrypts data at rest (i.e. on disk) and not while data is transferred over network. It is advised to use TLS together with encryption-at-rest.
TiKV supports encryption at rest starting from v4.0.0. The feature allows TiKV to transparently encrypt data files using [AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) in [CTR](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation) mode. To enable encryption at rest, an encryption key must be provided by user and this key is called master key. The master key can be provided via AWS KMS (recommended), or specifying a key stored as plaintext in a file. TiKV automatically rotates data keys that it used to encrypt actual data files. Manually rotating the master key can be done occasionally. Note that encryption at rest only encrypts data at rest (i.e. on disk) and not while data is transferred over network. It is advised to use TLS together with encryption at rest.

Also from v4.0.0, BR supports S3 server-side encryption (SSE) when backing up to S3. A customer owned AWS KMS key can also be used together with S3 server-side encrytion.
Also from v4.0.0, BR supports S3 server-side encryption (SSE) when backing up to S3. A customer owned AWS KMS key can also be used together with S3 server-side encryption.

## Caveats

The current version of TiKV encryption has some drawbacks that we are working actively to address in the future versions.

* When a TiDB cluster is deployed, the majority of user data is stored in TiKV nodes, and that data will be encrypted when encryption is enabled. However, a small amount of user data is stored in PD nodes as metadata (for example, secondary index keys used as TiKV region boundaries). As of v4.0.0, PD doesn't support encryption-at-rest. It is recommended to use storage-level encryption (for example, file system encryption) to help protect sensitive data stored in PD.
* As of v4.0.0, TiFlash doesn't support encryption-at-rest. When deploying TiKV with TiFlash, data stored in TiFlash is not encrypted.
* TiKV currently does not exclude encryption keys and user data from core dumps. It is advised to disable core dumps for the TiKV process when using encryption-at-rest. This is not currently handled by TiKV itself.
* When a TiDB cluster is deployed, the majority of user data is stored in TiKV nodes, and that data will be encrypted when encryption is enabled. However, a small amount of user data is stored in PD nodes as metadata (for example, secondary index keys used as TiKV region boundaries). As of v4.0.0, PD doesn't support encryption at rest. It is recommended to use storage-level encryption (for example, file system encryption) to help protect sensitive data stored in PD.
* As of v4.0.0, TiFlash doesn't support encryption at rest. When deploying TiKV with TiFlash, data stored in TiFlash is not encrypted.
* TiKV currently does not exclude encryption keys and user data from core dumps. It is advised to disable core dumps for the TiKV process when using encryption at rest. This is not currently handled by TiKV itself.
* TiKV tracks encrypted data files using the absolute path of the files. As a result, once encryption is turned on for a TiKV node, the user should not change data file paths config such as `storage.data-dir`, `raftstore.raftdb-path`, `rocksdb.wal-dir` and `raftdb.wal-dir`.
* TiKV info log contains user data for debugging purposes. The info log and this data in it are not encrypted.

## TiKV Encryption-At-Rest
## TiKV encryption at rest

### Overview

@@ -99,15 +99,15 @@ region = "us-west-2"

### Monitoring and Debugging

To monitor encryption-at-rest, if you deploy TiKV with Grafana, you can look at the **Encryption** panel in the **TiKV-Details** dashboard. There are a few metrics to look for:
To monitor encryption at rest, if you deploy TiKV with Grafana, you can look at the **Encryption** panel in the **TiKV-Details** dashboard. There are a few metrics to look for:

* Encryption initialized: 1 if encryption is initialized during TiKV startup, 0 otherwise. In case of master key rotation, after encryption is initialized, TiKV do not need access to the previous master key.
* Encryption data keys: number of existings data keys. The number is bumped by 1 after each time data key rotation happened. Use this metrics to monitor if data key rotation works as expected.
* Encrypted files: number of encrypted data files currently exists. Compare this number to existings data files in the data directory to estimate portion of data being encrypted, when turning on encryption for a previously unencrypted cluster.
* Encryption data keys: number of existing data keys. The number is bumped by 1 after each time data key rotation happened. Use this metrics to monitor if data key rotation works as expected.
* Encrypted files: number of encrypted data files currently exists. Compare this number to existing data files in the data directory to estimate portion of data being encrypted, when turning on encryption for a previously unencrypted cluster.
* Encryption meta file size: size of the encryption meta data files.
* Read/Write encryption meta duration: the extra overhead to operate on metadata for encryption.

For debugging, the `tikv-ctl` command can be used to dump encryption metadata such as encryption method and data key id used to encryption the file, as well as list of data keys. Since the operation can expose senstive data, it is not recommended to use in production. Please refer to [TiKV Control](/tikv-control.md#dump-encryption-metadata] document.
For debugging, the `tikv-ctl` command can be used to dump encryption metadata such as encryption method and data key id used to encryption the file, as well as list of data keys. Since the operation can expose sensitive data, it is not recommended to use in production. Please refer to [TiKV Control](/tikv-control.md#dump-encryption-metadata] document.

## BR S3 server-side encryption

11 changes: 6 additions & 5 deletions whats-new-in-tidb-4.0.md
@@ -14,7 +14,7 @@ TiDB v4.0 was officially released on May 28th, 2020. In this release, we have ma

+ Cascading Placement Rules is an experimental feature of the Placement Driver (PD) introduced in v4.0. It is a replica rule system that guides PD to generate corresponding schedules for different types of data. By combining different scheduling rules, you can finely control the attributes of any continuous data range, such as the number of replicas, the storage location, the host type, whether to participate in Raft election, and whether to act as the Raft leader. See [Cascading Placement Rules](/configure-placement-rules.md) for details.
+ Elastic scheduling is an experimental feature based on Kubernetes, which enables TiDB to dynamically scale in and out nodes. This feature can effectively mitigate the high workload during peak hours of an application and saves unnecessary overhead.
+ Hotspot scheduling policy supports more dimensions. In addition to using write or read traffic as the scheduling basis, keys are introduced as a new dimension for the scheduling policy, which might, to a large extent, mitigate the CPU usage imbalance caused by the previous single-dimensional policy.
+ Hotspot scheduling policy supports more dimensions. In addition to using write or read traffic as the scheduling basis, keys are introduced as a new dimension for the scheduling policy, which might, to a large extent, mitigate the CPU usage imbalance caused by the previous single-dimensional policy. See [TiDB Scheduling](/tidb-scheduling.md) for details.

### Storage engine

@@ -45,7 +45,7 @@ TiUP is a new package manager tool introduced in v4.0 that is used to manage all
### Transaction

- The pessimistic transaction is now provided for general availability as the default transaction mode. Support the Read Committed isolation level and the `SELECT FOR UPDATE NOWAIT` syntax. See [Pessimistic Transaction Model](/pessimistic-transaction.md) for details.
- Support large transactions. Increase the upper limit on transaction size from 10 MB to 10 GB. Support both the pessimistic transaction and optimistic transaction.
- Support large transactions. Increase the upper limit on transaction size from 10 MB to 10 GB. Support both the pessimistic transaction and optimistic transaction. See [Transaction size limit](/transaction-overview.md#transaction-size-limit) for details.

### SQL features

@@ -59,7 +59,7 @@ TiUP is a new package manager tool introduced in v4.0 that is used to manage all
- Support using the Index Merge feature to access tables. When you make a query on a single table, the TiDB optimizer automatically reads multiple index data according to the query condition and makes a union of the result, which improves the performance of querying on a single table. See [Index Merge](/query-execution-plan.md#indexmerge-example) for details.
- Support the expression index feature (**experimental**). The expression index is also called the function-based index. When you create an index, the index fields do not have to be a specific column but can be an expression calculated from one or more columns. This feature is useful for quickly accessing the calculation-based tables. See [Expression index](/sql-statements/sql-statement-create-index.md) for details.
- Support `AUTO_RANDOM` keys as an extended syntax for the TiDB columnar attribute (**experimental**). `AUTO_RANDOM` is designed to address the hotspot issue caused by the auto-increment column and provides a low-cost migration solution from MySQL for users who work with auto-increment columns. See [`AUTO_RANDOM` Key](/auto-random.md) for details.
- Add system tables that provide information of cluster topology, configuration, logs, hardware, operating systems, and slow queries, which helps DBAs to quickly learn, analyze system metrics. See [SQL Diagnosis](/information-schema/information-schema-sql-diagnostics.md) for details.
- Add system tables that provide information of cluster topology, configuration, logs, hardware, operating systems, and slow queries, which helps DBAs to quickly learn, analyze system metrics. See [Information Schema](/information-schema/information-schema.md) and [SQL Diagnosis](/information-schema/information-schema-sql-diagnostics.md) for details.

- Add system tables that provide information of cluster topology, configuration, logs, hardware, operating systems to help DBAs quickly learn the cluster configuration and status:
- The `cluster_info` table that stores the cluster topology information.
@@ -78,8 +78,9 @@ Support case-insensitive and accent-insensitive `utf8mb4_general_ci` and `utf8_g

### Security

+ Improve the encrypted communication between the client and server, and between components, which ensures data security and prevents any sent and received data from being read and modified by illegal hackers. Mainly support the certificate-based login authentication, updating certificate online, and verifying the CommonName attribute of the TLS certificate.
+ Transparent Data Encryption (TDE) is a new feature that provides protection for the entire database. This feature, when enabled, is transparent to applications that are connected to TiDB and does not require any change to the existing applications. Because this TDE feature operates at the file level, TiDB encrypts data before writing data to disk, and decrypts data before reading data from memory to ensure data security. Currently, the AES128-CTR, AES192-CTR, and AES256-CTR encryption algorithms are supported. You can manage keys via AWS KMS.
+ Improve the encrypted communication between the client and server, and between components, which ensures data security and prevents any sent and received data from being read and modified by illegal hackers. Mainly support the certificate-based login authentication, updating certificate online, and verifying the CommonName attribute of the TLS certificate. See [Enable TLS Between TiDB Clients and Servers](/enable-tls-between-clients-and-servers.md) for details.

+ Transparent Data Encryption (TDE) is a new feature that provides protection for the entire database. This feature, when enabled, is transparent to applications that are connected to TiDB and does not require any change to the existing applications. Because this TDE feature operates at the file level, TiDB encrypts data before writing data to disk, and decrypts data before reading data from memory to ensure data security. Currently, the AES128-CTR, AES192-CTR, and AES256-CTR encryption algorithms are supported. You can manage keys via AWS KMS. See [Encryption at Rest](/encryption-at-rest.md) for details.

### Backup and Restore
