Update TiKV store status information #6949

Merged on Mar 31, 2022 (12 commits)
4 changes: 4 additions & 0 deletions faq/deploy-and-maintain-faq.md
@@ -266,6 +266,10 @@ PD can tolerate any synchronization error, but a larger error value means a larg

The client can only access the cluster through TiDB. TiDB connects to PD and TiKV, both of which are transparent to the client. When TiDB connects to any PD, that PD tells TiDB which PD is the current leader. If the connected PD is not the leader, TiDB reconnects to the leader PD.

#### What is the relationship between each status (Up, Disconnect, Offline, Down, Tombstone) of a TiKV store?

You can use PD Control to check the status of a TiKV store. To see how the statuses relate to one another, refer to [Relationship between each status of a TiKV store](/tidb-scheduling.md#information-collection).
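
As a rough illustration of checking store status programmatically, the following sketch summarizes store information that has been saved as JSON. The sample payload, its field names (`stores`, `store`, `state_name`, `status`, `leader_count`, `region_count`), the addresses, and the helper `summarize_stores` are assumptions made for this example; compare them with the output of your own PD Control version before relying on them.

```python
import json

# Illustrative sample only. The structure and field names are assumptions
# for this sketch, not a guaranteed PD Control output format.
SAMPLE_STORE_OUTPUT = """
{
  "count": 2,
  "stores": [
    {
      "store": {"id": 1, "address": "tikv-1:20160", "state_name": "Up"},
      "status": {"leader_count": 12, "region_count": 30}
    },
    {
      "store": {"id": 4, "address": "tikv-2:20160", "state_name": "Offline"},
      "status": {"leader_count": 0, "region_count": 3}
    }
  ]
}
"""


def summarize_stores(raw_json):
    """Map each store ID to (status name, leader count, region count)."""
    data = json.loads(raw_json)
    summary = {}
    for item in data.get("stores", []):
        store = item.get("store", {})
        status = item.get("status", {})
        summary[store.get("id")] = (
            store.get("state_name"),
            status.get("leader_count", 0),
            status.get("region_count", 0),
        )
    return summary


if __name__ == "__main__":
    for store_id, (state, leaders, regions) in summarize_stores(SAMPLE_STORE_OUTPUT).items():
        print(f"store {store_id}: {state}, leaders={leaders}, regions={regions}")
```

In this sample, store 4 is still Offline because it holds Regions that have not yet been moved away.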

#### What is the difference between the `leader-schedule-limit` and `region-schedule-limit` scheduling parameters in PD?

- The `leader-schedule-limit` scheduling parameter is used to balance the number of Leaders across TiKV servers, which affects the load of query processing.
Binary file added media/tikv-store-status-relationship.png
3 changes: 2 additions & 1 deletion pd-control.md
@@ -840,7 +840,8 @@ Usage:

> **Note:**
>
> When you use the `store limit` command, the original `region-add` and `region-remove` are deprecated. Use `add-peer` and `remove-peer` instead.
> - When you use the `store limit` command, the original `region-add` and `region-remove` are deprecated. Use `add-peer` and `remove-peer` instead.
> - You can use PD Control to check the status (Up, Disconnect, Offline, Down, or Tombstone) of a TiKV store. To see how the statuses relate to one another, refer to [Relationship between each status of a TiKV store](/tidb-scheduling.md#information-collection).

### `log [fatal | error | warn | info | debug]`

10 changes: 10 additions & 0 deletions tidb-scheduling.md
@@ -77,6 +77,16 @@ Scheduling is based on information collection. In short, the PD scheduling compo
* Whether the store is overloaded
* Labels (See [Perception of Topology](/schedule-replicas-by-topology-labels.md))

You can use PD Control to check the status of a TiKV store, which is one of Up, Disconnect, Offline, Down, or Tombstone. The statuses relate to one another as follows:

+ **Up**: The current TiKV store is providing service.
+ **Disconnect**: When PD receives no heartbeat from the TiKV store for more than 20 seconds, the store status changes to Disconnect. If the heartbeat stays lost for longer than the time specified by `max-store-down-time`, the store status changes to Down.
+ **Down**: When the TiKV store has been disconnected from the cluster for longer than the time specified by `max-store-down-time` (30 minutes by default), the store status changes to Down. In this status, the peers of each Region that were on the store start to be replenished on the surviving stores.
+ **Offline**: When a TiKV store is manually taken offline through PD Control, the store status changes to Offline. This is only an intermediate status while the store is being taken offline. A store in this status undergoes leader transfer and Region balance operations. When `leader_count` and `region_count` (obtained through PD Control) both show that the transfer or balance operation is complete, the store status changes from Offline to Tombstone. In the Offline status, do not shut down the store service or the physical server where the store is located.
+ **Tombstone**: This status indicates that the TiKV store is completely offline. You can use the `remove-tombstone` interface to safely clean up TiKV stores in this status.

![TiKV store status relationship](/media/tikv-store-status-relationship.png)
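
The transitions described above can be modeled with a small sketch. This is not PD's implementation: `classify_store` and its parameters are hypothetical names, and two simplifications are assumed for illustration, namely that a store being taken offline is judged before any heartbeat check, and that the transfer is "complete" once both `leader_count` and `region_count` reach zero. Only the 20-second threshold and the 30-minute default of `max-store-down-time` come from the text.

```python
from enum import Enum

DISCONNECT_AFTER_SECONDS = 20          # threshold stated in the text
DEFAULT_MAX_STORE_DOWN_TIME = 30 * 60  # 30 minutes, the default stated in the text


class StoreStatus(Enum):
    UP = "Up"
    DISCONNECT = "Disconnect"
    DOWN = "Down"
    OFFLINE = "Offline"
    TOMBSTONE = "Tombstone"


def classify_store(seconds_since_heartbeat, taken_offline=False,
                   leader_count=0, region_count=0,
                   max_store_down_time=DEFAULT_MAX_STORE_DOWN_TIME):
    """Hypothetical helper that mirrors the status descriptions above."""
    if taken_offline:
        # Assumption: the transfer is finished when no leaders or Regions remain.
        if leader_count == 0 and region_count == 0:
            return StoreStatus.TOMBSTONE
        return StoreStatus.OFFLINE
    if seconds_since_heartbeat > max_store_down_time:
        return StoreStatus.DOWN
    if seconds_since_heartbeat > DISCONNECT_AFTER_SECONDS:
        return StoreStatus.DISCONNECT
    return StoreStatus.UP


if __name__ == "__main__":
    print(classify_store(5).value)
    print(classify_store(60).value)
    print(classify_store(31 * 60).value)
    print(classify_store(5, taken_offline=True, leader_count=3, region_count=10).value)
    print(classify_store(5, taken_offline=True).value)
```

Under these assumptions, a store whose last heartbeat is 60 seconds old is classified as Disconnect, and it moves to Down only after the gap exceeds `max_store_down_time`.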

- Information reported by Region leaders:

Each Region leader sends heartbeats to PD periodically to report [`RegionState`](https://github.com/pingcap/kvproto/blob/master/proto/pdpb.proto#L312), including: