Commit 008229f

Ben Mansheim (bmansheim) authored and committed
RED-38248 Slave HA changes for 5.6 (#708)
1 parent 3871f62 commit 008229f

File tree: content/rs/administering/database-operations/slave-ha.md

1 file changed: +31 −32 lines changed

content/rs/administering/database-operations/slave-ha.md

Lines changed: 31 additions & 32 deletions
@@ -5,44 +5,42 @@ weight: $weight
 alwaysopen: false
 categories: ["RS"]
 ---
-When you enable [database replication]({{< relref "/rs/concepts/high-availability/replication.md" >}})
-for your database, RS replicates your data to a slave node to make sure that your
-data is highly available. Whether the slave node fails or the master node fails
-and the slave is promoted to master, the remaining master node is a
-single point of failure.
+When you enable [database replication]({{< relref "/rs/concepts/high-availability/replication.md" >}}) for your database,
+RS replicates your data to a slave node to make sure that your data is highly available.
+If the slave node fails or if the master node fails and the slave is promoted to master,
+the remaining master node is a single point of failure.
 
-You can configure high availability for slave shards (slave HA) so that the cluster
-automatically migrates the slave shards to another available node. In practice, slave
-migration creates a new slave shard and replicates the data from the master shard to the
-new slave shard. For example:
+You can configure high availability for slave shards (slave HA) so that the cluster automatically migrates the slave shards to another available node.
+In practice, slave migration creates a new slave shard and replicates the data from the master shard to the new slave shard.
+For example:
 
 1. Node:2 has a master shard and node:3 has the corresponding slave shard.
 1. Either:
 
    - Node:2 fails and the slave shard on node:3 is promoted to master.
-   - Node:3 fails and the master shard is no longer replicated.
+   - Node:3 fails and the master shard is no longer replicated to the slave shard on the failed node.
 
-1. If slave HA is enabled, a new slave shard is created on an available node
-that does not hold the master shard.
+1. If slave HA is enabled, a new slave shard is created on an available node that does not hold the master shard.
 
    All of the constraints of shard migration apply, such as [rack-awareness]({{< relref "/rs/concepts/high-availability/rack-zone-awareness.md" >}}).
 
 1. The data from the master shard is replicated to the new slave shard.
 
 ## Configuring High Availability for Slave Shards
 
-You can enable slave HA using rladmin or using the REST API either for:
+Using rladmin or the REST API, slave HA is controlled on the database level and on the cluster level.
+You can enable or disable slave HA for a database or for the entire cluster.
 
-- Cluster - All databases in the cluster use slave HA
-- Database - Only the specified database uses slave HA
+When slave HA is enabled for both the cluster and a database,
+slave shards for that database are automatically migrated to another node in the event of a master or slave shard failure.
+If slave HA is disabled at the cluster level,
+slave HA will not migrate slave shards even if slave HA is enabled for a database.
 
-By default, slave HA is set to disabled at the cluster level and enabled at the
-database level, with the cluster level overriding, so that:
+By default, slave HA is enabled for the cluster and disabled for each database. To enable slave HA for a database, enable slave HA for that database.
 
-- To enable slave HA for all databases in the cluster - Enable slave HA for the cluster
-- To enable slave HA for only specified databases in the cluster:
-  1. Enable slave HA for the cluster
-  1. Disable slave HA for the databases for which you do not want slave HA enabled
+{{% note %}}
+For Active-Active databases, slave HA is enabled for the database by default to make sure that slave shards are available for Active-Active replication.
+{{% /note %}}
 
 To enable slave HA for a cluster using rladmin, run:
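As a sketch of usage (the enable command itself falls in the lines elided between these hunks, so the `slave_ha enabled` syntax below is an assumption modeled on the other `rladmin tune` commands on this page, and database uid 5 is hypothetical):

```src
# Assumed syntax; the actual enable commands are not shown in this diff
rladmin tune cluster slave_ha enabled
rladmin tune db 5 slave_ha enabled
```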

@@ -58,22 +56,24 @@ You can see the current configuration options for slave HA with: `rladmin info cluster`
 
 ### Grace Period
 
-By default, slave HA has a 15-minute grace period after node failure and before new slave shards are created.
+By default, slave HA has a 10-minute grace period after node failure and before new slave shards are created.
 To configure this grace period from rladmin, run:
 
    rladmin tune cluster slave_ha_grace_period <time_in_seconds>
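For example, to shorten the grace period to 5 minutes, pass the value in seconds (the value is illustrative):

```src
rladmin tune cluster slave_ha_grace_period 300
```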

 
 ### Shard Priority
 
-Slave shard migration is based on priority so that, in the case of limited memory resources, the most important slave shards are migrated first. Slave HA migrates slave shards for databases according to this order of priority:
+Slave shard migration is based on priority so that, in the case of limited memory resources,
+the most important slave shards are migrated first.
+Slave HA migrates slave shards for databases according to this order of priority:
 
 1. slave_ha_priority - The slave shards of the database with the higher slave_ha_priority
    integer value are migrated first.
 
    To assign priority to a database, run:
 
    ```src
-   rladmin tune db <bdb_uid> slave_ha_priority <positive integer>
+   rladmin tune db <bdb_uid> slave_ha_priority <positive integer>
    ```
 
 1. CRDBs - The CRDB synchronization uses slave shards to synchronize between the replicas.
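For example, to have the slave shards of database uid 5 migrate before those of databases with lower values (both the uid and the priority value are illustrative):

```src
rladmin tune db 5 slave_ha_priority 10
```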
@@ -82,26 +82,25 @@ Slave shard migration is based on priority so that, in the case of limited memory resources,
 
 ### Cooldown Periods
 
-Both the cluster and the database have cooldown periods. After node failure, the cluster
-cooldown period prevents another slave migration due to another node failure for any
-databases in the cluster until the cooldown period ends (Default: 1 hour).
+Both the cluster and the database have cooldown periods.
+After node failure, the cluster cooldown period prevents another slave migration due to another node failure for any
+database in the cluster until the cooldown period ends (Default: 1 hour).
 
-After a database is migrated with slave HA, it cannot go through another slave migration
-due to another node failure until the cooldown period for the database ends (Default: 24
-hours).
+After a database is migrated with slave HA,
+it cannot go through another slave migration due to another node failure until the cooldown period for the database ends (Default: 2 hours).
 
 To configure these cooldown periods from rladmin, run:
 
 - For the cluster:
 
    ```src
-   rladmin tune cluster slave_ha_cooldown_period <time_in_seconds>
+   rladmin tune cluster slave_ha_cooldown_period <time_in_seconds>
    ```
 
 - For all databases in the cluster:
 
    ```src
-   rladmin tune cluster slave_ha_bdb_cooldown_period <time_in_seconds>
+   rladmin tune cluster slave_ha_bdb_cooldown_period <time_in_seconds>
    ```
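For example, to shorten the cluster cooldown to 30 minutes and the per-database cooldown to 1 hour (values are illustrative):

```src
rladmin tune cluster slave_ha_cooldown_period 1800
rladmin tune cluster slave_ha_bdb_cooldown_period 3600
```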
 
 ### Alerts
