Showing 14 changed files with 1,428 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,178 @@ | ||
--- | ||
layout: "docs" | ||
page_title: "Setting up Vault Enterprise Replication" | ||
sidebar_current: "docs-guides-replication" | ||
description: |- | ||
Learn how to set up and manage Vault Enterprise Replication. | ||
--- | ||
|
||
If you're unfamiliar with Vault Replication concepts, please first look at the | ||
[general information | ||
page](https://www.vaultproject.io/docs/vault-enterprise/replication/index.html). | ||
More details can be found in the [replication | ||
internals](https://www.vaultproject.io/docs/internals/replication.html) | ||
document. | ||
|
||
Also, note that full details of the API are available for the endpoints relevant to | ||
the | ||
[primary](https://www.vaultproject.io/docs/http/sys-replication-primary.html) | ||
cluster, the | ||
[secondary](https://www.vaultproject.io/docs/http/sys-replication-secondary.html) | ||
cluster, and endpoints [relevant to | ||
both](https://www.vaultproject.io/docs/http/sys-replication.html). | ||
|
||
## Activating Replication | ||
|
||
### Activating the Primary | ||
|
||
To activate the primary, run `vault write -f sys/replication/primary/enable`. | ||
|
||
There is currently one optional argument: `primary_cluster_addr`. This can be | ||
used to override the cluster address that the primary advertises to the | ||
secondary, in case the internal network address/pathing is different between | ||
members of a single cluster and primary/secondary clusters. | ||
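
The same call can be made over the HTTP API. The following is a minimal sketch that only builds the request and does not send it. It assumes Vault's usual `/v1` API prefix and `X-Vault-Token` header, neither of which is spelled out on this page; the addresses, the token, and the helper name `build_enable_primary_request` are hypothetical.

```python
import json
import urllib.request


def build_enable_primary_request(vault_addr, token, primary_cluster_addr=None):
    """Build (but do not send) the POST that enables primary replication."""
    payload = {}
    if primary_cluster_addr is not None:
        # Optional override of the cluster address advertised to secondaries.
        payload["primary_cluster_addr"] = primary_cluster_addr
    return urllib.request.Request(
        url=vault_addr.rstrip("/") + "/v1/sys/replication/primary/enable",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
        method="POST",
    )


req = build_enable_primary_request(
    "https://vault-primary.example.com:8200",  # hypothetical primary address
    "example-token",  # hypothetical Vault token
    primary_cluster_addr="https://lb.example.com:8201",  # hypothetical override
)
```

Passing `req` to `urllib.request.urlopen` would actually submit it; omitting `primary_cluster_addr` sends an empty JSON object, which mirrors the `-f` form of the CLI command.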
|
||
### Fetching a Secondary Token | ||
|
||
To fetch a secondary bootstrap token, run `vault write | ||
sys/replication/primary/secondary-token id=<id>`. | ||
|
||
The value for ID is opaque to Vault and can be any identifying value you want; | ||
this can be used later to revoke the secondary and will be listed when you read | ||
replication status on the primary. You will get back a normal wrapped response, | ||
except that the token will be a JWT instead of UUID-formatted random bytes. | ||
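
Because the bootstrap token is a JWT rather than a UUID, a quick shape check can catch the mistake of pasting an ordinary wrapping token. This is only a sketch: it relies on the general fact that a compact JWT has three dot-separated segments, it does not verify any signature, and the helper name `token_shape` is hypothetical.

```python
import uuid


def token_shape(token):
    """Classify a token as 'uuid', 'jwt-like', or 'unknown' by shape only."""
    try:
        uuid.UUID(token)
        return "uuid"
    except ValueError:
        pass
    # A compact JWT is three non-empty segments separated by dots.
    parts = token.split(".")
    if len(parts) == 3 and all(parts):
        return "jwt-like"
    return "unknown"
```

A result of `"uuid"` suggests the value came from a normal wrapped response rather than from the secondary-token endpoint.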
|
||
### Activating a Secondary | ||
|
||
To activate a secondary, run `vault write sys/replication/secondary/enable | ||
token=<token>`. | ||
|
||
You must provide the full token value. Be very careful when running this | ||
command, as it will destroy all data currently stored in the secondary. | ||
|
||
There are a few optional arguments, with the one you'll most likely need being | ||
`primary_api_addr`, which can be used to override the API address of the | ||
primary cluster; otherwise the secondary will use the value embedded in the | ||
bootstrap token, which is the primary’s redirect address. If the primary has no | ||
redirect address (for instance, if it's not in an HA cluster), you'll need to | ||
set this value at secondary enable time. | ||
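
When scripting this step, it can help to assemble the command in one place so the optional override is not forgotten. The sketch below only builds the argument list for the command shown above and runs nothing; the token, the address, and the helper name `secondary_enable_command` are hypothetical placeholders.

```python
import shlex


def secondary_enable_command(token, primary_api_addr=None):
    """Return the argv list for `vault write sys/replication/secondary/enable`."""
    argv = ["vault", "write", "sys/replication/secondary/enable", f"token={token}"]
    if primary_api_addr is not None:
        # Needed when the primary has no redirect address embedded in the token.
        argv.append(f"primary_api_addr={primary_api_addr}")
    return argv


argv = secondary_enable_command(
    "example.bootstrap.token",  # hypothetical placeholder, not a real token
    primary_api_addr="https://vault-primary.example.com:8200",  # hypothetical
)
print(shlex.join(argv))
```

Handing `argv` to `subprocess.run` would execute the command, which wipes the secondary's existing storage, so the sketch deliberately stops at printing it.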
|
||
Once the secondary is activated and has bootstrapped, it will be ready for | ||
service and will maintain state with the primary. It is safe to seal/shutdown | ||
the primary and/or secondary; when both are available again, they will | ||
synchronize back into a replicated state. | ||
|
||
Note: if the secondary is in an HA cluster, you will need to ensure that each | ||
standby is sealed and then unsealed with the new (primary’s) unseal keys. If a | ||
standby takes over as the active node before this happens, it will seal itself | ||
to remove itself from rotation (e.g. if using Consul for service discovery); a | ||
standby that does not attempt to take over will instead throw errors. We hope | ||
to make this workflow better in a future update. | ||
|
||
### Dev-Mode Root Tokens | ||
|
||
To ease development and testing, when both the primary and secondary are | ||
running in development mode, the initial root token created by the primary | ||
(including those with custom IDs specified with `-dev-root-token-id`) will be | ||
populated into the secondary upon activation. This allows a developer to keep a | ||
consistent `~/.vault-token` file or `VAULT_TOKEN` environment variable when | ||
working with both clusters. | ||
|
||
On a production system, after a secondary is activated, the enabled | ||
authentication backends should be used to get tokens with appropriate policies, | ||
as policies and auth backend configuration are replicated. | ||
|
||
The generate-root command can also be used to generate a root token local to | ||
the secondary cluster. | ||
|
||
## Managing Vault Replication | ||
|
||
Vault’s replication model is intended to allow horizontally scaling Vault’s | ||
functions rather than to act in a strict Disaster Recovery (DR) capacity. As a | ||
result, Vault replication acts on static items within Vault, meaning | ||
information that is not part of Vault’s lease-tracking system. In a practical | ||
sense, this means that all Vault information is replicated from the primary to | ||
secondaries except for tokens and secret leases. | ||
|
||
Because token information must be checked and possibly rewritten with each use | ||
(e.g. to decrement its use count), replicated tokens would require every call | ||
to be forwarded to the primary, decreasing rather than increasing total Vault | ||
throughput. | ||
|
||
Secret leases are tracked independently for two reasons: one, because every | ||
such lease is tied to a token and tokens are local to each cluster; and two, | ||
because tracking large numbers of leases is memory-intensive and tracking all | ||
leases in a replicated fashion could dramatically increase the memory | ||
requirements across all Vault nodes. | ||
|
||
We believe that this replication model provides significant utility and the | ||
benefits of horizontally scaling Vault’s functionality dramatically outweigh | ||
the drawbacks of not providing a full DR-ready system. However, it does mean | ||
that certain principles must be kept in mind. | ||
|
||
### Always Use the Local Cluster | ||
|
||
First and foremost, when designing systems to take advantage of replicated | ||
Vault, you must ensure that they always use the same Vault cluster for all | ||
operations, as only that cluster will know about the client’s Vault token. | ||
|
||
### Enabling a Secondary Wipes Storage | ||
|
||
Replication relies on having a shared keyring between primary and secondaries | ||
and also relies on having a shared understanding of the data store state. As a | ||
result, when replication is enabled, all of the secondary’s existing storage | ||
will be wiped. This is irrevocable. Make a backup first if there is a remote | ||
chance you’ll need some of this data at some future point. | ||
|
||
Generally, activating as a secondary will be the first thing that is done upon | ||
setting up a new cluster for replication. | ||
|
||
### Replicated vs. Local Backend Mounts | ||
|
||
All backend mounts (of all types) that can be enabled within Vault default to | ||
being mounted as a replicated mount. This means that mounts cannot be enabled | ||
on a secondary, and mounts enabled on the primary will replicate to | ||
secondaries. | ||
|
||
Mounts can also be marked local (via the `-local` flag on the Vault CLI or | ||
setting the `local` parameter to `true` in the API). This can only be performed | ||
at mount time; if a mount is local but should have been replicated, or vice | ||
versa, you must unmount the backend and mount a new instance at that path with | ||
the local flag set as intended. | ||
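
For the API route, the `local` parameter travels in the mount request body. The sketch below builds that body without sending it. It assumes the mount endpoint lives at `/v1/sys/mounts/<path>` and takes a `type` field, which this page does not state; the path, the backend type, and the helper name `mount_request` are hypothetical.

```python
import json


def mount_request(path, backend_type, local=False):
    """Return (url_path, json_body) for mounting a backend; nothing is sent."""
    # `local` can only be chosen at mount time, so it is always set explicitly.
    body = {"type": backend_type, "local": bool(local)}
    return "/v1/sys/mounts/" + path.strip("/"), json.dumps(body, sort_keys=True)


url_path, body = mount_request("team-secrets", "generic", local=True)
```

Setting the field explicitly in both cases makes the choice between a local and a replicated mount visible in review.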
|
||
Local mounts do not propagate data from the primary to secondaries, and local | ||
mounts on secondaries do not have their data removed during the syncing | ||
process. The exception is during initial bootstrapping of a secondary from a | ||
state where replication is disabled; all data, including local mounts, is | ||
deleted at this time (as the encryption keys will have changed so data in local | ||
mounts would be unable to be read). | ||
|
||
### Audit Backends | ||
|
||
In normal Vault usage, if Vault has at least one audit backend configured and | ||
is unable to successfully log to at least one backend, it will block further | ||
requests. | ||
|
||
Replicated audit mounts must be able to successfully log on all replicated | ||
clusters. For example, if using the file backend, the configured path must be | ||
able to be written to by all secondaries. It may be useful to use at least one | ||
local audit mount on each cluster to prevent such a scenario. | ||
|
||
### Never Have Two Primaries | ||
|
||
The replication model is not designed for active-active usage and enabling two | ||
primaries should never be done, as it can lead to data loss if they or their | ||
secondaries are ever reconnected. | ||
|
||
### Disaster Recovery | ||
|
||
At the moment, because leases and tokens are not replicated, if you need true | ||
DR, you will need a DR solution per cluster (similar to non-replicated Vault). | ||
|
||
Local backend mounts are not replicated and their use will require existing DR | ||
mechanisms if DR is necessary in your implementation. | ||
|
||
We may pursue a dedicated Disaster Recovery-focused Replication Mode at a | ||
future time. | ||
|
||
|
206 changes: 206 additions & 0 deletions
website/source/docs/http/sys-replication-primary.html.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,206 @@ | ||
--- | ||
layout: "http" | ||
page_title: "HTTP API: /sys/replication/primary" | ||
sidebar_current: "docs-http-replication-primary" | ||
description: |- | ||
The '/sys/replication/primary' endpoint focuses on managing replication behavior for a primary cluster, including management of secondaries. | ||
--- | ||
|
||
# /sys/replication/primary/enable | ||
|
||
## POST | ||
|
||
<dl> | ||
<dt>Description</dt> | ||
<dd> | ||
Enables replication in primary mode. This is used when replication is | ||
currently disabled on the cluster (if the cluster is already a secondary, | ||
it must be promoted). | ||
|
||
Caution: only one primary should be active at a given time. Multiple | ||
primaries may result in data loss! | ||
|
||
</dd> | ||
|
||
<dt>Method</dt> | ||
<dd>POST</dd> | ||
|
||
<dt>URL</dt> | ||
<dd>`/sys/replication/primary/enable`</dd> | ||
|
||
<dt>Parameters</dt> | ||
<dd> | ||
<ul> | ||
<li> | ||
<span class="param">primary_cluster_addr</span> | ||
<span class="param-flags">optional</span> | ||
Can be used to override the cluster address that the primary gives to | ||
secondary nodes. Useful if the primary’s cluster address is not | ||
directly accessible and must be accessed via an alternate path/address, | ||
such as through a TCP-based load balancer. | ||
</li> | ||
</ul> | ||
</dd> | ||
|
||
<dt>Returns</dt> | ||
<dd> | ||
`200` response code with a warning. | ||
</dd> | ||
</dl> | ||
|
||
# /sys/replication/primary/demote | ||
|
||
## POST | ||
|
||
<dl> | ||
<dt>Description</dt> | ||
<dd> | ||
Demotes a primary cluster to a secondary. This secondary cluster will not | ||
attempt to connect to a primary (see the update-primary call), but will | ||
maintain knowledge of its cluster ID and can be reconnected to the same | ||
replication set without wiping local storage. | ||
</dd> | ||
|
||
<dt>Method</dt> | ||
<dd>POST</dd> | ||
|
||
<dt>URL</dt> | ||
<dd>`/sys/replication/primary/demote`</dd> | ||
|
||
<dt>Parameters</dt> | ||
<dd> | ||
None | ||
</dd> | ||
|
||
<dt>Returns</dt> | ||
<dd> | ||
`200` response code with a warning. | ||
</dd> | ||
</dl> | ||
|
||
|
||
# /sys/replication/primary/disable | ||
|
||
## POST | ||
|
||
<dl> | ||
<dt>Description</dt> | ||
<dd> | ||
Disable replication entirely on the cluster. Any secondaries will no longer | ||
be able to connect. Caution: re-enabling this node as a primary or secondary | ||
will change its cluster ID; in the secondary case this means a wipe of the | ||
underlying storage when connected to a primary, and in the primary case, | ||
secondaries connecting back to the cluster (even if they have connected | ||
before) will require a wipe of the underlying storage. | ||
</dd> | ||
|
||
<dt>Method</dt> | ||
<dd>POST</dd> | ||
|
||
<dt>URL</dt> | ||
<dd>`/sys/replication/primary/disable`</dd> | ||
|
||
<dt>Parameters</dt> | ||
<dd> | ||
None | ||
</dd> | ||
|
||
<dt>Returns</dt> | ||
<dd> | ||
`200` response code with a warning. | ||
</dd> | ||
</dl> | ||
|
||
# /sys/replication/primary/secondary-token | ||
|
||
## GET | ||
|
||
<dl> | ||
<dt>Description</dt> | ||
<dd> | ||
Requires ‘sudo’ capability. Generate a secondary activation token for the | ||
cluster with the given opaque identifier, which must be unique. This | ||
identifier can later be used to revoke a secondary's access. | ||
</dd> | ||
|
||
<dt>Method</dt> | ||
<dd>GET</dd> | ||
|
||
<dt>URL</dt> | ||
<dd>`/sys/replication/primary/secondary-token`</dd> | ||
|
||
<dt>Parameters</dt> | ||
<dd> | ||
<ul> | ||
<li> | ||
<span class="param">id</span> | ||
<span class="param-flags">required</span> | ||
An opaque identifier, e.g. ‘us-east’ | ||
</li> | ||
<li> | ||
<span class="param">ttl</span> | ||
<span class="param-flags">optional</span> | ||
The TTL for the secondary activation token. Defaults to `"30m"`. | ||
</li> | ||
</ul> | ||
</dd> | ||
|
||
<dt>Returns</dt> | ||
<dd> | ||
|
||
```javascript | ||
{ | ||
"request_id": "", | ||
"lease_id": "", | ||
"lease_duration": 0, | ||
"renewable": false, | ||
"data": null, | ||
"warnings": null, | ||
"wrap_info": { | ||
"token": "fb79b9d3-d94e-9eb6-4919-c559311133d6", | ||
"ttl": 300, | ||
"creation_time": "2016-09-28T14:41:00.56961496-04:00", | ||
"wrapped_accessor": "" | ||
} | ||
} | ||
``` | ||
|
||
</dd> | ||
</dl> | ||
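
The activation token sits under `wrap_info.token` in the response shown above. The following sketch pulls it out of a response body; it reuses the sample JSON from this section, and the helper name `extract_activation_token` is hypothetical.

```python
import json

# Sample body copied from the "Returns" example above.
SAMPLE_RESPONSE = """
{
  "request_id": "",
  "lease_id": "",
  "lease_duration": 0,
  "renewable": false,
  "data": null,
  "warnings": null,
  "wrap_info": {
    "token": "fb79b9d3-d94e-9eb6-4919-c559311133d6",
    "ttl": 300,
    "creation_time": "2016-09-28T14:41:00.56961496-04:00",
    "wrapped_accessor": ""
  }
}
"""


def extract_activation_token(body):
    """Return (token, ttl) from a secondary-token response body."""
    wrap_info = json.loads(body).get("wrap_info")
    if not wrap_info or not wrap_info.get("token"):
        raise ValueError("response has no wrap_info.token")
    return wrap_info["token"], wrap_info.get("ttl")


token, ttl = extract_activation_token(SAMPLE_RESPONSE)
```

Raising on a missing `wrap_info` keeps a malformed or unwrapped response from being passed on to a secondary as if it were a token.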
|
||
# /sys/replication/primary/revoke-secondary | ||
|
||
## POST | ||
|
||
<dl> | ||
<dt>Description</dt> | ||
<dd> | ||
Revoke a secondary’s ability to connect to the primary cluster; the | ||
secondary will immediately be disconnected and will not be allowed to | ||
connect again unless given a new activation token. | ||
</dd> | ||
|
||
<dt>Method</dt> | ||
<dd>POST</dd> | ||
|
||
<dt>URL</dt> | ||
<dd>`/sys/replication/primary/revoke-secondary`</dd> | ||
|
||
<dt>Parameters</dt> | ||
<dd> | ||
<ul> | ||
<li> | ||
<span class="param">id</span> | ||
<span class="param-flags">required</span> | ||
The identifier used when fetching the secondary token. | ||
</li> | ||
</ul> | ||
</dd> | ||
|
||
<dt>Returns</dt> | ||
<dd> | ||
`200` response code with a warning. | ||
</dd> | ||
</dl> | ||
|
||
|