Commit 3edc67c

Add DR Arch Options, Multi-Region Deployment, Multi-Region Failover (#309)
Co-authored-by: raj <raj@turbot.com>
1 parent 380054b commit 3edc67c

21 files changed

+902
-5
lines changed

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
1+
---
2+
title: Architecture Options
3+
sidebar_label: Architecture Options
4+
---
5+
6+
# Architecture Options
7+
8+
In this guide, you will:
9+
10+
- Explore architectural considerations for deploying Turbot Guardrails.
11+
- Understand different options available based on organizational risk and availability requirements.
12+
13+
14+
Turbot Guardrails is a comprehensive governance platform that automates discovery, compliance, security, and operational remediation tasks across cloud environments. Due to its critical role as a security and compliance control plane, it's essential to configure Guardrails with high availability and disaster recovery in mind.
15+
16+
This document outlines various architectural options to help you select an approach aligned with your organization's specific high availability (HA) and disaster recovery (DR) needs, based on your risk tolerance and operational requirements.
17+
18+
19+
| Tier   | Account        | Region        | Availability Zone | Availability | RTO  | RPO  | Use Cases                                |
20+
|--------|----------------|---------------|-------------------|--------------|------|------|------------------------------------------|
21+
| Tier 1 | Single-account | Single-region | Single-AZ         | 99%          | 4 Hr | 4 Hr | Development and non-prod environments    |
22+
| Tier 2 | Single-account | Single-region | Multi-AZ          | 99.9%        | 4 Hr | 4 Hr | Production without rapid DR requirements |
23+
| Tier 3 | Single-account | Multi-region  | Multi-AZ          | 99.9%        | 2 Hr | 2 Hr | Production requiring rapid DR            |
24+
| Tier 4 | Multi-account  | Multi-region  | Multi-AZ          | 99.99%       | 0 Hr | 0 Hr | Mandated zero downtime DR                |
25+
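To make the availability column more concrete, the percentages can be converted into a yearly downtime budget. The following is a minimal sketch, not part of the original guide: the function name and the 365-day year are assumptions made for illustration, and the tier values are copied from the table.

```python
# Availability targets copied from the tier table.
TIER_AVAILABILITY = {
    "Tier 1": 99.0,
    "Tier 2": 99.9,
    "Tier 3": 99.9,
    "Tier 4": 99.99,
}

HOURS_PER_YEAR = 365 * 24  # assumes a 365-day year (leap years ignored)


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the hours of downtime per year that an availability target allows."""
    return HOURS_PER_YEAR * (100.0 - availability_percent) / 100.0


if __name__ == "__main__":
    for tier, pct in TIER_AVAILABILITY.items():
        hours = downtime_hours_per_year(pct)
        print(f"{tier}: {pct}% allows about {hours:.2f} hours of downtime per year")
```

Under these assumptions, 99% corresponds to roughly 87.6 hours per year, 99.9% to roughly 8.76 hours, and 99.99% to under one hour.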
26+
<!-- - **Tier 1** – Single-account, single-region, single availability zone.
27+
28+
- 99% Availability
29+
- RTO: 4 Hr.
30+
- RPO: 4 Hr.
31+
- Use cases: Development and non-prod environments
32+
33+
- **Tier 2** – Single-account, single-region, multi-availability zone.
34+
35+
- 99.9% Availability
36+
- RTO: 4 Hr.
37+
- RPO: 4 Hr.
38+
- Use cases: Production deployments without need for rapid DR
39+
40+
- **Tier 3** – Single-account, multi-region, multi-availability zone.
41+
42+
- 99.9% Availability
43+
- RTO: 2 Hr.
44+
- RPO: 2 Hr.
45+
- Use cases: Production deployments with need for rapid DR
46+
47+
- **Tier 4** – Multi-account, multi-region, multi-availability zone.
48+
- 99.99% Availability
49+
- RTO: 0 Hr.
50+
- RPO: 0 Hr.
51+
- Use cases: Mandated zero downtime DR -->
52+
53+
## Tier 1: Development
54+
55+
**Key Characteristics**: Single-account, single-region, single availability zone.
56+
57+
This deployment option is appropriate for non-production and development workspaces, where high availability and disaster recovery are not important for the accounts monitored by Guardrails.
58+
59+
This is the lowest cost infrastructure deployment option available.
60+
61+
![Tier 1 DR Architecture](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-1.png)
62+
63+
This deployment uses one primary RDS instance without a failover configuration. Recovery can be performed from RDS point-in-time backups.
64+
65+
## Tier 2: High Availability
66+
67+
**Key Characteristics**: Single-account, single-region, multi-availability zone.
68+
69+
This deployment option is appropriate for all production usage. It is the most cost-effective deployment option for production use cases and can achieve a 4-hour RPO/RTO in all circumstances except the loss of an entire AWS Region.
70+
71+
![Tier 2 DR Architecture](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-2.png)
72+
73+
Compared with the **Tier 1** architecture, this deployment makes the following changes:
74+
75+
1. The ECS compute cluster is deployed across multiple availability zones.
76+
2. Lambda functions are deployed across multiple availability zones.
77+
3. An RDS failover instance is deployed in a second availability zone.
78+
4. An ElastiCache failover instance is deployed in a second availability zone.
79+
80+
## Tier 3: Multi-Region
81+
82+
**Key Characteristics**: Single-account, multi-region, multi-availability zone.
83+
84+
This deployment option is appropriate when regulatory requirements demand a multi-region solution, or when requirements call for an RTO/RPO of less than 4 hours. It has the benefit of being resilient to the loss of an entire AWS Region.
85+
86+
![Tier 3 DR Architecture](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-3.png)
87+
88+
The key difference from Tier 2 is that a second Turbot Guardrails deployment is created in the standby region. Its compute cluster is kept dormant, and it receives no inbound events. When a disaster is declared, DNS is changed to send events to the standby region while the database is recovered from a cross-region RDS snapshot. Once the database is recovered, the workspace is enabled and events start processing from the queue.
89+
90+
To use this pattern, [cross-region RDS backups](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReplicateBackups.html) must be configured in this account to ensure the DB can be restored in the target region without access to KMS in the primary region. This option also requires Amazon API Gateway, as well as a public DNS endpoint and SSL certificate, so that inbound real-time events can be redirected between regions.
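As an illustration of the cross-region backup prerequisite, the sketch below wraps the RDS `StartDBInstanceAutomatedBackupsReplication` operation as exposed by a boto3 RDS client. It is not taken from the Guardrails guides: the helper name, the 7-day default retention, and all ARNs and regions in the usage comment are hypothetical placeholders. The client is assumed to be created in the standby (destination) region, with a KMS key that lives in that same region.

```python
def enable_cross_region_backups(rds_client, source_db_arn, kms_key_id, retention_days=7):
    """Ask RDS to replicate automated backups of a DB instance into the client's region.

    Assumptions (not from the original guide):
    - rds_client is a boto3 RDS client created in the standby region.
    - kms_key_id identifies a KMS key in the standby region, so a restore
      does not depend on KMS in the primary region.
    """
    return rds_client.start_db_instance_automated_backups_replication(
        SourceDBInstanceArn=source_db_arn,
        BackupRetentionPeriod=retention_days,
        KmsKeyId=kms_key_id,
    )


# Hypothetical usage (requires boto3 and AWS credentials; values are placeholders):
#
#   import boto3
#   rds = boto3.client("rds", region_name="us-west-2")  # standby region
#   enable_cross_region_backups(
#       rds,
#       source_db_arn="arn:aws:rds:us-east-1:111122223333:db:example-primary",
#       kms_key_id="arn:aws:kms:us-west-2:111122223333:key/example-key-id",
#   )
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before running it against a real account.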
91+
92+
## Tier 4: Multi-Account
93+
94+
**Key Characteristics**: Multi-account, multi-region, multi-availability zone.
95+
96+
The **Tier 4** deployment option should be considered for any organization with zero RTO/RPO requirements. This deployment option allows for instantaneous failover between two active Guardrails environments. The “Change Window” feature of Guardrails is used to prevent one of the environments from executing any enforcements. Upon declaration of an emergency, the standby environment's change window can be removed, allowing that environment to become the primary and enforce changes.
97+
98+
In normal day-to-day operation, both environments consume cloud events and maintain independent CMDB databases. If employed, this pattern doubles both the infrastructure costs and the per-control usage costs for Guardrails.
99+
100+
![Tier 4 DR Architecture](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-4.png)
101+
102+
Care must be taken in this configuration to ensure that policy packs and account onboarding/offboarding are applied to both environments in tandem, using the Guardrails Terraform provider to maintain consistency between the deployments. Custom scripting may be necessary to periodically check that both environments are identical in configuration, to meet your organization's DR requirements.
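The periodic consistency check mentioned above could take roughly the following shape. This is a hedged sketch only: it assumes each environment's configuration has already been exported into a flat dictionary of setting name to value (how that export is produced is out of scope here), and the function name, the `<missing>` marker, and the sample setting names are hypothetical.

```python
MISSING = "<missing>"  # marker for a setting that exists in only one environment


def config_drift(primary: dict, standby: dict) -> dict:
    """Return settings whose values differ between two environments.

    The result maps each differing setting name to a
    (primary_value, standby_value) tuple.
    """
    drift = {}
    for key in sorted(primary.keys() | standby.keys()):
        primary_value = primary.get(key, MISSING)
        standby_value = standby.get(key, MISSING)
        if primary_value != standby_value:
            drift[key] = (primary_value, standby_value)
    return drift


if __name__ == "__main__":
    # Hypothetical exported settings, for illustration only.
    primary = {"example/policyA": "Enforce", "example/policyB": "Check"}
    standby = {"example/policyA": "Enforce", "example/policyB": "Skip"}
    for name, (p, s) in config_drift(primary, standby).items():
        print(f"{name}: primary={p!r} standby={s!r}")
```

An empty result indicates that the two exports match. Any non-empty result is a candidate for reconciliation before the standby environment is relied on for failover.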

docs/guides/hosting-guardrails/disaster-recovery/index.md

Lines changed: 7 additions & 3 deletions
@@ -55,9 +55,13 @@ This section provides detailed step-by-step instructions on how to use DR featur
55 55

56 56
| Guide | Description
57 57
| - | -
58-
| [Hive Restore](guides/hosting-guardrails/disaster-recovery/restore) | Guides to restore a Guardrails database from RDS snapshot.
59-
| [DR Testing](guides/hosting-guardrails/disaster-recovery/dr-testing) | Guides to restore a destroyed workspace.
60-
| [Database Upgrade and Storage Optimization](guides/hosting-guardrails/disaster-recovery/database-upgrade-storage-optimization) | Guides to resize and/or upgrade a database engine version with minimal downtime.
58+
| [Architecture Options](guides/hosting-guardrails/disaster-recovery/architecture-options) | Architecture Options.
59+
| [Hive Restore](guides/hosting-guardrails/disaster-recovery/hive-restore) | Guides to restore a Guardrails database from RDS snapshot.
60+
| [Workspace Restore](guides/hosting-guardrails/disaster-recovery/restore-workspace) | Guides to restore a destroyed workspace.
61+
| [Multi-Region Deployment](guides/hosting-guardrails/disaster-recovery/multi-region-deployment) | Guides to set up a multi-region deployment of Turbot Guardrails using Tier 3 architecture.
62+
| [Multi-Region Failover](guides/hosting-guardrails/disaster-recovery/multi-region-failover) | Guides to set up Disaster Recovery (DR) failover for Turbot Guardrails Multi-Region deployment.
63+
64+
<!-- | [Database Upgrade and Storage Optimization](guides/hosting-guardrails/disaster-recovery/database-upgrade-storage-optimization) | Guides to resize and/or upgrade a database engine version with minimal downtime. -->
61 65

62 66
## Additional Assistance
63 67

