minor changes to architecture docs (#4329)
schoudhury authored Apr 28, 2020
1 parent 1cb5dbb commit 780dfe7
Showing 19 changed files with 130 additions and 130 deletions.
20 changes: 10 additions & 10 deletions docs/content/latest/architecture/_index.md
@@ -2,9 +2,9 @@
title: Architecture
headerTitle: Architecture
linkTitle: Architecture
description: Learn about the YugabyteDB architecture, including query, sharding, replication, transactions, and storage layers.
description: Learn about the YugabyteDB architecture, including query, transactions, sharding, replication, and storage layers.
image: /images/section_icons/index/architecture.png
headcontent: YugabyteDB architecture including the query, sharding, replication, transactions, and storage layers.
headcontent: YugabyteDB architecture including the query, transactions, sharding, replication, and storage layers.
section: CONCEPTS
menu:
latest:
@@ -30,7 +30,7 @@ menu:
<a class="section-link icon-offset" href="concepts/">
<div class="head">
<img class="icon" src="/images/section_icons/architecture/concepts.png" aria-hidden="true" />
<div class="articles">4 articles</div>
<div class="articles">3 articles</div>
<div class="title">Key concepts</div>
</div>
<div class="body">
@@ -81,8 +81,8 @@ menu:
<a class="section-link icon-offset" href="transactions/">
<div class="head">
<img class="icon" src="/images/section_icons/architecture/distributed_acid.png" aria-hidden="true" />
<div class="articles">4 articles</div>
<div class="title">Transactions layer</div>
<div class="articles">6 articles</div>
<div class="title">DocDB transactions layer</div>
</div>
<div class="body">
Review transaction model and how transactional consistency is ensured at various isolation levels.
@@ -94,7 +94,7 @@ menu:
<a class="section-link icon-offset" href="docdb-sharding/">
<div class="head">
<img class="icon" src="/images/section_icons/architecture/distributed_acid.png" aria-hidden="true" />
<div class="articles">4 articles</div>
<div class="articles">3 articles</div>
<div class="title">DocDB sharding layer</div>
</div>
<div class="body">
@@ -107,7 +107,7 @@ menu:
<a class="section-link icon-offset" href="docdb-replication/">
<div class="head">
<img class="icon" src="/images/section_icons/architecture/distributed_acid.png" aria-hidden="true" />
<div class="articles">4 articles</div>
<div class="articles">3 articles</div>
<div class="title">DocDB replication layer</div>
</div>
<div class="body">
@@ -120,11 +120,11 @@ menu:
<a class="section-link icon-offset" href="docdb/">
<div class="head">
<img class="icon" src="/images/section_icons/architecture/distributed_acid.png" aria-hidden="true" />
<div class="articles">5 articles</div>
<div class="title">DocDB store</div>
<div class="articles">3 articles</div>
<div class="title">DocDB storage layer</div>
</div>
<div class="body">
Sharding, replication, persistence, performance, and colocated tables.
How persistent storage works in DocDB.
</div>
</a>
</div>
33 changes: 17 additions & 16 deletions docs/content/latest/architecture/design-goals.md
@@ -25,43 +25,40 @@ YugabyteDB offers strong consistency guarantees in the face of a variety of fail

In terms of the [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem), YugabyteDB is a CP database (consistent and partition tolerant) that achieves very high availability. The architectural design of YugabyteDB is similar to Google Cloud Spanner, which is also a CP system. The description of [Spanner](https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Spanner-and-the-CAP-Theorem.html) applies equally to YugabyteDB. The key takeaway is that no system provides 100% availability, so the pragmatic question is whether the system delivers availability so high that most users no longer have to be concerned about outages. For example, given that an application has many sources of outages, if YugabyteDB is an insignificant contributor to its downtime, then users are right not to worry about it.

### Single-key linearizability
### Single-row linearizability

YugabyteDB supports single-key linearizable writes. Linearizability is one of the strongest single-key consistency models, and implies that every operation appears to take place atomically and in some total linear order that is consistent with the real-time ordering of those operations. In other words, the following should be true of operations on a single key:
YugabyteDB supports single-row linearizable writes. Linearizability is one of the strongest single-row consistency models, and implies that every operation appears to take place atomically and in some total linear order that is consistent with the real-time ordering of those operations. In other words, the following should be true of operations on a single row:

- Operations can execute concurrently, but the state of the database at any point in time must appear to be the result of some totally ordered, sequential execution of operations.
- If operation A completes before operation B begins, then B should logically take effect after A. For example, a read that begins after a write has completed must observe the effect of that write.

### Multi-key ACID transactions
### Multi-row ACID transactions

YugabyteDB supports multi-key transactions with both Serializable and Snapshot isolation.
YugabyteDB supports multi-row transactions with both Serializable and Snapshot isolation.

- The [YSQL](../../api/ysql/) API supports both Serializable and Snapshot Isolation using the PostgreSQL isolation level syntax of `SERIALIZABLE` and `REPEATABLE READS` respectively. Note that YSQL Serializable support was added in [v1.2.6](../../releases/v1.2.6/).
- The [YSQL](../../api/ysql/) API supports both Serializable and Snapshot Isolation using the PostgreSQL isolation level syntax of `SERIALIZABLE` and `REPEATABLE READ` (default) respectively, as shown in the sketch below.
- The [YCQL](../../api/ycql/dml_transaction/) API supports only Snapshot Isolation using the `BEGIN TRANSACTION` syntax.
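
A minimal sketch of the two syntaxes, assuming a hypothetical `accounts` table (the YCQL variant is shown in comments because it uses its own dialect and requires a table created with transactions enabled):

```
-- YSQL: a multi-statement transaction at Serializable isolation (PostgreSQL syntax).
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

-- YCQL: the equivalent Snapshot-isolated transaction block would be:
-- BEGIN TRANSACTION
--   UPDATE accounts SET balance = balance - 100 WHERE id = 1;
--   UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- END TRANSACTION;
```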

{{< tip title="Read more about consistency" >}}

- Achieving [consistency with Raft consensus](../docdb/replication/).
- How [fault tolerance and high availability](../core-functions/high-availability/) are achieved.
- [Single-key linearizable transactions](../transactions/single-row-transactions/) in YugabyteDB.
- [Single-row linearizable transactions](../transactions/single-row-transactions/) in YugabyteDB.
- The architecture of [distributed transactions](../transactions/single-row-transactions/).

{{< /tip >}}

## Query APIs

YugabyteDB does not reinvent storage APIs. It is wire-compatible with existing APIs and extends functionality. The following APIs are supported:
YugabyteDB does not reinvent data client APIs. It is wire-compatible with existing APIs and extends functionality. The following APIs are supported.

- **YSQL** — ANSI-SQL compliant and is wire-compatible with PostgreSQL
- **YCQL** (or the *Yugabyte Cloud Query Language*) which is a semi-relational API with Cassandra roots
### YSQL

### Distributed SQL

The YSQL API is PostgreSQL-compatible as noted before. It reuses PostgreSQL code base.
[YSQL](../../api/ysql/) is a fully-relational SQL API that is wire compatible with the SQL language in PostgreSQL. It is best fit for RDBMS workloads that need horizontal write scalability and global data distribution while also using relational modeling features such as JOINs, distributed transactions, and referential integrity (foreign keys). Note that YSQL [reuses the native query layer](https://blog.yugabyte.com/why-we-built-yugabytedb-by-reusing-the-postgresql-query-layer/) of the PostgreSQL open source project.

- New changes do not break existing PostgreSQL functionality

- Designed with migrations to newer PostgreSQL versions over time as an explicit goal. This means that new features are implemented in a modular fashion in the YugabyteDB codebase to enable rapidly integrating with new PostgreSQL features in an on-going fashion.
- Designed with migrations to newer PostgreSQL versions over time as an explicit goal. This means that new features are implemented in a modular fashion in the YugabyteDB codebase to enable rapid integration with new PostgreSQL features as an ongoing process.

- Supports a wide range of SQL functionality (see the sketch after this list):
  - All data types
@@ -74,6 +71,10 @@ The YSQL API is PostgreSQL-compatible as noted before. It reuses PostgreSQL code
  - Stored procedures
  - Triggers
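
The following minimal sketch, using hypothetical `authors` and `books` tables, exercises a few of the features listed above (constraints, referential integrity, and a JOIN):

```
-- Hypothetical schema: foreign keys enforce referential integrity across tables.
CREATE TABLE authors (
    id   INT PRIMARY KEY,
    name TEXT NOT NULL
);

CREATE TABLE books (
    id        INT PRIMARY KEY,
    author_id INT NOT NULL REFERENCES authors (id),
    title     TEXT NOT NULL
);

INSERT INTO authors VALUES (1, 'William Shakespeare');
INSERT INTO books VALUES (10, 1, 'Macbeth');

-- A standard SQL JOIN, regardless of which tablets the rows live on.
SELECT a.name, b.title
FROM authors a
JOIN books b ON b.author_id = a.id;
```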

### YCQL

[YCQL](../../api/ycql/) is a SQL-based, flexible-schema API that is best fit for internet-scale OLTP applications needing a semi-relational API highly optimized for write-intensive workloads as well as blazing-fast queries. It supports distributed transactions, strongly consistent secondary indexes, and a native JSONB column type. YCQL has its roots in the Cassandra Query Language.
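
A short YCQL sketch with a hypothetical keyspace and table, showing the flexible-schema style, the native `JSONB` column type, and per-table distributed transactions:

```
-- YCQL (Cassandra-derived syntax); keyspace and table names are illustrative.
CREATE KEYSPACE IF NOT EXISTS store;

CREATE TABLE store.books (
    id      INT PRIMARY KEY,
    details JSONB
) WITH transactions = { 'enabled' : true };

INSERT INTO store.books (id, details)
VALUES (1, '{"title": "Macbeth", "author": {"first_name": "William"}}');

-- JSONB attributes are addressable with the -> and ->> operators.
SELECT * FROM store.books
WHERE details->'author'->>'first_name' = 'William';
```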

{{< tip title="Read more" >}}

Understanding [the design of the query layer](../query-layer/overview/).
@@ -93,9 +94,9 @@ Written in C++ to ensure high performance and the ability to leverage large memo
Achieving [high performance in YugabyteDB](../docdb/performance/).
{{< /tip >}}

## Geo-distributed
## Geo-distributed deployments

### Multi-region deployments
### Multi-region configurations

YugabyteDB should work well in deployments where the nodes of the cluster span:

@@ -109,7 +110,7 @@ In order to achieve this, a number of features would be required. For example, c
- Cluster-aware, with the ability to handle node failures seamlessly
- Topology-aware, with the ability to route traffic seamlessly

## Cloud native
## Cloud native architecture

YugabyteDB is a cloud-native database. It has been designed with the following cloud-native principles in mind:

21 changes: 10 additions & 11 deletions docs/content/latest/architecture/docdb-replication/_index.md
@@ -6,27 +6,27 @@ description: Learn about the YugabyteDB distributed document store that is respo
image: /images/section_icons/architecture/concepts.png
aliases:
- /latest/architecture/docdb/replication/
headcontent: YugabyteDB distributed document store responsible for sharding, replication, transactions, and persistence.
headcontent: DocDB is YugabyteDB's distributed document store responsible for transactions, sharding, replication, and persistence.
menu:
latest:
identifier: architecture-docdb-replication
parent: architecture
weight: 1135
---

This section describes how replication works in DocDB. The data in a DocDB table is split into tablets. By default, each tablet is synchronously replicated using the Raft algorithm across various nodes or fault domains (such as availability zones/racks/regions/cloud providers).
{{< note title="Note" >}}

* YugabyteDB's synchronous replication architecture is inspired by <a href="https://research.google.com/archive/spanner-osdi2012.pdf">Google Spanner</a>.
* YugabyteDB's asynchronous replication architecture is inspired by RDBMS databases such as Oracle, MySQL and PostgreSQL.

There are other advanced replication features in YugabyteDB. These include two forms of asynchronous replication of data:
* **xCluster Replication**: Data is asynchronously replicated between different YugabyteDB clusters, supporting both unidirectional (master-slave) and bidirectional replication across two clusters.
* **Read replicas**: The in-cluster asynchronous replicas are called read replicas.
{{</note >}}

{{< note title="Note" >}}
This section describes how replication works in DocDB. The data in a DocDB table is split into tablets. By default, each tablet is synchronously replicated using the Raft algorithm across various nodes or fault domains (such as availability zones/racks/regions/cloud providers).

* Synchronous replication in YugabyteDB is inspired by <a href="https://research.google.com/archive/spanner-osdi2012.pdf">Google Spanner</a>.
* Asynchronous replication in YugabyteDB is inspired by RDBMS databases such as Oracle, MySQL and PostgreSQL.
There are other advanced replication features in YugabyteDB. These include two forms of asynchronous replication of data:
* **xCluster replication**: Data is asynchronously replicated between different YugabyteDB clusters, supporting both unidirectional (master-slave) and bidirectional replication across two clusters.
* **Read replicas**: The in-cluster asynchronous replicas are called read replicas.

{{</note >}}

<div class="row">

@@ -37,12 +37,11 @@ There are other advanced replication features in YugabyteDB. These include two f
<div class="title">Default replication</div>
</div>
<div class="body">
Replicating the data in every table with Raft consensus.
In-cluster synchronous replication with Raft consensus.
</div>
</a>
</div>


<div class="col-12 col-md-6 col-lg-12 col-xl-6">
<a class="section-link icon-offset" href="xcluster-replication/">
<div class="head">
@@ -1,7 +1,7 @@
---
title: xCluster Replication
headerTitle: xCluster Replication
linkTitle: xCluster Replication
title: xCluster replication
headerTitle: xCluster replication
linkTitle: xCluster replication
description: Asynchronous replication between multiple YugabyteDB clusters.
aliases:
- /latest/architecture/docdb/2dc-deployments/
@@ -16,8 +16,7 @@ isTocNested: true
showAsideToc: true
---

DocDB replicates data in order to survive failures while continuing to maintain consistency of data and without requiring operator intervention.
DocDB automatically replicates data synchronously in order to survive failures while maintaining data consistency and avoiding operator intervention. It does so using the Raft distributed consensus protocol.
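
A useful rule of thumb for Raft-style majority replication: a tablet remains available as long as a majority of its peers are healthy, so for a replication factor `RF`:

```
failures_tolerated = floor((RF - 1) / 2)
```

For example, `RF=3` tolerates one failed tablet-peer and `RF=5` tolerates two.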

## Overview

@@ -49,7 +48,6 @@ Replication of data in DocDB is achieved at the level of tablets, using **tablet

Each tablet comprises a set of tablet-peers, each of which stores one copy of the data belonging to the tablet. There are as many tablet-peers for a tablet as the replication factor, and they form a Raft group. The tablet-peers are hosted on different nodes to allow data redundancy on node failures. Note that the replication of data between the tablet-peers is **strongly consistent**.


The figure below illustrates three tablet-peers that belong to a tablet (tablet 1). The tablet-peers are hosted on different YB-TServers and form a Raft group for leader election, failure detection and replication of the write-ahead logs.

![raft_replication](/images/architecture/raft_replication.png)
24 changes: 19 additions & 5 deletions docs/content/latest/architecture/docdb-sharding/_index.md
@@ -6,24 +6,26 @@ description: Learn about the YugabyteDB distributed document store that is respo
image: /images/section_icons/architecture/concepts.png
aliases:
- /latest/architecture/docdb/sharding
headcontent: How automatic sharding of data works in YugabyteDB.
headcontent: DocDB is YugabyteDB's distributed document store responsible for transactions, sharding, replication, and persistence.
menu:
latest:
identifier: architecture-docdb-sharding
parent: architecture
weight: 1130
---

{{< note title="Note" >}}

YugabyteDB's sharding architecture is inspired by <a href="https://research.google.com/archive/spanner-osdi2012.pdf">Google Spanner</a>.

{{</note >}}

This section describes how sharding works in DocDB. A distributed SQL database needs to automatically partition the data in a table and distribute it across nodes. This is known as data sharding and it can be achieved through different strategies, each with its own tradeoffs.

Data sharding helps in scalability and geo-distribution by horizontally partitioning data. A SQL table is decomposed into multiple sets of rows according to a specific sharding strategy. Each of these sets of rows is called a shard. These shards are distributed across multiple server nodes (containers, VMs, bare-metal) in a shared-nothing architecture. This ensures that the shards do not get bottlenecked by the compute, storage and networking resources available at a single node. High availability is achieved by replicating each shard across multiple nodes. However, the application interacts with a SQL table as one logical unit and remains agnostic to the physical placement of the shards.

DocDB supports range and hash sharding natively.
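
At the schema level (YSQL syntax, illustrative tables), the sharding strategy is chosen per primary key: `HASH` distributes rows evenly across tablets, while `ASC`/`DESC` keeps rows in sort order for efficient range scans.

```
-- Hash sharding: rows are distributed by a hash of user_id.
CREATE TABLE users (
    user_id BIGINT,
    PRIMARY KEY (user_id HASH)
);

-- Range sharding: rows are stored in (event_ts, event_id) sort order.
CREATE TABLE events (
    event_ts TIMESTAMP,
    event_id BIGINT,
    PRIMARY KEY (event_ts ASC, event_id ASC)
);
```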

{{< note title="Note" >}}
Read more about the [tradeoffs in the various sharding strategies considered](https://blog.yugabyte.com/four-data-sharding-strategies-we-analyzed-in-building-a-distributed-sql-database/).
{{</note >}}


<div class="row">
<div class="col-12 col-md-6 col-lg-12 col-xl-6">
@@ -50,4 +52,16 @@
</a>
</div>

<div class="col-12 col-md-6 col-lg-12 col-xl-6">
<a class="section-link icon-offset" href="colocated-tables/">
<div class="head">
<img class="icon" src="/images/section_icons/explore/linear_scalability.png" aria-hidden="true" />
<div class="title">Colocated tables</div>
</div>
<div class="body">
Scaling the number of relations by colocating data.
</div>
</a>
</div>
</div>
@@ -8,8 +8,8 @@ aliases:
menu:
latest:
identifier: docdb-colocated-tables
parent: docdb
weight: 1200
parent: architecture-docdb-sharding
weight: 1144
isTocNested: false
showAsideToc: true
---
6 changes: 3 additions & 3 deletions docs/content/latest/architecture/docdb-sharding/sharding.md
@@ -1,7 +1,7 @@
---
title: Hash and range sharding
headerTitle: Hash and range sharding
linkTitle: Hash and range sharding
title: Hash & range sharding
headerTitle: Hash & range sharding
linkTitle: Hash & range sharding
description: Learn how YugabyteDB uses hash and range sharding for horizontal scaling.
aliases:
- /latest/architecture/docdb/sharding/
@@ -1,7 +1,7 @@
---
title: Tablet Splitting
headerTitle: Tablet Splitting
linkTitle: Tablet Splitting
title: Tablet splitting
headerTitle: Tablet splitting
linkTitle: Tablet splitting
description: Learn how YugabyteDB splits tablets.
menu:
latest:
@@ -16,16 +16,18 @@ Tablet splitting enables changing the number of tablets at runtime. This enables

## Overview

There are a number of scenarios where this is useful:
There are a number of scenarios where this is useful.

### Range Scans
### Range scans
In use-cases that scan a range of data, the data is stored in the natural sort order (also known as range-sharding). In these usage patterns, it is often impossible to predict a good split boundary ahead of time. For example:

```
CREATE TABLE census_stats (
    age INTEGER,
    user_id INTEGER,
    ...
);
```
In the table above, it is not possible for the database to infer the range of values for `age` (typically in the 1 to 100 range). It is also impossible to predict the distribution of rows in the table, that is, how many `user_id` rows will be inserted for each value of `age`, to make an evenly distributed data split. This makes it hard to pick good split points ahead of time.
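
To make the difficulty concrete: with range sharding, explicit split points would have to be guessed up front. A hypothetical presplit of the table above might look like the following sketch (the `SPLIT AT VALUES` clause and the chosen boundaries are assumptions for illustration), and any such guess can turn out badly skewed:

```
CREATE TABLE census_stats (
    age     INTEGER,
    user_id INTEGER,
    PRIMARY KEY (age ASC, user_id ASC)
) SPLIT AT VALUES ((25), (50), (75));
```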

### Low-cardinality primary keys
@@ -35,7 +37,7 @@ In use-cases with a low-cardinality of the primary keys (or the secondary index)
This feature is also useful for use-cases where tables begin small, and therefore start with only a few shards. If these tables grow very large, nodes are continuously added to the cluster, and we may reach a scenario where the number of nodes exceeds the number of tablets. Such cases require tablet splitting to effectively re-balance the cluster.


## Ways to split tablets
## Approaches to tablet splitting

DocDB allows the following mechanisms to re-shard data by splitting tablets:

@@ -58,7 +60,6 @@ As the name suggests, the table is split at creation time into a desired number

```
num_tablets_in_table = num_tablets_per_node * num_nodes_at_table_creation_time
```
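
A sketch of presplitting a hash-sharded table in YSQL (the table and the tablet count are illustrative):

```
-- Create the table with 16 tablets up front rather than the default per-node count.
CREATE TABLE user_actions (
    user_id   BIGINT,
    action_ts TIMESTAMP,
    PRIMARY KEY (user_id HASH, action_ts ASC)
) SPLIT INTO 16 TABLETS;
```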
{{< note title="Note" >}}
In order to pre-split a table into a desired number of tablets, we need the start and end key for each tablet. This makes pre-splitting slightly different for hash-sharded and range-sharded tables.
@@ -203,3 +204,7 @@ There are a few important things to note here.
Expand Down Expand Up @@ -203,3 +204,7 @@ There are a few important things to note here.

{{</note >}}

## Dynamic splitting

Dynamic splitting is currently a [work-in-progress](https://github.com/yugabyte/yugabyte-db/issues/1004) feature and is not yet available. Design for this feature can be reviewed on [GitHub](https://github.com/yugabyte/yugabyte-db/blob/master/architecture/design/docdb-automatic-tablet-splitting.md).

