From 70c2eb8c161fb62a64555a971c6e90a78b3bf5fa Mon Sep 17 00:00:00 2001
From: Anthony Yeh
Date: Fri, 23 Sep 2016 10:30:55 -0700
Subject: [PATCH] Clean up docs. (#2079)

Quick pass over docs reachable from vitess.io to remove or update anything that was obviously outdated.
---
 README.md | 51 +++++---------------
 doc/BackupAndRestore.md | 34 ++++++++-----
 doc/ClientLibraries.md | 11 ++---
 doc/Reparenting.md | 8 +---
 doc/Sharding.md | 30 ------------
 doc/UserGuideIntroduction.md | 35 +++++---------
 doc/VitessOverview.md | 12 ++++-
 vitess.io/_config.yml | 2 +-
 vitess.io/_config_dev.yml | 2 +-
 vitess.io/_includes/footer.html | 2 +-
 vitess.io/_layouts/home.liquid | 7 ++-
 vitess.io/images/kubernetes.svg | 84 +++++++++++++++++++++++++++++++++
 12 files changed, 154 insertions(+), 124 deletions(-)
 create mode 100644 vitess.io/images/kubernetes.svg

diff --git a/README.md b/README.md
index bb569d1e5db..5440f57c3d7 100644
--- a/README.md
+++ b/README.md
@@ -4,58 +4,31 @@
 # Vitess
-Vitess is a storage platform for scaling MySQL.
-It is optimized to run as effectively in cloud architectures as it does on dedicated hardware.
-It combines many important features of MySQL with the scalability of a NoSQL database.
+Vitess is a database clustering system for horizontal scaling of MySQL
+through generalized sharding.
-It's been actively developed since 2011, and is currently used as
-a fundamental component of YouTube's MySQL infrastructure, serving thousands of
-QPS per server. If you want to find out whether Vitess is a good fit for your
-project, please visit [vitess.io](http://vitess.io).
+By encapsulating shard-routing logic, Vitess allows application code and
+database queries to remain agnostic to the distribution of data onto
+multiple shards. With Vitess, you can even split and merge shards as your needs
+grow, with an atomic cutover step that takes only a few seconds.
-There are a couple of videos from [sougou](https://github.com/sougou) that you can watch:
-a [short intro](http://youtu.be/midJ6b1LkA0) prepared for Google I/O 2014
-and a more [detailed presentation from @Scale '14](http://youtu.be/5yDO-tmIoXY).
+Vitess has been a core component of YouTube's database infrastructure
+since 2011, and has grown to encompass tens of thousands of MySQL nodes.
-## Documentation
-
-### Intro
-
- * [Helicopter overview](http://vitess.io):
- high level overview of Vitess that should tell you whether Vitess is for you.
- * [Sharding in Vitess](http://vitess.io/user-guide/sharding.html)
-
-### Using Vitess
-
- * Getting Started
- * [On Kubernetes](http://vitess.io/getting-started/).
- * [From the ground up](http://vitess.io/getting-started/local-instance.html).
- * [Architecture](http://vitess.io/overview/#architecture):
- all Vitess tools and servers.
- * [Reparenting](http://vitess.io/doc/Reparenting):
- performing master failover.
- * [Resharding](http://vitess.io/user-guide/sharding.html#resharding):
- adding more shards to your cluster.
- * [Schema management](http://vitess.io/doc/SchemaManagement):
- managing your database schema using Vitess.
-
-### Reference
-
- * [General Concepts](http://vitess.io/overview/concepts.html)
- * [Topology Service](http://vitess.io/doc/TopologyService)
- * [VTGate V3](http://vitess.io/doc/VTGateV3Features/)
+For more about Vitess, please visit [vitess.io](http://vitess.io).
 ## Contact
 Ask questions in the [vitess@googlegroups.com](https://groups.google.com/forum/#!forum/vitess)
-discussion forum.
+discussion forum or on [Gitter](https://gitter.im/youtube/vitess).
 Subscribe to [vitess-announce@googlegroups.com](https://groups.google.com/forum/#!forum/vitess-announce)
+or the [Vitess Blog](http://blog.vitess.io/)
 for low-frequency updates like new features and releases.
 ## License
-Unless otherwise noted, the vitess source files are distributed
+Unless otherwise noted, the Vitess source files are distributed
 under the BSD-style license found in the LICENSE file.
diff --git a/doc/BackupAndRestore.md b/doc/BackupAndRestore.md
index 265058f1c03..27af187fcc0 100644
--- a/doc/BackupAndRestore.md
+++ b/doc/BackupAndRestore.md
@@ -6,12 +6,15 @@ Vitess. Vitess uses backups for two purposes:
 ## Prerequisites
-Vitess stores data backups on a Backup Storage service. Currently,
-Vitess supports backups to either [Google Cloud Storage](https://cloud.google.com/storage/)
-or any network-mounted drive (such as NFS). The core Vitess software's
-[BackupStorage interface](https://github.com/youtube/vitess/blob/master/go/vt/mysqlctl/backupstorage/interface.go)
-defines methods for creating, listing, and removing backups. Plugins for other
-storage services just need to implement the interface.
+Vitess stores data backups on a Backup Storage service, which is
+a [pluggable interface](https://github.com/youtube/vitess/blob/master/go/vt/mysqlctl/backupstorage/interface.go).
+
+Currently, we have plugins for:
+
+* A network-mounted path (e.g. NFS)
+* Google Cloud Storage
+* Amazon S3
+* Ceph
 Before you can back up or restore a tablet, you need to ensure that the tablet is aware of the Backup Storage system that you are using. To do so,
@@ -30,9 +33,10 @@ access to the location where you are storing backups.
 Specifies the implementation of the Backup Storage interface to use.

 Current plugin options available are:
@@ -40,14 +44,18 @@ access to the location where you are storing backups.
 -file_backup_storage_root
 For the file plugin, this identifies the root directory for backups.
-
- -gcs_backup_storage_project
- For the gcs plugin, this identifies the project to use.
-
 -gcs_backup_storage_bucket
 For the gcs plugin, this identifies the bucket to use.
+
+ -s3_backup_aws_region
+ For the s3 plugin, this identifies the AWS region.
+
+
+ -s3_backup_storage_bucket
+ For the s3 plugin, this identifies the AWS S3 bucket.
+
 -ceph_backup_storage_config
 For the ceph plugin, this identifies the path to a text file with a JSON object as configuration. The JSON object requires the following keys: accessKey, secretKey, endPoint and useSSL. Bucket name is computed from keyspace name and is separate for different keyspaces.
diff --git a/doc/ClientLibraries.md b/doc/ClientLibraries.md
index 5d2b0fe46ae..993e3133748 100644
--- a/doc/ClientLibraries.md
+++ b/doc/ClientLibraries.md
@@ -47,14 +47,11 @@ Vitess client libraries follow these core principles:
 * Each client library should support language-specific, idiomatic constructs to simplify application development in that language.
 * Client libraries should integrate with the following language-specific
- database drivers, though this support is not yet provided in some cases:
- * Go: [database/sql package](http://golang.org/pkg/database/sql/) (done)
+ database drivers. For example, current languages implement the following:
+ * Go: [database/sql package](http://golang.org/pkg/database/sql/)
 * Java: [JDBC](https://docs.oracle.com/javase/tutorial/jdbc/index.html)
- compliance (in progress)
 * PHP: [PHP Data Objects \(PDO\)](http://php.net/manual/en/intro.pdo.php)
- compliance (in progress)
- * Python: [DB API](https://www.python.org/dev/peps/pep-0249/) compliance
- (done)
+ * Python: [PEP 0249 DB API](https://www.python.org/dev/peps/pep-0249/)
 * Libraries provide a thin wrapper around the proto3 service definitions. Those wrappers could be extended with adapters to higher level libraries like SQLAlchemy (Python) or JDBC (Java), with other object-based helper
@@ -86,7 +83,7 @@ Alternatively, you can set the
 [command line flag "vtgate_protocol"](https://github.com/youtube/vitess/blob/ff800b2a1801f0bb8b0c29a701d9c0988bf827e2/go/vt/vtgate/vtgateconn/vtgateconn.go#L27)
 to "grpc".
-The Go client interface has multiple Execute* methods for different use-cases
+The Go client interface has multiple `Execute*()` methods for different use-cases
 and sharding configurations. When you start off with an unsharded database, we recommend to use the
 [ExecuteShards method](https://godoc.org/github.com/youtube/vitess/go/vt/vtgate/vtgateconn#VTGateConn.ExecuteShards)
diff --git a/doc/Reparenting.md b/doc/Reparenting.md
index eb9795cd2fc..43cccf7e290 100644
--- a/doc/Reparenting.md
+++ b/doc/Reparenting.md
@@ -22,7 +22,7 @@ replicate from that master.
 ## MySQL requirements
-Vitess supports [MySQL 5.6](https://dev.mysql.com/doc/refman/5.6/en/replication-gtids-howto.html), [MySQL 5.7](https://dev.mysql.com/doc/refman/5.7/en/replication-gtids-howto.html) and [MariaDB](https://mariadb.com/kb/en/mariadb/global-transaction-id/) implementations.
+Vitess supports [MySQL 5.6](https://dev.mysql.com/doc/refman/5.6/en/replication-gtids-howto.html), [MySQL 5.7](https://dev.mysql.com/doc/refman/5.7/en/replication-gtids-howto.html) and [MariaDB 10.0](https://mariadb.com/kb/en/mariadb/global-transaction-id/) implementations.
 ### GTIDs
 Vitess requires the use of global transaction identifiers
@@ -176,11 +176,7 @@ by starting vtctld with the
 --disable\_active\_reparents flag set to true.
 (You cannot set the flag after vtctld is started.)
-## Reparenting And Serving Graph
-
-During the reparenting process, Vitess shuffles servers such that servers
-might be demoted, or promoted. The **serving graph** should
-reflect the latest state of the service.
+## Fixing Replication
 A tablet can be orphaned after a reparenting if it is unavailable when the reparent operation is running but then recovers later on.
diff --git a/doc/Sharding.md b/doc/Sharding.md
index f5680da0557..6bb2787b278 100644
--- a/doc/Sharding.md
+++ b/doc/Sharding.md
@@ -58,36 +58,6 @@ a single shard in the application's "user" keyspace. On the other hand, a query
 that retrieves information about several products might be directed to one or more shards in the application's "product" keyspace.
-### Sharding Keys
-
-As discussed above, Vitess calculates the sharding keys associated
-with any particular query and then routes the query to the appropriate
-shards.
-
-Vitess supports two types of sharding keys:
-
-* **Binary data:** The key is an array of bytes. Vitess uses regular
- byte-array comparison to determine which shard should handle the
- query. The MySQL representation for this type of sharding key is
- a VARBINARY field.
-
-* **64-bit unsigned integer:** Vitess converts the 64-bit integer into
- a byte array by copying the bytes, most significant byte first,
- into 8 bytes. Vitess then uses byte-array comparison to identify the
- right shards to handle the query. The MySQL representation for this
- type of sharding key is a bigint(20) UNSIGNED field.
-
-A sharded keyspace contains information about the type of sharding key
-that the keyspace uses. Each database table in the shard has a column
-that stores the sharding key associated with each row in the table. The
-sharding key column in each table has the same name and column type.
-
-A common example of a sharding key is the 64-bit hash of a user ID. The
-hashing function ensures that the sharding keys are evenly distributed
-in the space.
-
-**Note:** If the vtgate v3 API is used, the sharding key value is no longer materialized. Instead, vtgate can calculate it on the fly when reading and inserting data. (A valid VSchema is required to tell vtgate how to calculate the sharding key value.)
-
 ### Key Ranges and Partitions
 Vitess uses key ranges to determine which shards should handle any
diff --git a/doc/UserGuideIntroduction.md b/doc/UserGuideIntroduction.md
index 074c5fb5ead..b887bfb4455 100644
--- a/doc/UserGuideIntroduction.md
+++ b/doc/UserGuideIntroduction.md
@@ -1,20 +1,13 @@
 ## Platform support
-Vitess runs on either Ubuntu 14.04 (Trusty) or Debian 7.0 (Wheezy).
-You can run Vitess on local hardware or as the storage engine in a
-Kubernetes cluster.
+We continuously test against Ubuntu 14.04 (Trusty) and Debian 8 (Jessie).
+Other Linux distributions should work as well.
 ## Database support
-Vitess supports [MySQL 5.6](http://dev.mysql.com/doc/refman/5.6/en/)
-and [MariaDB 10.0](https://downloads.mariadb.org/mariadb/10.0.21/)
-implementations.
-
-**Note:** If you are using MariaDB, you must install version 10.0 or
-higher. If you are using an apt repository, confirm that
-it offers an option to install that version. You can also download the
-source directly from
-[mariadb.org](https://downloads.mariadb.org/mariadb/10.0.21/).
+Vitess supports [MySQL 5.6](http://dev.mysql.com/doc/refman/5.6/en/),
+[MySQL 5.7](http://dev.mysql.com/doc/refman/5.7/en/),
+and [MariaDB 10.0](https://downloads.mariadb.org/mariadb/10.0.21/).
 ### Data types and SQL support
@@ -22,13 +15,10 @@ In Vitess, database tables are like MySQL relational tables, and you can
 use relational modeling schemes (normalization) to structure your schema.
 Vitess supports both primary and secondary indexes.
-Vitess supports all MySQL data types, which translate into almost all
-usual scalar data types. It also provides full SQL support within a
+Vitess supports almost all MySQL scalar data types.
+It also provides full SQL support within a
 [shard](/overview/concepts.html#shard), including JOIN statements.
-The maximum size/value is 16MB per cell/row. In addition, the limit
-on the total database size is in the tens of TB.
-
 Vitess does not currently support encoded protobufs or protocol buffer querying. (The latter is also known as cracking.) Protocol buffers can be stored as a blob in MySQL, but must be decoded and interpreted at
@@ -81,16 +71,15 @@ client libraries and other clients that Vitess supports.
 | Type | Options |
 | :-------- | :--------- |
-| Client library | [gRPC](http://www.grpc.io/)<br>C++<br>Go<br>Java<br>Python |
+| Client library | [gRPC](http://www.grpc.io/)<br>Go<br>Java<br>Python<br>PHP |
 | MapReduce | [Hadoop input](https://hadoop.apache.org/docs/r2.7.0/api/org/apache/hadoop/mapreduce/InputFormat.html) |
-| Cloud Dataflow | **_coming soon_**
 ## Backups
-Vitess supports data backups to an NFS directory and can use any
-network-mounted drive as the backup repository. Vitess defines an
-interface that, in turn, defines methods for creating, listing,
-and removing backups.
+Vitess supports data backups to either a network mount (e.g. NFS) or to a blob store.
+Backup storage is implemented through a pluggable interface,
+and we currently have plugins available for Google Cloud Storage, Amazon S3,
+and Ceph.
 See the [Backing Up Data](/user-guide/backup-and-restore.html) section of this guide for more information about creating and restoring data
diff --git a/doc/VitessOverview.md b/doc/VitessOverview.md
index 5e367dc315a..fa6927d4614 100644
--- a/doc/VitessOverview.md
+++ b/doc/VitessOverview.md
@@ -6,9 +6,17 @@ traffic since 2011.
 ## Vitess on Kubernetes
-Kubernetes is an open-source orchestration system for Docker containers, and Vitess is the logical storage engine choice for Kubernetes users.
+[Kubernetes](http://kubernetes.io/) is an open-source orchestration system for Docker containers, and Vitess can run as a Kubernetes-aware cloud native distributed database.
-Kubernetes handles scheduling onto nodes in a compute cluster, actively manages workloads on those nodes, and groups containers comprising an application for easy management and discovery. Using Kubernetes, you can easily create and manage a Vitess cluster, out of the box.
+Kubernetes handles scheduling onto nodes in a compute cluster, actively manages workloads on those nodes, and groups containers comprising an application for easy management and discovery.
+This provides an analogous open-source environment to the way Vitess runs in YouTube,
+on the [predecessor to Kubernetes](http://blog.kubernetes.io/2015/04/borg-predecessor-to-kubernetes.html).
+
+
+ + +Quickstart +
 ## Comparisons to other storage options
diff --git a/vitess.io/_config.yml b/vitess.io/_config.yml
index af4098d2225..9b7b3a593d8 100644
--- a/vitess.io/_config.yml
+++ b/vitess.io/_config.yml
@@ -1,7 +1,7 @@
 # Site wide configuration
 title: Vitess
-description: "Servers and tools that scale MySQL databases for the web in cloud architectures or on dedicated hardware."
+description: "Vitess is a database clustering system for horizontal scaling of MySQL."
 logo: vitess-logo-large-cropped-2.png
 teaser: 400x250.gif
 locale: en_US
diff --git a/vitess.io/_config_dev.yml b/vitess.io/_config_dev.yml
index 8f1158b51f2..3f48e0230aa 100644
--- a/vitess.io/_config_dev.yml
+++ b/vitess.io/_config_dev.yml
@@ -1,7 +1,7 @@
 # Site wide configuration
 title: Vitess
-description: "Servers and tools that scale MySQL databases for the web in cloud architectures or on dedicated hardware."
+description: "Vitess is a database clustering system for horizontal scaling of MySQL."
 logo: vitess-logo-large-cropped-2.png
 teaser: 400x250.gif
 locale: en_US
diff --git a/vitess.io/_includes/footer.html b/vitess.io/_includes/footer.html
index eaa5c2e5ef7..d3234cfadce 100644
--- a/vitess.io/_includes/footer.html
+++ b/vitess.io/_includes/footer.html
@@ -3,6 +3,6 @@
 {% assign separator = '  ·  ' %}
 Contact: vitess@googlegroups.com{{ separator }}
 Announcements{{ separator }}
- © {{ site.time | date: '%Y' }} Vitess powered by Google Inc
+ © {{ site.time | date: '%Y' }} Vitess powered by Google Inc
diff --git a/vitess.io/_layouts/home.liquid b/vitess.io/_layouts/home.liquid
index 0273b75b40c..fa468038800 100644
--- a/vitess.io/_layouts/home.liquid
+++ b/vitess.io/_layouts/home.liquid
@@ -7,7 +7,12 @@ layout: base

{{ site.description }}

-

Start Using {{ site.title }}

+

+ + Quickstart + Manual Build + Learn More +

diff --git a/vitess.io/images/kubernetes.svg b/vitess.io/images/kubernetes.svg new file mode 100644 index 00000000000..bedd3b88e43 --- /dev/null +++ b/vitess.io/images/kubernetes.svg @@ -0,0 +1,84 @@ + + + + + + + + + + image/svg+xml + + + + + + + + + + + +