@@ -8,12 +8,13 @@ Shard Cluster Architectures
 
 .. default-domain:: mongodb
 
-This document describes various ways to deploy a :term:`shard cluster`.
+This document describes the organization and design of :term:`shard
+cluster` deployments.
 
 .. seealso:: The :doc:`/administration/sharding` document, the
    ":ref:`Sharding Requirements <sharding-requirements>`" section,
    and the ":ref:`Sharding Tutorials <sharding-tutorials>`" for more
-   information on deploying and maintaing a :term:`shard cluster`.
+   information on deploying and maintaining a :term:`shard cluster`.
 
 Deploying A Test Cluster
 ------------------------
@@ -73,18 +74,15 @@ a production-level shard cluster should have the following:
   you may deploy several `mongos` nodes and let your application connect
   to these via a load balancer.
 
-TODO: I don't see how this next section fits in. You've explained all of these
-elsewhere. I vote for removing it.
-
 Sharded and Non-Sharded Data
 ----------------------------
 
-Sharding is always enabled on the collection level. You can shard
-multiple collections within a database, or have multiple databases
-with sharding enabled. [#sharding-databases]_ However, in production
+Sharding operates on the collection level. You can shard multiple
+collections within a database, or have multiple databases with
+sharding enabled. [#sharding-databases]_ However, in production
 deployments some databases and collections will use sharding, while
 other databases and collections will only reside on a single database
-instance or replica set (i.e. a :term:`shard`.)
+instance or replica set (i.e. a :term:`shard`.)
 
 .. note::
 
@@ -112,85 +110,11 @@ When you deploy a new :term:`shard cluster`, the "first shard" becomes
 the primary for all databases before enabling sharding. Databases
 created subsequently, may reside on any shard in the cluster.
 
-TODO is there anything more to say on which shard a new database will use as primary?
-
 .. [#sharding-databases] As you configure sharding, you will use the
    :dbcommand:`enablesharding` command to enable sharding for a
    database. This simply makes it possible to use the
    :dbcommand:`shardcollection` on a collection within that database.
 
 .. [#overloaded-primary-term] The term "primary" in the context of
    databases and sharding, has nothing to do with the term
-   :term:`primary` in the context of :term:`replica sets <replica set>`
-
-
-TODO: also not really sure about the next section.
-QUESTION: why discuss backups here? Backups are already descrived elsewhere.
-
-Replication and Data Integrity
-------------------------------
-
-Production :term:`shard clusters` should run each
-:term:`shard` as a :term:`replica set`. This ensures each
-each shard remains available in the event of a failure.
-
-It's also important to run exactly three :term:`config database` instances.
-All nodes in the shard cluster update these databases using a two-phase
-commit, thus guaranteeing consistency. Do note that config databases do not
-operate as replica sets.
-
-Because the shard cluster remains largely operational [#read-only]
-without one of the config database :program:`mongod` instances,
-creating a backup of the cluster metadata from the config database is
-very straightforward:
-
-#. Shut down one of the :term:`config databases`.
-
-#. Create a full copy of the data files (i.e. the path specified by
-   the :setting:`dbpath` option for the config instance.
-
-#. Restart the original configuration server.
-
-Furthermore, because the activity to the config servers is minimal
-creating a :doc:`backup </administration/backups>` of the config
-instance is straightforward, precise, and non-disruptive.
-
-TODO: this is mentioned elsewhere.
-
-.. [#read-only] While one of the three config servers unavailable, no
-   the cluster cannot split any chunks nor can it migrate chunks
-   between shards. Your application will be able to write data to the
-   cluster. The ":ref:`sharding-config-server`" section provides more
-   information on this topic.
-
-.. _sharding-capacity-planning:
-
-Capacity Planning
------------------
-
-:term:`Sharding` makes it possible for MongoDB to support very large
-data sets and workload with very little additional administrative
-overhead. At the same time, when designing and administering a
-:term:`shard cluster`, capacity planning remains very important. The
-key to a successful shard cluster revolves around knowing when
-sharding is appropriate for your data and in knowing when to add
-capacity to the system:
-
-TODO: this point goes under "when to shard."
-
-#. Sharding adds complexity. If you can provision
-   hardware powerful enough to support your data and workload,
-   then you may not need sharding. By not using sharding,
-   your database will have fewer moving parts and will be easier
-   to administer.
-
-#. That said, you should consider sharding if you know that the hardware
-   you'll be deploying with cannot handle the expected load.
-
-#. In this case, it's important that you enable sharding *before* your system has
-   reached capacity. If you wait until your hardware is already overloaded,
-   then the overhead require to migrate data between shards will only further
-   load the system.
-
-#. If you do plan to shard, you should also give some thought to which collections
-   you'll want to shard along with the corresponding shard keys.
+   :term:`primary` in the context of :term:`replica sets <replica set>`.