@@ -103,6 +103,8 @@ documentation:
103103 This enables authorized access from the |k8s-op-short| installed in
104104 the central cluster to the member clusters.
105105
106+ .. _multi-cluster-prereqs:
107+
106108Prerequisites
107109-------------
108110
@@ -205,3 +207,59 @@ Procedure
205207---------
206208
207209.. include:: /includes/steps/multi-cluster-beta-quick-start.rst
210+
211+ Troubleshooting Multi-Cluster Deployments
212+ -----------------------------------------
213+
214+ To troubleshoot your multi-cluster deployments, use the procedures in this
215+ section.
216+
217+ Recovering from Cluster Failure
218+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
219+
220+ This procedure uses the same cluster names as the :ref:`Prerequisites <multi-cluster-prereqs>` section.
221+ If the cluster ``MDB_CLUSTER_1`` that holds MongoDB nodes goes down and
222+ you provision a new cluster named ``MDB_CLUSTER_4`` to replace
223+ ``MDB_CLUSTER_1`` and hold the new MongoDB nodes, run the
224+ :github:`multi-cluster kubeconfig creator </mongodb/mongodb-enterprise-kubernetes/blob/master/tools/multicluster/main.go>`
225+ tool with the updated list of member clusters, and then edit the ``MongoDBMulti``
226+ CustomResource spec on the central cluster.
227+
228+ To reconfigure the multi-cluster deployment after a cluster failure,
229+ replace the failed cluster with the newly provisioned cluster as follows:
230+
231+ 1. Run the :github:`multi-cluster kubeconfig creator </mongodb/mongodb-enterprise-kubernetes/blob/master/tools/multicluster/main.go>`
232+ tool with the new cluster ``MDB_CLUSTER_4`` specified in the
233+ ``-member-clusters`` flag. This enables the |k8s-op-short| to
234+ communicate with the new cluster to schedule MongoDB nodes on it. In
235+ the following example, ``-member-clusters`` contains ``${MDB_CLUSTER_4_FULL_NAME}``.
236+
237+ .. code-block:: sh
238+
239+    go run tools/multicluster/main.go \
240+      -central-cluster="${MDB_CENTRAL_CLUSTER_FULL_NAME}" \
241+      -member-clusters="${MDB_CLUSTER_4_FULL_NAME},${MDB_CLUSTER_2_FULL_NAME},${MDB_CLUSTER_3_FULL_NAME}" \
242+      -member-cluster-namespace="mongodb" \
243+      -central-cluster-namespace="mongodb"
244+
245+ 2. On the central cluster, locate and edit the ``MongoDBMulti``
246+ CustomResource spec to add the new cluster name to the
247+ ``clusterSpecList`` and remove the failed cluster from this list.
248+ The resulting list of cluster names should be similar to the following:
249+
250+ .. code-block:: yaml
251+
252+    clusterSpecList:
253+      clusterSpecs:
254+        - clusterName: ${MDB_CLUSTER_4_FULL_NAME}
255+          members: 3
256+        - clusterName: ${MDB_CLUSTER_2_FULL_NAME}
257+          members: 2
258+        - clusterName: ${MDB_CLUSTER_3_FULL_NAME}
259+          members: 3
260+
261+ 3. Restart the |k8s-op-short| Pod. After the restart, the |k8s-op-short|
262+ should reconcile the MongoDB deployment on the new
263+ ``MDB_CLUSTER_4`` cluster that you created to replace
264+ the failed ``MDB_CLUSTER_1`` cluster. To learn more about resource
265+ reconciliation, see :ref:`multi-cluster-diagram`.
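
Steps 2 and 3 of the procedure above can be sketched with ``kubectl`` as
follows. This is a minimal sketch under stated assumptions, not a verified
procedure: the resource name ``multi-replica-set`` and the Deployment name
``mongodb-enterprise-operator-multi-cluster`` are hypothetical placeholders,
and the sketch assumes that ``MDB_CENTRAL_CLUSTER_FULL_NAME`` is also the
``kubectl`` context name of the central cluster. By default the script only
prints the commands. Set ``KUBECTL=kubectl`` to run them.

.. code-block:: sh

   #!/bin/sh
   # Assumed values: replace them with the names used in your deployment.
   CENTRAL_CONTEXT="${MDB_CENTRAL_CLUSTER_FULL_NAME:-central-cluster-context}"
   NAMESPACE="${NAMESPACE:-mongodb}"
   # Hypothetical MongoDBMulti resource name.
   RESOURCE_NAME="${RESOURCE_NAME:-multi-replica-set}"
   # Hypothetical name of the Deployment that runs the Operator Pod.
   OPERATOR_DEPLOYMENT="${OPERATOR_DEPLOYMENT:-mongodb-enterprise-operator-multi-cluster}"
   # Dry run by default: print each kubectl command instead of running it.
   KUBECTL="${KUBECTL:-echo kubectl}"

   recover_commands() {
     # Step 2: open the MongoDBMulti resource on the central cluster so
     # that you can update clusterSpecList by hand.
     $KUBECTL --context "$CENTRAL_CONTEXT" -n "$NAMESPACE" \
       edit mongodbmulti "$RESOURCE_NAME"

     # Step 3: restart the Operator Pod by restarting its Deployment.
     $KUBECTL --context "$CENTRAL_CONTEXT" -n "$NAMESPACE" \
       rollout restart "deployment/$OPERATOR_DEPLOYMENT"

     # Inspect the resource after the restart to follow reconciliation.
     $KUBECTL --context "$CENTRAL_CONTEXT" -n "$NAMESPACE" \
       get mongodbmulti "$RESOURCE_NAME" -o yaml
   }

   recover_commands

Printing the commands first lets you confirm the context, namespace, and
resource names before anything changes on the central cluster.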