Kibana is an open source analytics and visualization platform designed to work with Elasticsearch, that makes it easy to perform advanced data analysis and to visualize your data in a variety of charts, tables, and maps. Its simple, browser-based interface enables you to quickly create and share dynamic dashboards that display changes to Elasticsearch queries in real time.
+ This section outlines the key tasks and processes required to maintain a healthy, performant, and secure {{es}} infrastructure and its deployments.

- Most deployment templates include a Kibana instance, but if it wasn’t part of the initial deployment you can go to the **Kibana** page and **Enable** Kibana.
- The new Kibana instance takes a few moments to provision. After provisioning Kibana is complete, you can use the endpoint URL to access Kibana.
- ::::{tip}
- You can log into Kibana as the `elastic` superuser. The password was provided when you created your deployment or can be [reset](users-roles/cluster-or-deployment-auth/built-in-users.md). On AWS and not able to access Kibana? [Check if you need to update your endpoint URL first](../troubleshoot/deployments/cloud-enterprise/common-issues.md#ece-aws-private-ip).
- ::::
- From the deployment **Kibana** page you can also:
- * Terminate your Kibana instance, which stops it. The information is stored in your Elasticsearch cluster, so stopping and restarting should not risk your Kibana information.
- * Restart it after stopping.
- * Upgrade your Kibana instance version if it is out of sync with your Elasticsearch cluster.
- * Delete to fully remove the instance, wipe it from the disk, and stop charges.
+ The topics covered include:

+ * **[ECE Maintenance](maintenance/ece.md)**: Explains the procedures for maintaining both the host infrastructure and {{es}} deployments within Elastic Cloud Enterprise (ECE).
+ * **[Start and Stop services](maintenance/start-stop-services.md)**: Provides step-by-step instructions on how to safely start and stop your {{es}} deployment or {{kib}} instance, particularly when performing actions that require a restart.
+ * **[Add and remove {{es}} nodes](maintenance/add-and-remove-elasticsearch-nodes.md)**: Guides you through the process of enrolling new nodes or safely removing existing ones from a self-managed {{es}} cluster to optimize resource utilization and cluster performance.
You can enroll additional nodes on your local machine to experiment with how an {{es}} cluster with multiple nodes behaves.

::::{note}
To add a node to a cluster running on multiple machines, you must also set [`discovery.seed_hosts`](../deploy/self-managed/important-settings-configuration.md#unicast.hosts) so that the new node can discover the rest of its cluster.
::::

When {{es}} starts for the first time, the security auto-configuration process binds the HTTP layer to `0.0.0.0`, but only binds the transport layer to localhost. This intended behavior ensures that you can start a single-node cluster with security enabled by default without any additional configuration.

Before enrolling a new node, additional actions such as binding to an address other than `localhost` or satisfying bootstrap checks are typically necessary in production clusters. During that time, an auto-generated enrollment token could expire, which is why enrollment tokens aren’t generated automatically.
@@ -64,21 +75,18 @@ To enroll new nodes in your cluster, create an enrollment token with the `elasti
For more information about discovery and shard allocation, refer to [*Discovery and cluster formation*](../distributed-architecture/discovery-cluster-formation.md) and [Cluster-level shard allocation and routing settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md).
As nodes are added or removed, Elasticsearch maintains an optimal level of fault tolerance by automatically updating the cluster’s *voting configuration*, which is the set of [master-eligible nodes](../distributed-architecture/clusters-nodes-shards/node-roles.md#master-node-role) whose responses are counted when making decisions such as electing a new master or committing a new cluster state.

It is recommended to have a small and fixed number of master-eligible nodes in a cluster, and to scale the cluster up and down by adding and removing master-ineligible nodes only. However, there are situations in which it may be desirable to add or remove some master-eligible nodes to or from a cluster.

If you wish to add some nodes to your cluster, simply configure the new nodes to find the existing cluster and start them up. Elasticsearch adds the new nodes to the voting configuration if it is appropriate to do so.

During master election or when joining an existing formed cluster, a node sends a join request to the master in order to be officially added to the cluster.

When removing master-eligible nodes, it is important not to remove too many at the same time. For instance, if there are currently seven master-eligible nodes and you wish to reduce this to three, you cannot simply stop four of the nodes at once: doing so would leave only three nodes running, which is fewer than half of the voting configuration, so the cluster could not take any further actions.
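
The arithmetic behind the seven-node example can be sketched as follows. This is a simplified illustration rather than Elasticsearch code: the function name `can_stop_at_once` is hypothetical, and the sketch assumes the voting configuration stays fixed while the nodes are stopped simultaneously.

```python
def can_stop_at_once(voting_nodes: int, nodes_to_stop: int) -> bool:
    """Return True if the nodes left running are still a strict majority
    of the current voting configuration (hypothetical helper)."""
    if not 0 <= nodes_to_stop <= voting_nodes:
        raise ValueError("nodes_to_stop must be between 0 and voting_nodes")
    remaining = voting_nodes - nodes_to_stop
    # A decision needs responses from more than half of the voting configuration.
    return 2 * remaining > voting_nodes


# Seven master-eligible nodes, stopping four at once: three remain, no majority.
print(can_stop_at_once(7, 4))
# Stopping three at once leaves four of seven, which is still a majority.
print(can_stop_at_once(7, 3))
```

The check only covers nodes stopped at the same moment. It does not model how Elasticsearch shrinks the voting configuration when nodes leave one at a time.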
@@ -108,7 +116,6 @@ Although the voting configuration exclusions API is most useful for down-scaling
Voting exclusions are only required when removing at least half of the master-eligible nodes from a cluster in a short time period. They are not required when removing master-ineligible nodes, nor are they required when removing fewer than half of the master-eligible nodes.
::::

Adding an exclusion for a node creates an entry for that node in the voting configuration exclusions list. The system then automatically tries to reconfigure the voting configuration to remove that node, and prevents the node from returning to the voting configuration once it has been removed. The current list of exclusions is stored in the cluster state and can be inspected as follows:
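
The diff hunk ends before the request itself. As a hedged sketch, the lookup could target the cluster state API with `filter_path` limited to `metadata.cluster_coordination.voting_config_exclusions`. The function name, the default host, and the scheme below are placeholders chosen for illustration, and the snippet only builds the URL without contacting a cluster.

```python
from urllib.parse import urlencode, urlunsplit

# Field of the cluster state that holds the exclusions list.
EXCLUSIONS_FILTER = "metadata.cluster_coordination.voting_config_exclusions"


def exclusions_url(host: str = "localhost:9200", scheme: str = "https") -> str:
    """Build the GET URL for the voting configuration exclusions list.

    host and scheme are placeholders; adjust them for your cluster.
    """
    query = urlencode({"filter_path": EXCLUSIONS_FILTER})
    return urlunsplit((scheme, host, "/_cluster/state", query, ""))


print(exclusions_url())
```

Sending a GET request to the resulting URL with suitable credentials should return only the exclusions portion of the cluster state.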
Elastic Cloud Enterprise (ECE), being a self-managed Elastic Stack deployment platform, abstracts much of the complexity of running {{es}}, but still requires regular maintenance at both the platform and deployment levels. Maintenance activities range from managing individual deployments to performing infrastructure-level updates on ECE hosts.
+ ## Deployment maintenance and host infrastructure maintenance [ece-deployment-host-infra-maintenance]
[Deployment maintenance](ece/deployments-maintenance.md) focuses on managing individual {{es}} and {{kib}} instances within ECE. This includes actions such as [pausing instances](ece/pause-instance.md), [stopping request routing to nodes](ece/start-stop-routing-requests.md), and [moving instances between allocators](ece/move-nodes-instances-from-allocators.md) to optimize resource usage or prepare for maintenance. These tasks help maintain service availability and performance without affecting the underlying infrastructure.

- % Scope notes: Introduction about ECE maintenance and activities / actions. Explain the difference between deployments maintenance and ECE hosts infrastructure maintenance.
+ [ECE host infrastructure maintenance](ece/perform-ece-hosts-maintenance.md) involves managing virtual machines that host ECE itself. This includes tasks like applying operating system patches, upgrading software, or decommissioning hosts. Infrastructure maintenance often requires more careful planning, as it can impact multiple deployments running on the affected hosts. Methods such as placing allocators into [maintenance mode](ece/enable-maintenance-mode.md) and redistributing workloads provide a smooth transition during maintenance operations.

- ⚠️ **This page is a work in progress.** ⚠️
+ This section provides guidance on best practices for both types of maintenance, helping you maintain a resilient ECE environment.
1. [Log into the Cloud UI](../../deploy/cloud-enterprise/log-into-cloud-ui.md).
2. From the **Platform** menu, select **Hosts**.
-    Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.
+    Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.

3. For hosts that hold the allocator role:
-    1. [Enable maintenance mode](enable-maintenance-mode.md) on the allocator.
-    2. [Move all nodes off the allocator](move-nodes-instances-from-allocators.md) and to other allocators in your installation.
+    1. [Enable maintenance mode](enable-maintenance-mode.md) on the allocator.
+    2. [Move all nodes off the allocator](move-nodes-instances-from-allocators.md) and to other allocators in your installation.

4. Go to **Hosts** and select a host.
5. Select **Manage roles** from the **Manage host** menu and remove all assigned roles.
6. Select **Demote host** from the **Manage host** menu if present. If the **Delete host** option is already enabled, skip this step.
7. Remove *all running* containers from the host, starting with the container named `frc-runners-runner`. Then remove the storage directory (by default, `/mnt/data/elastic/`). You can use the recommended [cleanup command](../../uninstall/uninstall-elastic-cloud-enterprise.md). Afterward, the UI should show the host as **Disconnected**, which allows the host to be deleted.
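
The ordering requirement in step 7 can be sketched as below. This is an illustration under assumptions, not the recommended cleanup command from the linked page: it assumes a Docker-based host where `docker rm -f` is acceptable, the helper name `removal_commands` is hypothetical, and the container names other than `frc-runners-runner` are made-up examples. The function only builds the command lists and does not run them.

```python
from typing import List

RUNNER = "frc-runners-runner"  # container that step 7 says to remove first


def removal_commands(running: List[str]) -> List[List[str]]:
    """Return one `docker rm -f` command per container, runner first.

    sorted() is stable, so the other containers keep their original order.
    """
    ordered = sorted(running, key=lambda name: name != RUNNER)
    return [["docker", "rm", "-f", name] for name in ordered]


# Example names are illustrative; list the real ones with `docker ps`.
for cmd in removal_commands(
    ["frc-allocators-allocator", RUNNER, "frc-directors-director"]
):
    print(" ".join(cmd))
```

Printing the commands first gives you a chance to review the order before running anything on the host.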
In some circumstances, you might need to temporarily restrict access to a node so you can perform corrective actions that might otherwise be difficult to complete. For example, if your cluster is being overwhelmed by requests because it is undersized for its workload, its nodes might not respond to efforts to resize.

- These actions act as a maintenance mode for a cluster node. Performing these actions can stop the cluster from becoming completely unresponsive so that you can resolve operational issues much more effectively.
+ These actions act as a maintenance mode for a cluster node. Performing these actions can stop the cluster from becoming unresponsive so that you can resolve operational issues much more effectively.

* [**Stop routing to the instance**](start-stop-routing-requests.md): Block requests from being routed to the cluster node. This is a less invasive action than pausing the instance.
* [**Pause an instance**](pause-instance.md): Suspend the node immediately by stopping the container that the node runs on without completing existing requests. This is a more aggressive action to regain control of an unresponsive node.

As an alternative, to quickly add capacity to a deployment if it is unhealthy or at capacity, you can also [override the resource limit for a deployment](../../deploy/cloud-enterprise/resource-overrides.md).
@@ -12,34 +15,13 @@ To put an allocator into maintenance mode:
1. [Log into the Cloud UI](../../deploy/cloud-enterprise/log-into-cloud-ui.md).
2. From the **Platform** menu, select **Allocators**.
3. Choose the allocator you want to work with and select **Enable Maintenance Mode**. Confirm the action.
-    Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.
+    Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.

After the allocator enters maintenance mode, no new Elasticsearch nodes or Kibana instances will be started on the allocator. Existing nodes will continue to work as expected. You can now safely perform actions like [moving nodes off the allocator](move-nodes-instances-from-allocators.md).

If you want to make the allocator fully active again, select **Disable Maintenance Mode**. Confirm the action.

- ::::{tip}
- If you need the existing instances to stop routing requests, you can [stop routing requests](deployments-maintenance.md) to disable incoming requests to particular instances. You can also disable routing for all instances on an allocator at once with the [allocator-toggle-routing-requests.sh](https://download.elastic.co/cloud/allocator-toggle-routing-requests.sh) script. The script takes the following parameters in the form of environment variables:
- * `API_URL` URL of the administration API.
- * `AUTH_HEADER` Curl format string representing the authentication header.
- * `ALLOCATOR_ID` ID of the target allocator.
- * `ENABLE_TRAFFIC` Whether traffic to the selected allocator instances should be enabled (`true`) or disabled (`false`).
- In this example, the script disables routing on all instances running on a given allocator:
If you need the existing instances to stop routing requests, refer to the [stop routing request documentation](start-stop-routing-requests.md) to learn more.
Maintenance activities ensure the smooth operation and scalability of your {{es}} installation. This section provides guidelines on performing essential maintenance tasks while minimizing downtime and maintaining high availability.
Before performing maintenance on an allocator, you should enable maintenance mode to prevent new Elasticsearch clusters and Kibana instances from being provisioned. This ensures that existing deployments can be safely moved to other allocators or adjusted without disruption.

+ ### [Scale out installation](scale-out-installation.md)
+ You can scale out your installation by adding capacity to meet growing demand or improve high availability. This process involves installing ECE on additional hosts, assigning roles to new hosts, and resizing deployments to utilize the expanded resources.
+ ### [Move nodes and instances between allocators](move-nodes-instances-from-allocators.md)
+ Moving {{es}} nodes, {{kib}} instances, and other components between allocators may be necessary to free up space, avoid downtime, or handle allocator failures. The process involves selecting target allocators and ensuring enough capacity to accommodate the migration.
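
The capacity check mentioned above can be sketched as a simple placement plan. Everything here is a hypothetical model for illustration: ECE does not expose a `plan_moves` function, the instance and allocator names are invented, and memory in MB is assumed to be the only constraint. The sketch places the largest instances first on the target with the most free memory, and returns `None` when this greedy pass finds no fit.

```python
from typing import Dict, Optional


def plan_moves(
    instance_sizes_mb: Dict[str, int],
    free_mb_by_allocator: Dict[str, int],
) -> Optional[Dict[str, str]]:
    """Greedy sketch: map each instance to a target allocator, or return None."""
    free = dict(free_mb_by_allocator)  # copy so the caller's data is untouched
    plan: Dict[str, str] = {}
    # Place the largest instances first; they are the hardest to fit.
    for name, size in sorted(
        instance_sizes_mb.items(), key=lambda item: item[1], reverse=True
    ):
        if not free:
            return None
        target = max(free, key=free.get)  # allocator with the most free memory
        if free[target] < size:
            return None
        free[target] -= size
        plan[name] = target
    return plan


print(plan_moves({"es-1": 4096, "kib-1": 1024}, {"alloc-b": 4096, "alloc-c": 2048}))
print(plan_moves({"es-1": 8192}, {"alloc-b": 4096, "alloc-c": 4096}))
```

A returned plan shows that the instances fit under this model. A `None` result means only that the greedy pass failed, so a different placement might still exist.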
Maintaining ECE hosts is critical for applying system patches, performing hardware upgrades, and ensuring compliance with security standards. Learn about the various methods of maintaining hosts, and their impact on your ECE installation.

+ ### [Delete ECE hosts](delete-ece-hosts.md)
+ If a host is no longer required or is faulty, it can be removed from the Elastic Cloud Enterprise installation. Deleting a host only removes it from the installation but does not uninstall the software from the physical machine. Before deletion, allocators should be placed in maintenance mode, and nodes should be migrated to avoid disruption.
+ ## Best practices for maintenance
+ * Always check available capacity before making changes.
+ * Use maintenance mode to avoid unexpected disruptions.
+ * Move nodes strategically to maintain high availability.
+ * Perform maintenance during off-peak hours when possible.
+ * Regularly review and optimize resource allocation.
+ By following these guidelines, you can ensure the stability and efficiency of your environment while carrying out necessary maintenance activities.