Commit 8f4f844

DaveCTurner and jrodewig committed
Add docs for filesystem health checks (#59134)
Documents the feature and settings introduced in #52680. Co-authored-by: James Rodewig <james.rodewig@elastic.co>
1 parent 664b546 commit 8f4f844

2 files changed: +26 -0 lines changed

docs/reference/modules/discovery/discovery-settings.asciidoc

Lines changed: 19 additions & 0 deletions
@@ -245,3 +245,22 @@ WARNING: This setting replaces the `discovery.zen.no_master_block` setting in
 earlier versions. The `discovery.zen.no_master_block` setting is ignored.
 
 --
+
+`monitor.fs.health.enabled`::
+
+(<<cluster-update-settings,Dynamic>>, boolean) If `true`, the node runs
+periodic <<cluster-fault-detection-filesystem-health,filesystem health
+checks>>. Defaults to `true`.
+
+`monitor.fs.health.refresh_interval`::
+
+(<<time-units, Time value>>) Interval between successive
+<<cluster-fault-detection-filesystem-health,filesystem health checks>>.
+Defaults to `2m`.
+
+`monitor.fs.health.slow_path_logging_threshold`::
+
+(<<time-units, Time value>>) If a
+<<cluster-fault-detection-filesystem-health,filesystem health checks>>
+takes longer than this threshold then {es} logs a warning. Defaults to
+`5s`.
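
The diff marks `monitor.fs.health.enabled` as Dynamic, which means it can be changed on a running cluster through the cluster update settings API (`PUT _cluster/settings`). As a rough illustration only, the sketch below builds the JSON body for such a request. The helper name `fs_health_settings_body` is hypothetical and not part of any Elasticsearch client, and sending the request is deliberately left out.

```python
import json


def fs_health_settings_body(enabled, persistent=True):
    """Build a JSON body for PUT _cluster/settings (hypothetical helper).

    `enabled` toggles the periodic filesystem health checks described in
    the diff above. `persistent` selects the "persistent" or "transient"
    scope of the cluster update settings API.
    """
    scope = "persistent" if persistent else "transient"
    return json.dumps({scope: {"monitor.fs.health.enabled": enabled}})


# Example: a body that disables the health checks cluster-wide.
body = fs_health_settings_body(False)
print(body)
```

The resulting string could then be sent with any HTTP client or pasted into a console request against `_cluster/settings`.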

docs/reference/modules/discovery/fault-detection.asciidoc

Lines changed: 7 additions & 0 deletions
@@ -18,3 +18,10 @@ Similarly, if a node detects that the elected master has disconnected, this
 situation is treated as an immediate failure. The node bypasses the timeout and
 retry settings and restarts its discovery phase to try and find or elect a new
 master.
+
+[[cluster-fault-detection-filesystem-health]]
+Additionally, each node periodically verifies that its data path is healthy by
+writing a small file to disk and then deleting it again. If a node discovers
+its data path is unhealthy then it is removed from the cluster until the data
+path recovers. You can control this behavior with the
+<<modules-discovery-settings,`monitor.fs.health` settings>>.
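
To make the described check concrete, here is a minimal Python sketch of the same idea: write a small file under the data path, delete it again, and warn if the round trip is slow. It is not Elasticsearch's implementation. The function name `check_data_path`, the temporary file prefix, the `fsync` call, and the return shape are all assumptions made for illustration. The `5.0` second default mirrors the documented default of `monitor.fs.health.slow_path_logging_threshold`.

```python
import os
import tempfile
import time


def check_data_path(data_path, slow_threshold_s=5.0):
    """Probe `data_path` by writing and deleting a small file (illustrative).

    Returns (healthy, elapsed_seconds). Any OSError during the probe is
    treated as "unhealthy"; a slow but successful probe only logs a warning.
    """
    start = time.monotonic()
    try:
        # Create a uniquely named probe file inside the data path.
        fd, tmp_path = tempfile.mkstemp(prefix=".health_check_", dir=data_path)
        try:
            os.write(fd, b"healthy")
            os.fsync(fd)  # assumption: force the write through to disk
        finally:
            os.close(fd)
            os.remove(tmp_path)
        healthy = True
    except OSError:
        healthy = False
    elapsed = time.monotonic() - start
    if healthy and elapsed > slow_threshold_s:
        print(f"warning: health check on {data_path} took {elapsed:.1f}s")
    return healthy, elapsed
```

A caller would run this on a timer (the documented default interval is `2m`) and treat a `False` result as the signal that the node should leave the cluster until the path recovers.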
