 
-=== Monitoring individual nodes
+== Monitoring individual nodes
 
 Cluster Health is at one end of the spectrum -- a very high-level overview of
 everything in your cluster. The _Node Stats_ API is at the other end. It provides
@@ -43,7 +43,7 @@ host, etc). These values are useful for debugging discovery problems, where
 nodes won't join the cluster. Often you'll see that the port being used is wrong,
 or the node is binding to the wrong IP address/interface.
 
-==== Indices section
+=== Indices section
 
 The indices section lists aggregate statistics for all the indices that reside
 on this particular node.
@@ -206,7 +206,7 @@ posting lists, dictionaries and bloom filters. A very large number of segments
 will increase the amount of overhead lost to these data structures, and the memory
 usage can be a handy metric to gauge that overhead.
 
-==== OS and Process Sections
+=== OS and Process Sections
 
 The OS and Process sections are fairly self-explanatory and won't be covered
 in great detail. They list basic resource statistics such as CPU and load. The
@@ -222,7 +222,7 @@ monitoring stack. Some stats include:
 - Swap usage
 - Open file descriptors
 
-==== JVM Section
+=== JVM Section
 
 The JVM section contains some critical information about the JVM process which
 is running Elasticsearch. Most importantly, it contains garbage collection details,
@@ -390,7 +390,7 @@ Our best advice is to collect collection counts and duration periodically (or us
 and keep an eye out for frequent GCs. You can also enable slow-GC logging,
 discussed in <<TODO>>
 
-==== Threadpool Section
+=== Threadpool Section
 
 Elasticsearch maintains a number of threadpools internally. These threadpools
 cooperate to get work done, passing work between each other as necessary. In
@@ -468,7 +468,7 @@ are good to keep an eye on:
 - `search`: all search and query requests
 - `merging`: threadpool dedicated to managing Lucene merges
 
-==== FS and Network sections
+=== FS and Network sections
 
 Continuing down the Node Stats API, you'll see a bunch of statistics about your
 filesystem: free space, data directory paths, disk IO stats, etc. If you are
@@ -508,7 +508,7 @@ keep-alive connections are important for performance, since building up and tear
 down sockets is expensive (and wastes file descriptors). Make sure your clients
 are configured appropriately.
 
-==== Circuit Breaker
+=== Circuit Breaker
 
 Finally, we come to the last section: stats about the field data circuit breaker
 (introduced in <<circuit_breaker>>):