Commit 9386dbc

ansjcy, kolchfa-aws, and natebower authored
add document for Query Insights health_stats API (#8627)
* add document for Query Insights health_stats API
  Signed-off-by: Chenyang Ji <cyji@amazon.com>
* Doc review
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Update _observing-your-data/query-insights/api.md
  Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Move metrics counters section
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Clarification
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Change title of page
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Apply suggestions from code review
  Co-authored-by: Nathan Bower <nbower@amazon.com>
  Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
---------
Signed-off-by: Chenyang Ji <cyji@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Fanit Kolchina <kolchfa@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
1 parent ea3f786 commit 9386dbc

2 files changed: +123 -0 lines changed

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
---
layout: default
title: Query Insights plugin health
parent: Query insights
nav_order: 50
---

# Query Insights plugin health

The Query Insights plugin provides an [API](#health-stats-api) and [metrics](#opentelemetry-error-metrics-counters) for monitoring its health and performance, enabling proactive identification of issues that may affect query processing or system resources.

## Health Stats API
**Introduced 2.18**
{: .label .label-purple }

The Health Stats API provides health metrics for each node running the Query Insights plugin. These metrics allow for an in-depth view of resource usage and the health of the query processing components.

### Path and HTTP methods

```json
GET _insights/health_stats
```

### Example request

```json
GET _insights/health_stats
```
{% include copy-curl.html %}
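
For readers who want to call the endpoint from a script rather than a console, the following is a minimal Python sketch using only the standard library. The base URL `http://localhost:9200`, the helper names `build_health_stats_request` and `fetch_health_stats`, and the absence of authentication and TLS are all assumptions made for illustration; a secured cluster needs credentials and certificate settings that are not shown here.

```python
import json
import urllib.request

# Assumption: an unsecured cluster reachable on localhost. Adjust the base
# URL and add authentication/TLS handling for a real deployment.
BASE_URL = "http://localhost:9200"


def build_health_stats_request(base_url=BASE_URL):
    """Build a GET request for the Health Stats API endpoint."""
    url = base_url.rstrip("/") + "/_insights/health_stats"
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_health_stats(base_url=BASE_URL, timeout=10.0):
    """Send the request and decode the JSON body.

    Requires a reachable cluster with the Query Insights plugin installed.
    """
    request = build_health_stats_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from the network call makes the URL and method easy to check without a running cluster.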

### Example response

The response includes a set of health-related fields for each node:

```json
{
  "AqegbPL0Tv2XWvZV4PTS8Q": {
    "ThreadPoolInfo": {
      "query_insights_executor": {
        "type": "scaling",
        "core": 1,
        "max": 5,
        "keep_alive": "5m",
        "queue_size": 2
      }
    },
    "QueryRecordsQueueSize": 2,
    "TopQueriesHealthStats": {
      "latency": {
        "TopQueriesHeapSize": 5,
        "QueryGroupCount_Total": 0,
        "QueryGroupCount_MaxHeap": 0
      },
      "memory": {
        "TopQueriesHeapSize": 5,
        "QueryGroupCount_Total": 0,
        "QueryGroupCount_MaxHeap": 0
      },
      "cpu": {
        "TopQueriesHeapSize": 5,
        "QueryGroupCount_Total": 0,
        "QueryGroupCount_MaxHeap": 0
      }
    }
  }
}
```
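
Because the response is keyed by node ID, a monitoring script typically iterates over the top-level keys. The sketch below, written in Python, flags nodes whose `QueryRecordsQueueSize` exceeds a limit. The function name `nodes_over_queue_limit` and the `limit` parameter are hypothetical, and the documentation does not define what counts as a "high" queue size, so any threshold is an assumption to tune per cluster. The sample data is a trimmed copy of the example response.

```python
import json

# Trimmed copy of the example response (one node, latency stats only).
SAMPLE_RESPONSE = json.loads("""
{
  "AqegbPL0Tv2XWvZV4PTS8Q": {
    "ThreadPoolInfo": {
      "query_insights_executor": {
        "type": "scaling", "core": 1, "max": 5,
        "keep_alive": "5m", "queue_size": 2
      }
    },
    "QueryRecordsQueueSize": 2,
    "TopQueriesHealthStats": {
      "latency": {
        "TopQueriesHeapSize": 5,
        "QueryGroupCount_Total": 0,
        "QueryGroupCount_MaxHeap": 0
      }
    }
  }
}
""")


def nodes_over_queue_limit(stats, limit):
    """Return the sorted IDs of nodes whose record queue exceeds `limit`.

    `limit` is a hypothetical, operator-chosen threshold.
    """
    return sorted(
        node_id
        for node_id, node in stats.items()
        if node.get("QueryRecordsQueueSize", 0) > limit
    )


flagged = nodes_over_queue_limit(SAMPLE_RESPONSE, limit=1)
```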

### Response fields

The following table lists all response body fields.

Field | Data type | Description
:--- |:---| :---
`ThreadPoolInfo` | Object | Information about the Query Insights thread pool, including type, core count, max threads, and queue size. See [The ThreadPoolInfo object](#the-threadpoolinfo-object).
`QueryRecordsQueueSize` | Integer | The size of the queue that buffers incoming search queries before processing. A high value may suggest increased load or slower processing.
`TopQueriesHealthStats` | Object | Performance metrics for each top query service that provide information about memory allocation (heap size) and query grouping. See [The TopQueriesHealthStats object](#the-topquerieshealthstats-object).

### The ThreadPoolInfo object

The `ThreadPoolInfo` object contains the following detailed configuration and performance data for the thread pool dedicated to the Query Insights plugin.

Field | Data type | Description
:--- |:---| :---
`type`| String | The thread pool type (for example, `scaling`).
`core`| Integer | The minimum number of threads in the thread pool.
`max`| Integer | The maximum number of threads in the thread pool.
`keep_alive`| Time unit | The amount of time that idle threads are retained.
`queue_size`| Integer | The maximum number of tasks in the queue.
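
The `keep_alive` field is returned as a time-unit string such as `5m` rather than a number, so scripts that compare or alert on it need to convert it first. The following Python sketch shows one way to do that. The helper name `keep_alive_seconds` is hypothetical, and the sketch assumes whole-number values with the suffixes `d`, `h`, `m`, `s`, `ms`, `micros`, and `nanos`.

```python
import re

# Seconds per unit suffix (assumed set of suffixes; see the lead-in).
_UNIT_SECONDS = {
    "d": 86400,
    "h": 3600,
    "m": 60,
    "s": 1,
    "ms": 1e-3,
    "micros": 1e-6,
    "nanos": 1e-9,
}


def keep_alive_seconds(value):
    """Convert a time-unit string such as "5m" into seconds."""
    match = re.fullmatch(r"(\d+)(d|h|m|s|ms|micros|nanos)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```

For the example response, `keep_alive_seconds("5m")` evaluates to `300`.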

### The TopQueriesHealthStats object

The `TopQueriesHealthStats` object contains one entry for each metric type (`latency`, `memory`, and `cpu`). Each entry contains the following fields.

Field | Data type | Description
:--- |:---| :---
`TopQueriesHeapSize`| Integer | The heap memory allocation for the query group.
`QueryGroupCount_Total`| Integer | The total number of processed query groups.
`QueryGroupCount_MaxHeap`| Integer | The size of the max heap that stores all query groups in memory.
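
Since the same three fields repeat under each metric type, it can be convenient to flatten them into rows for logging or tabular display. The Python sketch below does this for a single node's stats. The function name `top_queries_rows` is hypothetical, and the input is assumed to have the shape shown in the example response.

```python
def top_queries_rows(node_stats):
    """Flatten TopQueriesHealthStats into (metric, heap, total, max_heap) rows."""
    rows = []
    per_metric = node_stats.get("TopQueriesHealthStats", {})
    # Sort by metric type so the output order is stable.
    for metric_type, fields in sorted(per_metric.items()):
        rows.append(
            (
                metric_type,
                fields["TopQueriesHeapSize"],
                fields["QueryGroupCount_Total"],
                fields["QueryGroupCount_MaxHeap"],
            )
        )
    return rows


# Values copied from the example response for one node.
node = {
    "TopQueriesHealthStats": {
        "latency": {"TopQueriesHeapSize": 5, "QueryGroupCount_Total": 0, "QueryGroupCount_MaxHeap": 0},
        "memory": {"TopQueriesHeapSize": 5, "QueryGroupCount_Total": 0, "QueryGroupCount_MaxHeap": 0},
        "cpu": {"TopQueriesHeapSize": 5, "QueryGroupCount_Total": 0, "QueryGroupCount_MaxHeap": 0},
    }
}
rows = top_queries_rows(node)
```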

## OpenTelemetry error metrics counters

The Query Insights plugin integrates with OpenTelemetry to provide real-time error metrics counters. These counters help to identify specific operational failures in the plugin and improve reliability. Each metric provides targeted insights into potential error sources in the plugin workflow, allowing for more focused debugging and maintenance.

To collect these metrics, you must configure and collect query metrics. For more information, see [Query metrics]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/query-metrics/).

The following table lists all available metrics.

Field | Description
:--- | :---
`LOCAL_INDEX_READER_PARSING_EXCEPTIONS` | The number of errors that occur when parsing data using the LocalIndexReader.
`LOCAL_INDEX_EXPORTER_BULK_FAILURES` | The number of failures that occur when ingesting Query Insights plugin data into local indexes.
`LOCAL_INDEX_EXPORTER_EXCEPTIONS` | The number of exceptions that occur in the Query Insights plugin LocalIndexExporter.
`INVALID_EXPORTER_TYPE_FAILURES` | The number of invalid exporter type failures.
`INVALID_INDEX_PATTERN_EXCEPTIONS` | The number of invalid index pattern exceptions.
`DATA_INGEST_EXCEPTIONS` | The number of exceptions that occur when ingesting data into the Query Insights plugin.
`QUERY_CATEGORIZE_EXCEPTIONS` | The number of exceptions that occur when categorizing the queries.
`EXPORTER_FAIL_TO_CLOSE_EXCEPTION` | The number of failures that occur when closing the exporter.
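
Once the counters have been collected, a simple check is to report only those that are nonzero. The Python sketch below illustrates the idea. It assumes the counter values have already been gathered into a plain name-to-value mapping. That mapping's shape and the function name `nonzero_counters` are hypothetical, because how the counters are exported depends on the telemetry configuration.

```python
# Counter names taken from the table above.
KNOWN_COUNTERS = (
    "LOCAL_INDEX_READER_PARSING_EXCEPTIONS",
    "LOCAL_INDEX_EXPORTER_BULK_FAILURES",
    "LOCAL_INDEX_EXPORTER_EXCEPTIONS",
    "INVALID_EXPORTER_TYPE_FAILURES",
    "INVALID_INDEX_PATTERN_EXCEPTIONS",
    "DATA_INGEST_EXCEPTIONS",
    "QUERY_CATEGORIZE_EXCEPTIONS",
    "EXPORTER_FAIL_TO_CLOSE_EXCEPTION",
)


def nonzero_counters(samples):
    """Return the known counters with a value above zero.

    `samples` is a hypothetical name-to-value mapping; names that are
    not listed in KNOWN_COUNTERS are ignored.
    """
    return {
        name: samples[name]
        for name in KNOWN_COUNTERS
        if samples.get(name, 0) > 0
    }


report = nonzero_counters(
    {"DATA_INGEST_EXCEPTIONS": 3, "QUERY_CATEGORIZE_EXCEPTIONS": 0}
)
```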

_observing-your-data/query-insights/index.md

Lines changed: 4 additions & 0 deletions
@@ -42,3 +42,7 @@ You can obtain the following information using Query Insights:
- [Top n queries]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/top-n-queries/)
- [Grouping top N queries]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/grouping-top-n-queries/)
- [Query metrics]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/query-metrics/)

## Query Insights plugin health

For information about monitoring the health of the Query Insights plugin, see [Query Insights plugin health]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/health/).