|
| 1 | +[role="xpack"] |
| 2 | +[[alerting-getting-started]] |
| 3 | += Alerting and Actions |
| 4 | + |
| 5 | +beta[] |
| 6 | + |
| 7 | +-- |
| 8 | + |
| 9 | +Alerting allows you to detect complex conditions within different {kib} apps and trigger actions when those conditions are met. Alerting is integrated with <<xpack-apm,*APM*>>, <<xpack-infra,*Metrics*>>, <<xpack-siem,*SIEM*>>, and <<xpack-uptime,*Uptime*>>, can be centrally managed from the <<management,*Management*>> UI, and provides a set of built-in <<action-types, actions>> and <<alert-types, alerts>> for you to use. |
| 10 | + |
| 11 | +image::images/alerting-overview.png[Alerts and actions UI] |
| 12 | + |
| 13 | +[IMPORTANT] |
| 14 | +============================================== |
| 15 | +To make sure you can access alerting and actions, see the <<alerting-setup-prerequisites, setup and prerequisites>> section. |
| 16 | +============================================== |
| 17 | + |
| 18 | +[float] |
| 19 | +== Concepts and terminology |
| 20 | + |
| 21 | +*Alerts* work by running checks on a schedule to detect conditions. When a condition is met, the alert tracks it as an *alert instance* and responds by triggering one or more *actions*. |
| 22 | +Actions typically involve interaction with {kib} services or third party integrations. *Connectors* allow actions to talk to these services and integrations. |
| 23 | +This section describes all of these elements and how they operate together. |
| 24 | + |
| 25 | +[float] |
| 26 | +=== What is an alert? |
| 27 | + |
| 28 | +An alert specifies a background task that runs on the {kib} server to check for specific conditions. It consists of three main parts: |
| 29 | + |
| 30 | +* *Conditions*: what needs to be detected? |
| 31 | +* *Schedule*: when/how often should detection checks run? |
| 32 | +* *Actions*: what happens when a condition is detected? |
| 33 | + |
| 34 | +For example, when monitoring a set of servers, an alert might check for average CPU usage > 0.9 on each server for the last two minutes (condition), checked every minute (schedule), sending a warning email message via SMTP with subject `CPU on {{server}} is high` (action). |
| 35 | + |
| 36 | +image::images/what-is-an-alert.svg[Three components of an alert] |
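As a rough mental model only, the three parts of the server monitoring alert can be written down as plain data. This is not {kib}'s API: the `Alert` class, its field names, and the `high_cpu` function are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Alert:
    """Hypothetical model of an alert: condition, schedule, and action."""
    condition: Callable[[Dict[str, float]], List[str]]  # returns the servers that match
    interval_seconds: int                                # schedule: time between checks
    action_subject_template: str                         # action: email subject template


def high_cpu(avg_cpu_by_server: Dict[str, float]) -> List[str]:
    # Condition: average CPU usage above 0.9 over the checked window.
    return [name for name, cpu in avg_cpu_by_server.items() if cpu > 0.9]


cpu_alert = Alert(
    condition=high_cpu,
    interval_seconds=60,  # checked every minute
    action_subject_template="CPU on {{server}} is high",
)

matches = cpu_alert.condition({"X123": 0.95, "Y456": 0.42})
print(matches)  # ['X123']
```

Keeping the three parts separate mirrors how the rest of this section describes them: each can be changed without touching the other two.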
| 37 | + |
| 38 | +The following sections describe each part of the alert in more detail. |
| 39 | + |
| 40 | +[float] |
| 41 | +[[alerting-concepts-conditions]] |
| 42 | +==== Conditions |
| 43 | + |
| 44 | +Under the hood, {kib} alerts detect conditions by running a JavaScript function on the {kib} server. This gives them the flexibility to support a wide range of detections, from the results of a simple {es} query to heavy computations involving data from multiple sources or external systems. |
| 45 | + |
| 46 | +These detections are packaged and exposed as *alert types*. An alert type hides the underlying details of the detection, and exposes a set of parameters |
| 47 | +to control the details of the conditions to detect. |
| 48 | + |
| 49 | +For example, an <<alert-types, index threshold alert type>> lets you specify the index to query, an aggregation field, and a time window, but the details of the underlying {es} query are hidden. |
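To make the idea of "parameters in, hidden query out" concrete, here is a minimal sketch. It does not reproduce what the index threshold alert type actually runs: the function name, the `@timestamp` field, the choice of an `avg` aggregation, and the sample index and field names are all assumptions made for illustration.

```python
from typing import Any, Dict


def index_threshold_query(index: str, agg_field: str, time_window: str) -> Dict[str, Any]:
    """Hypothetical: turn user-facing parameters into a search request."""
    return {
        "index": index,
        "body": {
            "size": 0,  # only the aggregation result is needed, not the hits
            # Assumes documents carry a @timestamp field.
            "query": {"range": {"@timestamp": {"gte": f"now-{time_window}"}}},
            "aggs": {"value": {"avg": {"field": agg_field}}},
        },
    }


# Sample values are made up for this example.
request = index_threshold_query("metrics-*", "system.cpu.pct", "5m")
print(request["body"]["query"])
```

The person configuring the alert only ever sees the three arguments; the shape of the request stays an implementation detail of the alert type.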
| 50 | + |
| 51 | +See <<alert-types>> for the types of alerts provided by {kib} and how they express their conditions. |
| 52 | + |
| 53 | +[float] |
| 54 | +[[alerting-concepts-scheduling]] |
| 55 | +==== Schedule |
| 56 | + |
| 57 | +Alert schedules are defined as an interval between subsequent checks, and can range from a few seconds to months. |
| 58 | + |
| 59 | +[IMPORTANT] |
| 60 | +============================================== |
| 61 | +The intervals of alert checks in {kib} are approximate. The timing of their execution is affected by factors such as the frequency at which tasks are claimed and the task load on the system. See <<alerting-scale-performance>> for more information. |
| 62 | +============================================== |
| 63 | + |
| 64 | +[float] |
| 65 | +[[alerting-concepts-actions]] |
| 66 | +==== Actions |
| 67 | + |
| 68 | +Actions are invocations of {kib} services or integrations with third-party systems that run as background tasks on the {kib} server when alert conditions are met. |
| 69 | + |
| 70 | +When defining actions in an alert, you specify: |
| 71 | + |
| 72 | +* the *action type*: the type of service or integration to use |
| 73 | +* the connection for that type by referencing a <<alerting-concepts-connectors, connector>> |
| 74 | +* a mapping of alert values to properties exposed for that type of action |
| 75 | + |
| 76 | +The result is a template: all the parameters needed to invoke a service are supplied except for specific values that are only known at the time the alert condition is detected. |
| 77 | + |
| 78 | +In the server monitoring example, the `email` action type is used, and `server` is mapped to the subject of the email, using the template string `CPU on {{server}} is high`. |
| 79 | + |
| 80 | +When the alert detects the condition, it creates an <<alerting-concepts-alert-instances, alert instance>> containing the details of the condition, renders the template with these details such as server name, and executes the action on the {kib} server by invoking the `email` action type. |
| 81 | + |
| 82 | +image::images/what-is-an-action.svg[Actions are like templates that are rendered when an alert detects a condition] |
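The rendering step can be illustrated with a small placeholder substitution. This is a simplified stand-in, not the templating engine {kib} uses; the `render` function and its behavior for unknown names are assumptions.

```python
import re
from typing import Dict


def render(template: str, values: Dict[str, str]) -> str:
    """Replace each {{name}} placeholder with its value.

    Placeholders with no matching value are left untouched.
    """
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        return values.get(name, match.group(0))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# Values known only when the condition is detected are filled in at that point.
subject = render("CPU on {{server}} is high", {"server": "X123"})
print(subject)  # CPU on X123 is high
```

Everything except the `server` value is fixed when the action is defined, which is what makes the action behave like a template.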
| 83 | + |
| 84 | +See <<action-types>> for details on the types of actions provided by {kib}. |
| 85 | + |
| 86 | +[float] |
| 87 | +[[alerting-concepts-alert-instances]] |
| 88 | +=== Alert instances |
| 89 | + |
| 90 | +When checking for a condition, an alert might identify multiple occurrences of the condition. {kib} tracks each of these *alert instances* separately and takes action per instance. |
| 91 | + |
| 92 | +Using the server monitoring example, each server with average CPU > 0.9 is tracked as an alert instance. This means a separate email is sent for each server that exceeds the threshold. |
| 93 | + |
| 94 | +image::images/alert-instances.svg[{kib} tracks each detected condition as an alert instance and takes action on each instance] |
| 95 | + |
| 96 | +[float] |
| 97 | +[[alerting-concepts-suppressing-duplicate-notifications]] |
| 98 | +=== Suppressing duplicate notifications |
| 99 | + |
| 100 | +Since actions are taken per instance, alerts can end up generating a large number of actions. Take the following example where an alert is monitoring three servers every minute for CPU usage > 0.9: |
| 101 | + |
| 102 | +* Minute 1: server X123 > 0.9. *One email* is sent for server X123. |
| 103 | +* Minute 2: X123 and Y456 > 0.9. *Two emails* are sent, one for X123 and one for Y456. |
| 104 | +* Minute 3: X123, Y456, Z789 > 0.9. *Three emails* are sent, one for each of X123, Y456, Z789. |
| 105 | + |
| 106 | +In the above example, three emails are sent for server X123 in the span of 3 minutes for the same condition. Often it's desirable to suppress frequent re-notification. Operations like muting and re-notification throttling can be applied at the instance level. If we set the alert re-notify interval to 5 minutes, we reduce noise by only getting emails for new servers that exceed the threshold: |
| 107 | + |
| 108 | +* Minute 1: server X123 > 0.9. *One email* is sent for server X123. |
| 109 | +* Minute 2: X123 and Y456 > 0.9. *One email* is sent for Y456. |
| 110 | +* Minute 3: X123, Y456, Z789 > 0.9. *One email* is sent for Z789. |
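Both timelines above can be reproduced with a short simulation. This is only a sketch of per-instance throttling under simple assumptions (one check per minute, one email per notification); the function and variable names are hypothetical and do not correspond to {kib} internals.

```python
from typing import Dict, List, Set


def emails_per_minute(checks: List[Set[str]], renotify_minutes: int) -> List[List[str]]:
    """For each check (one per minute), list the servers that get an email.

    renotify_minutes=0 models no throttling: every active instance
    notifies on every check.
    """
    last_notified: Dict[str, int] = {}  # per-instance state: minute of the last email
    sent: List[List[str]] = []
    for minute, active in enumerate(checks, start=1):
        notified = []
        for server in sorted(active):
            previous = last_notified.get(server)
            if previous is None or minute - previous >= renotify_minutes:
                last_notified[server] = minute
                notified.append(server)
        sent.append(notified)
    return sent


checks = [{"X123"}, {"X123", "Y456"}, {"X123", "Y456", "Z789"}]

unthrottled = emails_per_minute(checks, renotify_minutes=0)
throttled = emails_per_minute(checks, renotify_minutes=5)

print([len(batch) for batch in unthrottled])  # [1, 2, 3]
print(throttled)  # [['X123'], ['Y456'], ['Z789']]
```

Because the last-notified time is kept per server, a new server exceeding the threshold still triggers an email immediately, while servers that were already reported stay quiet until the interval elapses.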
| 111 | + |
| 112 | +[float] |
| 113 | +[[alerting-concepts-connectors]] |
| 114 | +=== Connectors |
| 115 | + |
| 116 | +Actions often involve connecting with services inside {kib} or integrations with third-party systems. |
| 117 | +Rather than repeatedly entering connection information and credentials for each action, {kib} simplifies action setup using *connectors*. |
| 118 | + |
| 119 | +*Connectors* provide a central place to store connection information for services and integrations. For example, if four alerts send email notifications via the same SMTP service, |
| 120 | +they all reference the same SMTP connector. When the SMTP settings change they are updated once in the connector, instead of having to update four alerts. |
| 121 | + |
| 122 | +image::images/alert-concepts-connectors.svg[Connectors provide a central place to store service connection settings] |
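The benefit of referencing a connector rather than copying its settings can be sketched as follows. The `Connector` and `EmailAction` classes, their fields, and the host names are hypothetical and only model the relationship described above.

```python
from dataclasses import dataclass


@dataclass
class Connector:
    """Hypothetical connector holding shared SMTP settings."""
    host: str
    port: int


@dataclass
class EmailAction:
    connector: Connector  # a reference to the connector, not a copy of its settings
    subject: str


smtp = Connector(host="smtp.example.com", port=25)

# Four alerts, each with an email action that points at the same connector.
actions = [EmailAction(connector=smtp, subject=f"Alert {n}") for n in range(1, 5)]

# Changing the SMTP settings once changes what all four actions will use.
smtp.host = "mail.example.com"
hosts = {action.connector.host for action in actions}
print(hosts)  # {'mail.example.com'}
```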
| 123 | + |
| 124 | +[float] |
| 125 | +=== Summary |
| 126 | + |
| 127 | +An _alert_ consists of conditions, _actions_, and a schedule. When conditions are met, _alert instances_ are created that render _actions_ and invoke them. To make action setup and update easier, actions refer to _connectors_ that centralize the information used to connect with {kib} services and third-party integrations. |
| 128 | + |
| 129 | +image::images/alert-concepts-summary.svg[Alerts, actions, alert instances and connectors work together to convert detection into action] |
| 130 | + |
| 131 | +* *Alert*: a specification of the conditions to be detected, the schedule for detection, and the response when detection occurs. |
| 132 | +* *Action*: the response to a detected condition defined in the alert. Typically actions specify a service or third party integration along with alert details that will be sent to it. |
| 133 | +* *Alert instance*: state tracked by {kib} for every occurrence of a detected condition. Actions, as well as controls like muting and re-notification throttling, are applied at the instance level. |
| 134 | +* *Connector*: centralized configurations for services and third party integrations that are referenced by actions. |
| 135 | + |
| 136 | +[float] |
| 137 | +[[alerting-concepts-differences]] |
| 138 | +== Differences from Watcher |
| 139 | + |
| 140 | +{kib} alerting and <<watcher-ui, {es} alerting>> are both used to detect conditions and can trigger actions in response, but they are completely independent alerting systems. |
| 141 | + |
| 142 | +This section will clarify some of the important differences in the function and intent of the two systems. |
| 143 | + |
| 144 | +Functionally, {kib} alerting differs in that: |
| 145 | + |
| 146 | +* Scheduled checks are run on {kib} instead of {es}. |
| 147 | +* {kib} <<alerting-concepts-conditions, alerts hide the details of detecting conditions>> through *alert types*, whereas watches provide low-level control over inputs, conditions, and transformations. |
| 148 | +* {kib} alerts track and persist the state of each detected condition through *alert instances*. This makes it possible to mute and throttle individual instances, and detect changes in state such as resolution. |
| 149 | +* Actions are linked to *alert instances* in {kib} alerting. Actions are fired for each occurrence of a detected condition, rather than for the entire alert. |
| 150 | + |
| 151 | +At a higher level, {kib} alerts allow rich integrations across use cases like <<xpack-apm,*APM*>>, <<xpack-infra,*Metrics*>>, <<xpack-siem,*SIEM*>>, and <<xpack-uptime,*Uptime*>>. |
| 152 | +Pre-packaged *alert types* simplify setup and hide the details of complex domain-specific detections, while providing a consistent interface across {kib}. |
| 153 | + |
| 154 | +[float] |
| 155 | +[[alerting-setup-prerequisites]] |
| 156 | +== Setup and prerequisites |
| 157 | + |
| 158 | +If you are using an *on-premises* Elastic Stack deployment: |
| 159 | + |
| 160 | +* In the `kibana.yml` configuration file, add the <<alert-action-settings-kb,`xpack.encryptedSavedObjects.encryptionKey`>> setting. |
| 161 | + |
| 162 | +If you are using an *on-premises* Elastic Stack deployment with <<using-kibana-with-security, *security*>>: |
| 163 | + |
| 164 | +* You must enable Transport Layer Security (TLS) for communication <<configuring-tls-kib-es, between {es} and {kib}>>. {kib} alerting uses <<api-keys, API keys>> to secure background alert checks and actions, and API keys require {ref}/configuring-tls.html#tls-http[TLS on the HTTP interface]. A proxy will not suffice. |
| 165 | + |
| 166 | +[float] |
| 167 | +[[alerting-security]] |
| 168 | +== Security |
| 169 | + |
| 170 | +To access alerting in a space, a user must have access to one of the following features: |
| 171 | + |
| 172 | +* <<xpack-apm,*APM*>> |
| 173 | +* <<xpack-infra,*Metrics*>> |
| 174 | +* <<xpack-siem,*SIEM*>> |
| 175 | +* <<xpack-uptime,*Uptime*>> |
| 176 | + |
| 177 | +See <<kibana-feature-privileges, feature privileges>> for more information on configuring roles that provide access to these features. |
| 178 | + |
| 179 | +[float] |
| 180 | +[[alerting-spaces]] |
| 181 | +=== Space isolation |
| 182 | + |
| 183 | +Alerts and connectors are isolated to the {kib} space in which they were created. An alert or connector created in one space will not be visible in another. |
| 184 | + |
| 185 | +[float] |
| 186 | +[[alerting-authorization]] |
| 187 | +=== Authorization |
| 188 | + |
| 189 | +Alerts, including all background detection and the actions they generate, are authorized using an <<api-keys, API key>> associated with the last user to edit the alert. Upon creating or modifying an alert, an API key is generated for that user, capturing a snapshot of their privileges at that moment in time. The API key is then used to run all background tasks associated with the alert, including detection checks and executing actions. |
| 190 | + |
| 191 | +[IMPORTANT] |
| 192 | +============================================== |
| 193 | +If an alert requires certain privileges to run, such as index privileges, keep in mind that if a user without those privileges updates the alert, the alert will no longer function. |
| 194 | +============================================== |
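The effect described in the note above can be modeled in a few lines. This is a conceptual sketch only: the `ApiKey` class, the `can_run` helper, and the privilege strings are hypothetical and do not reflect how {es} represents API keys or privileges.

```python
from typing import FrozenSet, Iterable, Set


class ApiKey:
    """Hypothetical stand-in: freezes a copy of the editing user's privileges."""

    def __init__(self, user_privileges: Iterable[str]) -> None:
        self.privileges: FrozenSet[str] = frozenset(user_privileges)


def can_run(key: ApiKey, required: Set[str]) -> bool:
    # The alert's background tasks succeed only if the key covers every required privilege.
    return required <= key.privileges


required = {"read:metrics-*"}  # made-up privilege name for the alert's index access
creator_privileges = {"read:metrics-*", "read:logs-*"}
editor_privileges = {"read:logs-*"}

key_after_create = ApiKey(creator_privileges)  # generated when the alert is created
key_after_edit = ApiKey(editor_privileges)     # regenerated when another user edits it

print(can_run(key_after_create, required))  # True
print(can_run(key_after_edit, required))    # False
```

Because the key is a snapshot, it is the privileges of the most recent editor at the time of the edit that determine whether the alert keeps working.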
| 195 | + |
| 196 | +[float] |
| 197 | +[[alerting-restricting-actions]] |
| 198 | +=== Restricting actions |
| 199 | + |
| 200 | +For security reasons you may wish to limit the extent to which {kib} can connect to external services. <<action-settings>> allows you to disable certain <<action-types>> and whitelist the hostnames that {kib} can connect with. |
| 201 | + |
| 202 | +-- |