added alerts using prometheus section

openshift · Jul 17, 2018 · 3fb8935 · 3fb8935
1 parent 51501b5
commit 3fb8935
Showing 4 changed files with 45 additions and 2 deletions.
diff --git a/day_two_guide/environment_health_checks.adoc b/day_two_guide/environment_health_checks.adoc
@@ -23,6 +23,11 @@ in this section to diagnose any problems.
 
 include::day_two_guide/topics/complete_deployment_health_check.adoc[leveloffset=+2]
 
+[[day-two-guide-creating-alerts-using-prometheus]]
+== Creating alerts using Prometheus
+
+include::day_two_guide/topics/alerts_using_prometheus.adoc[leveloffset=+2]
+
 [[day-two-guide-host-health]]
 == Host health
 

diff --git a/day_two_guide/topics/alerts_using_prometheus.adoc b/day_two_guide/topics/alerts_using_prometheus.adoc
@@ -0,0 +1,32 @@
+////
+Creating alerts using Prometheus
+
+Module included in the following assemblies:
+
+* day_two_guide/environment_health_checks.adoc
+////
+
+You can integrate {product-title} with Prometheus to create visuals and alerts
+that help you diagnose environment issues before they escalate. For example,
+you can be alerted when a node goes down or when a pod consumes too much CPU
+or memory.
+
+See the
+xref:../install_config/cluster_metrics.adoc#openshift-prometheus[Prometheus on
+OpenShift Container Platform section in the Installation and configuration
+guide] for more information.
+
+[IMPORTANT]
+====
+Prometheus on {product-title} is a Technology Preview feature only.
+ifdef::openshift-enterprise[]
+Technology Preview features are not supported with Red Hat production service
+level agreements (SLAs), might not be functionally complete, and Red Hat does
+not recommend using them in production. These features provide early access to
+upcoming product features, enabling customers to test functionality and provide
+feedback during the development process.
+
+For more information about the support scope of Red Hat Technology Preview
+features, see https://access.redhat.com/support/offerings/techpreview/.
+endif::[]
+====
diff --git a/dev_guide/persistent_volumes.adoc b/dev_guide/persistent_volumes.adoc
@@ -229,3 +229,4 @@ When a PV has its `claimRef` set to some PVC name and namespace, and is
 reclaimed according to a `Retain` or `Recycle` reclaim policy, its `claimRef`
 will remain set to the same PVC name and namespace even if the PVC or the whole
 namespace no longer exists.
+
diff --git a/install_config/cluster_metrics.adoc b/install_config/cluster_metrics.adoc
@@ -969,8 +969,10 @@ additional rules variable:
 openshift_prometheus_additional_rules_file: <PATH>
 ----
 
-The file content should be in Prometheus Alert rules format. The following
-example sets a rule to send an alert when one of the cluster nodes is down:
+The file must follow
+link:https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/[the
+Prometheus Alert rules format]. The following example sets a rule to send an
+alert when one of the cluster nodes is down:
 
 ----
 groups:
@@ -979,11 +981,14 @@ groups:
   rules:
   - alert: Node Down
     expr: up{job="kubernetes-nodes"} == 0
+    for: 10m <1>
     annotations:
       miqTarget: "ContainerNode"
       severity: "HIGH"
       message: "{{ '{{' }}{{ '$labels.instance' }}{{ '}}' }} is down"
 ----
+<1> The optional `for` value specifies how long the condition must remain true
+before Prometheus sends an alert for this element. For example, if you set `10m`, the node must be down for 10 continuous minutes before Prometheus sends the alert.
 
 *Prometheus Variables to Control Resource Limits*
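
Note on the `for` callout in the hunk above: the behaviour it describes can be sketched as a small state model. This is a simplified illustration, not Prometheus code. The function name `alert_states`, the 5-minute evaluation interval, and the sample data are all assumptions made for the example. The sketch only shows the general idea that an alert stays pending until its expression has matched continuously for the `for` duration.

```python
from datetime import timedelta


def alert_states(samples, for_duration, interval):
    """Model the state of one alert across successive rule evaluations.

    samples      -- list of booleans; True means the alert expression matched
                    (for example, the node was reported down) at that evaluation
    for_duration -- the rule's `for` value as a timedelta
    interval     -- time between evaluations (hypothetical value for this sketch)
    """
    states = []
    active_for = None  # how long the expression has matched without a break
    for matched in samples:
        if not matched:
            # Any non-matching evaluation resets the timer.
            active_for = None
            states.append("inactive")
            continue
        active_for = timedelta(0) if active_for is None else active_for + interval
        states.append("firing" if active_for >= for_duration else "pending")
    return states


# Node is down for three evaluations, recovers, then goes down again.
history = [True, True, True, False, True]
print(alert_states(history, timedelta(minutes=10), timedelta(minutes=5)))
```

In this model a brief recovery resets the timer, so a node that flaps more often than the `for` window never reaches the firing state.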