Added "Alerts using prometheus" #10199

bfallonf · 2018-06-19T01:15:49Z

Part of more day two guide efforts.

Will assess if there's more to come.

bfallonf · 2018-06-28T01:31:08Z

cc: @vikram-redhat

bfallonf · 2018-07-16T01:30:26Z

It looks like there will be much more Prometheus content for 3.11, so this will do for now.

@openshift/team-documentation PTAL

kalexand-rh

Picks, comments, suggestions. Overall, it LGTM

(I assume that you want to update all the modules in this assembly to follow the mod docs templates at one time, so I've left out a couple of comments about the templates.)

kalexand-rh · 2018-07-16T14:26:30Z

day_two_guide/topics/alerts_using_prometheus.adoc

+* day_two_guide/environment_health_checks.adoc
+////
+
+While the topics in this section are for manually checking the health of an


With an eye towards reuse, I'd remove or conditionalize "While the topics in this section are for manually checking the health of an {product-title} component, "

kalexand-rh · 2018-07-16T14:28:12Z

day_two_guide/topics/alerts_using_prometheus.adoc

+While the topics in this section are for manually checking the health of an
+{product-title} component, you can integrate {product-title} with Prometheus to
+create visuals and alerts to help diagnose any environment issues before they
+arise. These issues can include if a node goes down, if a pod is consuming too


I might say, "For example, you can monitor if nodes go down or if pods consume too many resources, such as CPU or memory."

kalexand-rh · 2018-07-16T14:30:02Z

day_two_guide/topics/alerts_using_prometheus.adoc

+arise. These issues can include if a node goes down, if a pod is consuming too
+much CPU or memory, and more.
+
+See the


If that topic were in modules, this would be a great place for an include instead of an xref.

kalexand-rh · 2018-07-16T14:31:17Z

install_config/cluster_metrics.adoc

@@ -969,7 +969,7 @@ additional rules variable:
 openshift_prometheus_additional_rules_file: <PATH>
 ----

-The file content should be in Prometheus Alert rules format. The following
+The file content should be link:https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/[in Prometheus Alert rules format]. The following


To ditch the modal, I'd say "This file must follow the xref:[Prometheus Alert rules format]."

kalexand-rh · 2018-07-16T14:31:45Z

install_config/cluster_metrics.adoc

    annotations:
      miqTarget: "ContainerNode"
      severity: "HIGH"
      message: "{{ '{{' }}{{ '$labels.instance' }}{{ '}}' }} is down"
 ----
+<1> The optional `for` value specifies the amount of time Prometheus waits before it
+sends an alert for this element. For example, if setting `10m`, Prometheus will


s/setting/you set
s/will wait/waits

kalexand-rh · 2018-07-16T14:32:16Z

install_config/cluster_metrics.adoc

    annotations:
      miqTarget: "ContainerNode"
      severity: "HIGH"
      message: "{{ '{{' }}{{ '$labels.instance' }}{{ '}}' }} is down"
 ----
+<1> The optional `for` value specifies the amount of time Prometheus waits before it
+sends an alert for this element. For example, if setting `10m`, Prometheus will
+wait for 10 minutes when encountering this issue before sending an alert.


s/when encountering/after it encounters
s/sending/it sends
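
As a rough sketch of what callout <1> describes, a rule with a `for` value might look like the following. Only the annotations come from the diff above. The group name, the alert name, and the `up` expression with its `job` label are assumptions for illustration, and the templating escapes shown in the diff are omitted here.

```yaml
groups:
- name: example-rules              # hypothetical group name
  rules:
  - alert: NodeDown                # hypothetical alert name
    # Assumed expression: fires when a scrape target reports as down.
    expr: up{job="kubernetes-nodes"} == 0
    # The condition must hold for 10 minutes before the alert is sent.
    for: 10m
    annotations:
      miqTarget: "ContainerNode"
      severity: "HIGH"
      message: "{{ $labels.instance }} is down"
```

With `for: 10m`, a target that recovers within 10 minutes does not trigger an alert. Without `for`, the alert is sent as soon as the expression first evaluates as true.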

bfallonf · 2018-07-17T01:48:25Z

Thanks @kalexand-rh . Merging.

bfallonf · 2018-07-17T01:48:43Z

/cherrypick enterprise-3.10

openshift-cherrypick-robot · 2018-07-17T01:48:54Z

@bfallonf: new pull request created: #10843

In response to this:

/cherrypick enterprise-3.10

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

bfallonf · 2018-07-17T01:49:14Z

/cherrypick enterprise-3.9

openshift-cherrypick-robot · 2018-07-17T01:49:21Z

@bfallonf: new pull request created: #10844

In response to this:

/cherrypick enterprise-3.9

bfallonf · 2018-07-17T01:49:28Z

/cherrypick enterprise-3.7

openshift-cherrypick-robot · 2018-07-17T01:49:35Z

@bfallonf: new pull request created: #10845

In response to this:

/cherrypick enterprise-3.7

openshift-ci-robot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) Jun 19, 2018

bfallonf changed the title ~~Added "Alerts using prometheus" and "Backups" to project and pvc topics~~ Added "Alerts using prometheus" Jul 9, 2018

bfallonf force-pushed the feedback_day2_alerts branch 3 times, most recently from fbc4d13 to fb372d2 Compare July 16, 2018 01:27

bfallonf added the peer-review-needed, branch/enterprise-3.7, branch/enterprise-3.9, and branch/enterprise-3.10 labels Jul 16, 2018

bfallonf added this to the Next Release milestone Jul 16, 2018

kalexand-rh reviewed Jul 16, 2018

kalexand-rh added the peer-review-done label and removed the peer-review-needed label Jul 16, 2018

added alerts using prometheus section

3fb8935

bfallonf force-pushed the feedback_day2_alerts branch from fb372d2 to 3fb8935 Compare July 17, 2018 01:32

bfallonf merged commit bc5ae6d into openshift:master Jul 17, 2018

openshift-cherrypick-robot mentioned this pull request Jul 17, 2018

[enterprise-3.10] Added "Alerts using prometheus" #10843

Merged

openshift-cherrypick-robot mentioned this pull request Jul 17, 2018

[enterprise-3.9] Added "Alerts using prometheus" #10844

Merged

openshift-cherrypick-robot mentioned this pull request Jul 17, 2018

[enterprise-3.7] Added "Alerts using prometheus" #10845

Merged

bfallonf deleted the feedback_day2_alerts branch July 17, 2018 01:56
