Added "Alerts using prometheus" #10199

Merged
merged 1 commit into openshift:master on Jul 17, 2018

Conversation

bfallonf

Part of more day two guide efforts.

Will assess if there's more to come.

@openshift-ci-robot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) on Jun 19, 2018
@bfallonf
Author

cc: @vikram-redhat

@bfallonf changed the title from 'Added "Alerts using prometheus" and "Backups" to project and pvc topics' to 'Added "Alerts using prometheus"' on Jul 9, 2018
@bfallonf force-pushed the feedback_day2_alerts branch 3 times, most recently from fbc4d13 to fb372d2, on July 16, 2018 01:27
@bfallonf added this to the Next Release milestone on Jul 16, 2018
@bfallonf
Author

Looks like there will be much more Prometheus stuff for 3.11, so this can do for now.

@openshift/team-documentation PTAL

Contributor

@kalexand-rh left a comment

Picks, comments, suggestions. Overall, it LGTM

(I assume that you want to update all the modules in this assembly to follow the mod docs templates at one time, so I've left out a couple of comments about the templates.)

* day_two_guide/environment_health_checks.adoc
////

While the topics in this section are for manually checking the health of an
Contributor

With an eye towards reuse, I'd remove or conditionalize "While the topics in this section are for manually checking the health of an {product-title} component, "

While the topics in this section are for manually checking the health of an
{product-title} component, you can integrate {product-title} with Prometheus to
create visuals and alerts to help diagnose any environment issues before they
arise. These issues can include if a node goes down, if a pod is consuming too
Contributor

I might say, "For example, you can monitor if nodes go down or if pods consume too many resources, such as CPU or memory."

arise. These issues can include if a node goes down, if a pod is consuming too
much CPU or memory, and more.

See the
Contributor

If that topic were in modules, this would be a great place for an include instead of an xref.

@@ -969,7 +969,7 @@ additional rules variable:
openshift_prometheus_additional_rules_file: <PATH>
----

-The file content should be in Prometheus Alert rules format. The following
+The file content should be link:https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/[in Prometheus Alert rules format]. The following
Contributor

To ditch the modal, I'd say "This file must follow the xref:[Prometheus Alert rules format]."

    annotations:
      miqTarget: "ContainerNode"
      severity: "HIGH"
      message: "{{ '{{' }}{{ '$labels.instance' }}{{ '}}' }} is down"
----
<1> The optional `for` value specifies the amount of time Prometheus waits before it
sends an alert for this element. For example, if setting `10m`, Prometheus will
Contributor

s/setting/you set
s/will wait/waits

    annotations:
      miqTarget: "ContainerNode"
      severity: "HIGH"
      message: "{{ '{{' }}{{ '$labels.instance' }}{{ '}}' }} is down"
----
<1> The optional `for` value specifies the amount of time Prometheus waits before it
sends an alert for this element. For example, if setting `10m`, Prometheus will
wait for 10 minutes when encountering this issue before sending an alert.
Contributor

s/when encountering/after it encounters
s/sending/it sends
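
For context, the `for` behavior discussed in this thread can be sketched as a complete rule file. This is only an illustration, assuming the Prometheus 2.x rule-group format; the group name, alert name, and `expr` are hypothetical and not taken from this PR, and the message uses a plain Prometheus template rather than the escaped form shown in the diff.

```yaml
# Hypothetical rule file, for illustration only.
groups:
- name: example-rules          # assumed group name
  rules:
  - alert: NodeDown            # assumed alert name
    # Assumed expression: fires when a scraped node target reports down.
    expr: up{job="kubernetes-nodes"} == 0
    # Optional: the condition must hold for 10 minutes before
    # Prometheus sends the alert.
    for: 10m
    annotations:
      miqTarget: "ContainerNode"
      severity: "HIGH"
      message: "{{ $labels.instance }} is down"
```

Without `for`, the alert fires as soon as the expression first evaluates to true.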

@kalexand-rh added the peer-review-done label (signifies that the peer review team has reviewed this PR) and removed the peer-review-needed label (signifies that the peer review team needs to review this PR) on Jul 16, 2018
@bfallonf force-pushed the feedback_day2_alerts branch from fb372d2 to 3fb8935 on July 17, 2018 01:32
@bfallonf
Author

Thanks @kalexand-rh . Merging.

@bfallonf merged commit bc5ae6d into openshift:master on Jul 17, 2018
@bfallonf
Author

/cherrypick enterprise-3.10

@openshift-cherrypick-robot

@bfallonf: new pull request created: #10843

In response to this:

/cherrypick enterprise-3.10

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bfallonf
Author

/cherrypick enterprise-3.9

@openshift-cherrypick-robot

@bfallonf: new pull request created: #10844

In response to this:

/cherrypick enterprise-3.9

@bfallonf
Author

/cherrypick enterprise-3.7

@openshift-cherrypick-robot

@bfallonf: new pull request created: #10845

In response to this:

/cherrypick enterprise-3.7

Labels
branch/enterprise-3.7, branch/enterprise-3.9, branch/enterprise-3.10, peer-review-done (signifies that the peer review team has reviewed this PR), size/M (denotes a PR that changes 30-99 lines, ignoring generated files)
4 participants