Added "Alerts using prometheus" #10199
cc: @vikram-redhat
fbc4d13 to fb372d2
Looks like there will be much more Prometheus stuff for 3.11, so this can do for now. @openshift/team-documentation PTAL
Picks, comments, suggestions. Overall, it LGTM
(I assume that you want to update all the modules in this assembly to follow the mod docs templates at one time, so I've left out a couple of comments about the templates.)
day_two_guide/environment_health_checks.adoc
////

While the topics in this section are for manually checking the health of an
With an eye towards reuse, I'd remove or conditionalize "While the topics in this section are for manually checking the health of an {product-title} component, "
While the topics in this section are for manually checking the health of an
{product-title} component, you can integrate {product-title} with Prometheus to
create visuals and alerts to help diagnose any environment issues before they
arise. These issues can include if a node goes down, if a pod is consuming too
I might say, "For example, you can monitor if nodes go down or if pods consume too many resources, such as CPU or memory."
arise. These issues can include if a node goes down, if a pod is consuming too
much CPU or memory, and more.

See the
If that topic were in modules, this would be a great place for an include instead of an xref.
install_config/cluster_metrics.adoc
Outdated
@@ -969,7 +969,7 @@ additional rules variable:
openshift_prometheus_additional_rules_file: <PATH>
----

-The file content should be in Prometheus Alert rules format. The following
+The file content should be link:https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/[in Prometheus Alert rules format]. The following
To ditch the modal, I'd say "This file must follow the xref:[Prometheus Alert rules format]."
install_config/cluster_metrics.adoc
Outdated
annotations:
  miqTarget: "ContainerNode"
  severity: "HIGH"
  message: "{{ '{{' }}{{ '$labels.instance' }}{{ '}}' }} is down"
----
<1> The optional `for` value specifies the amount of time Prometheus waits before it
sends an alert for this element. For example, if setting `10m`, Prometheus will
s/setting/you set
s/will wait/waits
install_config/cluster_metrics.adoc
Outdated
annotations:
  miqTarget: "ContainerNode"
  severity: "HIGH"
  message: "{{ '{{' }}{{ '$labels.instance' }}{{ '}}' }} is down"
----
<1> The optional `for` value specifies the amount of time Prometheus waits before it
sends an alert for this element. For example, if setting `10m`, Prometheus will
wait for 10 minutes when encountering this issue before sending an alert.
s/when encountering/after it encounters
s/sending/it sends
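For context, the `for` callout discussed in this thread could sit in a complete rules file along the lines of the sketch below. This is only an illustration: the group name, interval, alert name, and `expr` (including the `kubernetes-nodes` job label) are assumptions that do not appear in this PR. Only the `annotations` block is taken from the diff, and its escaped braces are kept as written there, presumably so that the installer's templating passes the Prometheus template through unchanged.

```yaml
# Hypothetical rules file referenced by openshift_prometheus_additional_rules_file.
# Group name, interval, alert name, and expr are illustrative assumptions.
groups:
- name: example-rules
  interval: 30s
  rules:
  - alert: NodeDown
    # Assumed job label; adjust to match the scrape configuration in use.
    expr: up{job="kubernetes-nodes"} == 0
    # Optional: the condition must hold for 10 minutes before the alert fires.
    for: 10m
    annotations:
      # Annotations copied from the diff in this PR.
      miqTarget: "ContainerNode"
      severity: "HIGH"
      message: "{{ '{{' }}{{ '$labels.instance' }}{{ '}}' }} is down"
```

Without `for`, Prometheus fires the alert on the first evaluation where the expression returns a result, which is why the callout describes the value as optional.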
fb372d2 to 3fb8935
Thanks @kalexand-rh. Merging.
/cherrypick enterprise-3.10
@bfallonf: new pull request created: #10843 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/cherrypick enterprise-3.9
@bfallonf: new pull request created: #10844 In response to this:
/cherrypick enterprise-3.7
@bfallonf: new pull request created: #10845 In response to this:
Part of more day two guide efforts.
Will assess if there's more to come.