[prometheus-alerts] Import the latest version of the chart from k8s-public-charts (#6)
Showing 8 changed files with 197 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,3 +4,7 @@ | |
# Helm chart automated files | ||
/charts/*/charts | ||
.idea | ||
|
||
# Files for diffing between templates | ||
new | ||
orig |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
## CPUThrottlingHigh | ||
|
||
This alert fires when a container is being throttled by the Linux CFS | ||
(Completely Fair Scheduler). This typically means that the container is | ||
operating close to its Kubernetes `resources.limits` CPU setting. You can | ||
quickly check the utilization of the individual containers within a given pod | ||
or namespace like this: | ||
|
||
$ k top pods --containers | ||
POD NAME CPU(cores) MEMORY(bytes) | ||
datadog-agent-2qk9w agent 22m 65Mi | ||
datadog-agent-2qk9w process-agent 10m 35Mi | ||
datadog-agent-2qk9w system-probe 6m 34Mi | ||
datadog-agent-2qk9w trace-agent 2m 27Mi | ||
|
||
You can compare the actual CPU and memory usage with the pod's configured | ||
requests and limits using the `kubectl describe pod <pod>` command: | ||
|
||
$ k describe pod datadog-agent-2qk9w | ||
Name: datadog-agent-2qk9w | ||
Namespace: datadog-operator | ||
... | ||
Containers: | ||
agent: | ||
... | ||
Limits: | ||
cpu: 25m | ||
memory: 256Mi | ||
Requests: | ||
cpu: 10m | ||
memory: 96Mi | ||
|
||
In the example above, you can see that the `agent` container has a CPU limit of | ||
`25m` but is running at `22m`, so it is close to its limit. Its resource limits | ||
should likely be adjusted. | ||
|
||
## KubeQuotaAlmostFull | ||
|
||
This alert tells you that the resources requested by all of the `Pods` in | ||
your `Namespace` are close to the `Quota` limits that have been assigned. You | ||
can inspect any quotas or limits placed on your `Namespace` like this: | ||
|
||
$ kubectl describe namespace my-namespace | ||
Name: my-namespace | ||
Status: Active | ||
|
||
Resource Quotas | ||
Name: default-quotas | ||
Resource Used Hard | ||
-------- --- --- | ||
limits.cpu 10500m 64 | ||
limits.memory 18816Mi 128Gi | ||
requests.cpu 8500m 64 | ||
requests.memory 16256Mi 128Gi | ||
requests.storage 105Gi 512Gi | ||
|
||
Resource Limits | ||
Type Resource Min Max Default Request Default Limit Max Limit/Request Ratio | ||
---- -------- --- --- --------------- ------------- ----------------------- | ||
Container cpu - 8 0 0 - | ||
Container memory - 16Gi 128Mi 128Mi - | ||
|
||
## KubeQuotaFullyUsed | ||
|
||
Similar to the `KubeQuotaAlmostFull` alert, but you are now out of resources. | ||
At this point you cannot launch or scale any new resources until you reduce | ||
your usage, or work with an administrator to expand your `Quota` capacity. |
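The comparisons described in the runbook above (container usage against its limit, and namespace usage against its quota) can be checked by hand with a small helper. This is a minimal sketch, not part of the chart: `parse_quantity` and `percent_used` are hypothetical names, and the parser only handles the quantity suffixes that appear in the examples plus a few common ones.

```python
from decimal import Decimal

# Multipliers for a subset of Kubernetes resource quantity suffixes.
# Assumption: only these suffixes are needed for the runbook examples.
_SUFFIXES = {
    "m": Decimal("0.001"),
    "k": Decimal(1000),
    "M": Decimal(1000) ** 2,
    "G": Decimal(1000) ** 3,
    "Ki": Decimal(1024),
    "Mi": Decimal(1024) ** 2,
    "Gi": Decimal(1024) ** 3,
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity such as '22m', '64' or '128Gi' to base units."""
    # Check two-character suffixes before one-character ones so that
    # 'Mi' is not mistaken for a bare trailing letter.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def percent_used(used: str, limit: str) -> float:
    """Return usage as a percentage of the limit (hypothetical helper)."""
    return float(parse_quantity(used) / parse_quantity(limit) * 100)


# Values copied from the runbook examples above.
print(percent_used("22m", "25m"))        # agent container CPU vs. its limit
print(percent_used("10500m", "64"))      # namespace limits.cpu vs. hard quota
print(percent_used("18816Mi", "128Gi"))  # namespace limits.memory vs. hard quota
```

With the example values, the `agent` container sits at 88% of its CPU limit, while the namespace is using roughly 16% of its `limits.cpu` quota.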
61 changes: 61 additions & 0 deletions
charts/prometheus-alerts/templates/namespace-prometheusrule.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
{{ $values := .Values }} | ||
{{ $targetNamespace := .Release.Namespace }} | ||
{{ if .Values.namespaceRules.enabled }} | ||
# Largely copied from | ||
# https://github.com/prometheus-community/helm-charts/blob/kube-prometheus-stack-14.6.2/charts/kube-prometheus-stack/templates/prometheus/rules-1.14/kubernetes-resources.yaml, | ||
# but more customizable. | ||
# | ||
apiVersion: monitoring.coreos.com/v1 | ||
kind: PrometheusRule | ||
metadata: | ||
name: {{ .Release.Name }}-namespace-rules | ||
annotations: | ||
nextdoor.com/chart: {{ .Values.chart_name }} | ||
nextdoor.com/source: {{ .Values.chart_source }} | ||
spec: | ||
groups: | ||
- name: {{ .Release.Name }}.{{ .Release.Namespace }}.namespaceRules | ||
rules: | ||
|
||
{{- with .Values.namespaceRules.KubeQuotaAlmostFull }} | ||
- alert: KubeQuotaAlmostFull | ||
annotations: | ||
summary: Namespace quota is going to be full. | ||
runbook_url: {{ $values.defaults.runbookUrl }}#KubeQuotaAlmostFull | ||
description: >- | ||
{{`Namespace {{ $labels.namespace }} is using {{ $value }}% of its {{ $labels.resource }} quota.`}} | ||
expr: |- | ||
( | ||
kube_resourcequota{job="kube-state-metrics", type="used", namespace=~"{{ $targetNamespace }}"} | ||
/ ignoring(instance, job, type) | ||
(kube_resourcequota{job="kube-state-metrics", type="hard", namespace=~"{{ $targetNamespace }}"} > 0) | ||
) * 100 > {{ .threshold }} < 100 | ||
for: {{ .for }} | ||
labels: | ||
severity: {{ .severity }} | ||
{{- if $values.defaults.additionalRuleLabels }} | ||
{{ toYaml $values.defaults.additionalRuleLabels | nindent 8 }} | ||
{{- end }} | ||
{{- end }} | ||
|
||
{{- with .Values.namespaceRules.KubeQuotaFullyUsed }} | ||
- alert: KubeQuotaFullyUsed | ||
annotations: | ||
summary: Namespace quota is fully used. | ||
description: >- | ||
{{`Namespace {{ $labels.namespace }} is using {{ $value }}% of its {{ $labels.resource }} quota.`}} | ||
runbook_url: {{ $values.defaults.runbookUrl }}#KubeQuotaFullyUsed | ||
expr: |- | ||
( | ||
kube_resourcequota{job="kube-state-metrics", type="used", namespace=~"{{ $targetNamespace }}"} | ||
/ ignoring(instance, job, type) | ||
(kube_resourcequota{job="kube-state-metrics", type="hard", namespace=~"{{ $targetNamespace }}"} > 0) | ||
) * 100 >= {{ .threshold }} | ||
for: {{ .for }} | ||
labels: | ||
severity: {{ .severity }} | ||
{{- if $values.defaults.additionalRuleLabels }} | ||
{{ toYaml $values.defaults.additionalRuleLabels | nindent 8 }} | ||
{{- end }} | ||
{{- end }} | ||
{{- end }} |
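The two rule expressions in the template above share one calculation, `used / hard * 100`, and differ only in how the result is compared with the configured threshold. The sketch below mirrors that comparison in plain Python to make the boundary behavior easier to see. It is an illustration, not the chart's code: `firing_alerts` is a hypothetical name, the thresholds of 80 and 100 are assumed values (the chart's `values.yaml` is not shown here), and the `for:` duration is ignored.

```python
from typing import List


def firing_alerts(
    used: float,
    hard: float,
    almost_full_threshold: float,
    fully_used_threshold: float,
) -> List[str]:
    """Return the names of the quota alerts whose condition holds."""
    # The PromQL expression filters the divisor with `> 0`, so a quota with
    # a hard limit of zero produces no sample and therefore no alert.
    if hard <= 0:
        return []

    percent = used / hard * 100
    alerts = []

    # KubeQuotaAlmostFull: `... * 100 > threshold < 100`
    if almost_full_threshold < percent < 100:
        alerts.append("KubeQuotaAlmostFull")

    # KubeQuotaFullyUsed: `... * 100 >= threshold`
    if percent >= fully_used_threshold:
        alerts.append("KubeQuotaFullyUsed")

    return alerts


# Assumed thresholds: 80 for KubeQuotaAlmostFull, 100 for KubeQuotaFullyUsed.
print(firing_alerts(10.5, 64, 80, 100))  # well below the threshold
print(firing_alerts(60, 64, 80, 100))    # 93.75% of quota
print(firing_alerts(64, 64, 80, 100))    # quota exhausted
```

Because the two rules are evaluated independently, both can fire for the same quota if the `KubeQuotaFullyUsed` threshold is set below 100.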