feat: variabilize (cron)jobs groups (#59)
* feat: variabilize (cron)jobs groups

* doc: update readme

---------

Co-authored-by: Florent DELAHAYE <florent.delahaye@fr.clara.net>
Tulux and Florent DELAHAYE authored Sep 19, 2024
1 parent 94d1ea0 commit c6d6293
Showing 4 changed files with 16 additions and 4 deletions.
2 changes: 2 additions & 0 deletions caas/kubernetes/workload/README.md
@@ -61,6 +61,7 @@ Creates DataDog monitors with the following checks:
| <a name="input_cronjob_extra_tags"></a> [cronjob\_extra\_tags](#input\_cronjob\_extra\_tags) | Extra tags for Cronjob monitor | `list(string)` | `[]` | no |
| <a name="input_cronjob_message"></a> [cronjob\_message](#input\_cronjob\_message) | Custom message for Cronjob monitor | `string` | `""` | no |
| <a name="input_cronjob_threshold_warning"></a> [cronjob\_threshold\_warning](#input\_cronjob\_threshold\_warning) | Cronjob monitor (warning threshold) | `string` | `3` | no |
+| <a name="input_cronjobfailed_group_by"></a> [cronjobfailed\_group\_by](#input\_cronjobfailed\_group\_by) | n/a | `list` | <pre>[<br> "kube_cronjob"<br>]</pre> | no |
| <a name="input_deployment_group_by"></a> [deployment\_group\_by](#input\_deployment\_group\_by) | Select group by element on deployment monitors | `list` | <pre>[<br> "kube_namespace",<br> "kube_deployment",<br> "kube_cluster_name"<br>]</pre> | no |
| <a name="input_environment"></a> [environment](#input\_environment) | Architecture Environment | `string` | n/a | yes |
| <a name="input_evaluation_delay"></a> [evaluation\_delay](#input\_evaluation\_delay) | Delay in seconds for the metric evaluation | `number` | `15` | no |
@@ -72,6 +73,7 @@ Creates DataDog monitors with the following checks:
| <a name="input_job_extra_tags"></a> [job\_extra\_tags](#input\_job\_extra\_tags) | Extra tags for Job monitor | `list(string)` | `[]` | no |
| <a name="input_job_message"></a> [job\_message](#input\_job\_message) | Custom message for Job monitor | `string` | `""` | no |
| <a name="input_job_threshold_warning"></a> [job\_threshold\_warning](#input\_job\_threshold\_warning) | Job monitor (warning threshold) | `string` | `3` | no |
+| <a name="input_jobfailed_group_by"></a> [jobfailed\_group\_by](#input\_jobfailed\_group\_by) | n/a | `list` | <pre>[<br> "kube_job",<br> "kube_cluster_name"<br>]</pre> | no |
| <a name="input_message"></a> [message](#input\_message) | Message sent when a monitor is triggered | `any` | n/a | yes |
| <a name="input_new_group_delay"></a> [new\_group\_delay](#input\_new\_group\_delay) | Delay in seconds before monitor new resource | `number` | `300` | no |
| <a name="input_new_host_delay"></a> [new\_host\_delay](#input\_new\_host\_delay) | Delay in seconds before monitor new resource | `number` | `300` | no |
8 changes: 8 additions & 0 deletions caas/kubernetes/workload/inputs.tf
@@ -223,3 +223,11 @@ variable "deployment_group_by" {
   default = ["kube_namespace", "kube_deployment", "kube_cluster_name"]
   description = "Select group by element on deployment monitors"
 }
+
+variable "jobfailed_group_by" {
+  default = ["kube_job", "kube_cluster_name"]
+}
+
+variable "cronjobfailed_group_by" {
+  default = ["kube_cronjob"]
+}
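Both new variables are declared without a `type` or `description`, which is presumably why the README table lists them as `list` with `n/a`. A possible follow-up, sketched here as a suggestion and not part of this commit, would declare both. The description wording below is hypothetical, modelled on the existing `deployment_group_by` description:

```hcl
# Sketch only: not part of this commit. Description text is hypothetical.
variable "jobfailed_group_by" {
  type        = list(string)
  default     = ["kube_job", "kube_cluster_name"]
  description = "Select group by element on job monitor"
}

variable "cronjobfailed_group_by" {
  type        = list(string)
  default     = ["kube_cronjob"]
  description = "Select group by element on cronjob monitor"
}
```

An explicit `list(string)` type would also reject non-string elements before they reach the `format("%q", ...)` call in `locals.tf`.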
6 changes: 4 additions & 2 deletions caas/kubernetes/workload/locals.tf
@@ -1,4 +1,6 @@
 locals {
-  replica_group_by    = join(", ", var.replica_group_by)
-  deployment_group_by = join(", ", var.deployment_group_by)
+  replica_group_by       = join(", ", var.replica_group_by)
+  deployment_group_by    = join(", ", var.deployment_group_by)
+  jobfailed_group_by     = join(", ", [for i in var.jobfailed_group_by : format("%q", i)])
+  cronjobfailed_group_by = join(", ", [for i in var.cronjobfailed_group_by : format("%q", i)])
 }
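The two new locals quote each element before joining, unlike the plain `join` used for the replica and deployment locals. A minimal standalone sketch of the difference follows; the local and output names are illustrative and do not exist in the module:

```hcl
# Standalone sketch; names here are illustrative, not taken from the module.
locals {
  example_group_by = ["kube_job", "kube_cluster_name"]

  # Plain join, as used for replica_group_by and deployment_group_by.
  unquoted = join(", ", local.example_group_by)

  # Quote each element first, as used for the job and cronjob locals.
  quoted = join(", ", [for i in local.example_group_by : format("%q", i)])
}

output "unquoted" {
  value = local.unquoted # kube_job, kube_cluster_name
}

output "quoted" {
  value = local.quoted # "kube_job", "kube_cluster_name"
}
```

With the default values, the quoted form reproduces the argument list that was previously hard-coded in the service check queries, `.by("kube_job", "kube_cluster_name")`, so the rendered query should be unchanged for callers who do not override the variables.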
4 changes: 2 additions & 2 deletions caas/kubernetes/workload/monitors-k8s-workload.tf
@@ -5,7 +5,7 @@ resource "datadog_monitor" "job" {
   type = "service check"
 
   query = <<EOQ
-"kubernetes_state.job.complete"${module.filter-tags.service_check}.by("kube_job", "kube_cluster_name").last(6).count_by_status()
+"kubernetes_state.job.complete"${module.filter-tags.service_check}.by(${local.jobfailed_group_by}).last(6).count_by_status()
 EOQ
 
   monitor_thresholds {
@@ -32,7 +32,7 @@ resource "datadog_monitor" "cronjob" {
   type = "service check"
 
   query = <<EOQ
-"kubernetes_state.cronjob.on_schedule_check"${module.filter-tags.service_check}.by("kube_cronjob").last(6).count_by_status()
+"kubernetes_state.cronjob.on_schedule_check"${module.filter-tags.service_check}.by(${local.cronjobfailed_group_by}).last(6).count_by_status()
 EOQ
 
   monitor_thresholds {
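With the grouping now driven by variables, a caller can change it without editing the monitor definitions. The following is a rough sketch of such a call. The module name, the `source` path, and the input values are placeholders, and the module may require inputs that are not visible in this diff:

```hcl
# Hypothetical caller. The source path and input values are placeholders;
# the module may require inputs that are not visible in this diff.
module "k8s_workload_monitors" {
  source = "./caas/kubernetes/workload"

  environment = "production"
  message     = "Kubernetes workload alert"

  # Group failed-job checks by namespace as well.
  jobfailed_group_by = ["kube_namespace", "kube_job", "kube_cluster_name"]

  # Distinguish cronjobs that share a name across clusters.
  cronjobfailed_group_by = ["kube_cronjob", "kube_cluster_name"]
}
```

Under these values the job query would render `.by("kube_namespace", "kube_job", "kube_cluster_name")`. Whether the underlying service checks actually carry each of those tags is an assumption to verify in Datadog before relying on the grouping.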
