Add example of deadmanssnitch setup. #237

Merged
merged 4 commits into from
Feb 25, 2019

@gswallow gswallow commented Feb 7, 2019

No description provided.

@openshift-ci-robot added the size/S (denotes a PR that changes 10-29 lines, ignoring generated files) and needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) labels on Feb 7, 2019
@openshift-ci-robot
Contributor

Hi @gswallow. Thanks for your PR.

I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Contributor

@gswallow: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Contributor

@brancz brancz left a comment

This is really awesome. Thank you!


Configure Dead Man's Snitch to page the operator if the Dead man's switch alert is silent for 15 minutes. With the default Alertmanager configuration, the Dead man's switch alert is repeated every five minutes. If Dead Man's Snitch triggers after 15 minutes, it indicates that the notification has been unsuccessful at least twice.
```
Contributor

This needs a yaml tag

@@ -19,6 +19,8 @@ route:
receivers:
- name: default
- name: deadmansswitch
  webhook_configs:
Contributor

This file is used for templating the section right before the DeadMansSnitch configuration section. Could you just create two versions of the file: one with, and one without, the DeadMansSnitch configuration? Then we can embed both of them in the documentation.

Author

What would you like to call it?

Contributor

Let's just call it the same, but with a -deadmanssnitch.yaml suffix.

  - url: "https://nosnch.in/XXXXXX"
```

Configure a Dead Man's Snitch integration with PagerDuty, along with an excalation on PagerDuty to page the operator if the Dead man's switch alert is silent for 15 minutes. With the default Alertmanager configuration, the Dead man's switch alert is repeated every five minutes. If Dead Man's Snitch triggers after 15 minutes, it indicates that the notification has been unsuccessful at least twice.
Contributor

Suggested change
- Configure a Dead Man's Snitch integration with PagerDuty, along with an excalation on PagerDuty to page the operator if the Dead man's switch alert is silent for 15 minutes. With the default Alertmanager configuration, the Dead man's switch alert is repeated every five minutes. If Dead Man's Snitch triggers after 15 minutes, it indicates that the notification has been unsuccessful at least twice.
+ Configure a Dead Man's Snitch integration with PagerDuty, along with an escalation on PagerDuty to page the operator if the Dead man's switch alert is silent for 15 minutes. With the default Alertmanager configuration, the Dead man's switch alert is repeated every five minutes. If Dead Man's Snitch triggers after 15 minutes, it indicates that the notification has been unsuccessful at least twice.
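
For readers following the thread, here is a minimal sketch of what an Alertmanager configuration with the Dead Man's Snitch receiver might look like, pieced together from the hunk under review. Only the `receivers` entries and the `webhook_configs` URL come from the diff. The `route` block, the `alertname: DeadMansSwitch` matcher, and both `repeat_interval` values are assumptions for illustration and may not match the merged file. `XXXXXX` remains a placeholder for the snitch token.

```yaml
# Sketch only: the route block and the alert name are assumed, not taken from the merged file.
route:
  receiver: default
  repeat_interval: 12h
  routes:
  # Assumed matcher: send only the always-firing dead man's switch alert to the snitch.
  - match:
      alertname: DeadMansSwitch
    receiver: deadmansswitch
    # Re-send every five minutes, so 15 minutes of silence means at least two missed notifications.
    repeat_interval: 5m
receivers:
- name: default
- name: deadmansswitch
  webhook_configs:
  # Placeholder token from the diff; substitute the snitch's own check-in URL.
  - url: "https://nosnch.in/XXXXXX"
```

With a layout like this, every other alert still goes to the `default` receiver, and only the dead man's switch alert is forwarded to the snitch URL.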

@openshift-ci-robot added the size/M (denotes a PR that changes 30-99 lines, ignoring generated files) label and removed the size/S (denotes a PR that changes 10-29 lines, ignoring generated files) label on Feb 9, 2019
@brancz
Contributor

brancz commented Feb 18, 2019

Thanks a lot for adding this!

/ok-to-test
/lgtm
/approve

@openshift-ci-robot added the lgtm (indicates that a PR is ready to be merged) and ok-to-test (indicates a non-member PR verified by an org member that is safe to test) labels and removed the needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) label on Feb 18, 2019
@s-urbaniak
Contributor

/lgtm

@s-urbaniak
Contributor

/refresh

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brancz, gswallow, s-urbaniak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot added the approved (indicates a PR has been approved by an approver from all required OWNERS files) label on Feb 25, 2019
@openshift-merge-robot openshift-merge-robot merged commit a447490 into openshift:master Feb 25, 2019
Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
6 participants