Implement periodic writing of alertmanager state to storage. #4031

Merged: 4 commits, Apr 9, 2021

stevesg
@stevesg stevesg commented Mar 30, 2021

What this PR does:
When ring-based/sharding replication is enabled, the alertmanager state
(silences, notification log) is periodically written to object storage
so that it can be used to recover from an all-replica outage. Only one
of the replicas is responsible for writing the state (position 0).

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pracucci pracucci left a comment

The logic makes sense to me! I left a couple of nits. Waiting for the final PR to do a deeper review but I haven't seen any issue so far 👏

stevesg added 2 commits April 7, 2021 09:16
Signed-off-by: Steve Simpson <steve.simpson@grafana.com>
@stevesg stevesg marked this pull request as ready for review April 7, 2021 07:34
@pracucci pracucci left a comment

Good job! LGTM (modulo a couple of nits)

@ranton256 ranton256 left a comment

Everything looks good to me, apart from the minor point about the wording in the documentation.

Thanks!

@pstibrany pstibrany left a comment

LGTM. Nice small readable PR!

stevesg added 2 commits April 9, 2021 11:36
Signed-off-by: Steve Simpson <steve.simpson@grafana.com>
@pracucci pracucci left a comment

Thanks for addressing my feedback! 🙏

@pracucci pracucci merged commit f107e5d into cortexproject:master Apr 9, 2021
5 participants