docs: add comprehensive troubleshooting section to README #4711

ABHISHEK-DBZ · 2025-11-07T16:17:41Z

- Add troubleshooting section with common issues and solutions
- Include cluster connectivity problems and DNS resolution timeouts
- Add guidance for alerts/notifications not working
- Include memory usage and configuration reload issues
- Provide practical examples and commands for debugging

This helps users quickly resolve common operational issues without needing to search through multiple documentation sources.

- Add troubleshooting section with common issues and solutions
- Include cluster connectivity problems and DNS resolution timeouts
- Add guidance for alerts/notifications not working
- Include memory usage and configuration reload issues
- Provide practical examples and commands for debugging

This helps users quickly resolve common operational issues without needing to search through multiple documentation sources.

Signed-off-by: abhishek-dbz <abhibro936@gmail.com>

In 92ecf8b, silence_bench_test.go was left behind since it's not run automatically, and it started failing. Fix by passing a new registry when creating Silences.

Signed-off-by: Guido Trotter <guido@hudson-trading.com>
Co-authored-by: Guido Trotter <guido@hudson-trading.com>
Signed-off-by: abhishek-dbz <abhibro936@gmail.com>

ultrotter

Thanks, that's useful! It might also be worth adding information about which of Alertmanager's own metrics to put on a dashboard or alert on.

ultrotter · 2025-11-07T18:25:00Z

README.md

+**Solutions:**
+- Check for alert storms - large number of unique alert groups
+- Review `group_by` labels in routing configuration
+- Consider using more specific grouping to reduce alert group count


Would this read better as "broader"? It sounds like more specific grouping would give you more groups, not fewer.
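To illustrate the point, here is a minimal sketch of how the number of labels in `group_by` affects the group count. It is not Alertmanager's implementation: `count_groups` and the sample alerts are hypothetical, and the only assumption is that one group exists per distinct combination of the `group_by` label values.

```python
def count_groups(alerts, group_by):
    """Count distinct groups, assuming one group per unique
    combination of the group_by label values (hypothetical helper)."""
    keys = {tuple(alert.get(label, "") for label in group_by) for alert in alerts}
    return len(keys)


# Made-up alerts for illustration only.
alerts = [
    {"alertname": "HighCPU", "cluster": "eu", "instance": "a"},
    {"alertname": "HighCPU", "cluster": "eu", "instance": "b"},
    {"alertname": "HighCPU", "cluster": "us", "instance": "c"},
    {"alertname": "DiskFull", "cluster": "us", "instance": "c"},
]

# Broader grouping: fewer labels, fewer groups.
broad = count_groups(alerts, ["alertname"])

# More specific grouping: more labels, more groups.
specific = count_groups(alerts, ["alertname", "cluster", "instance"])
```

With these sample alerts, grouping on `alertname` alone yields fewer groups than grouping on all three labels, which is why "broader" seems to be the wording that matches the intent of reducing the group count.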

ultrotter · 2025-11-07T18:25:43Z

README.md

+
+**Solutions:**
+- Check for alert storms - large number of unique alert groups
+- Review `group_by` labels in routing configuration


We could remove this line, which doesn't say how to review the labels, and merge it with the one below.

siavashs · 2025-11-11T14:24:13Z

I'd suggest moving this to the docs rather than the README.
Then it can be part of https://prometheus.io/docs/guides/

TheMeier · 2025-11-11T17:06:38Z

README.md

+
+#### Cluster peers not connecting
+
+**Symptoms:** Alertmanager instances cannot discover each other in cluster mode.


Maybe it makes sense to add a sentence on how to detect that this is the case, e.g. from the logs or from the peer list on the status page.
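As a rough sketch of the peer-list check, the snippet below compares the peers reported in a status payload against the addresses an operator expects. It assumes the payload has the shape returned by Alertmanager's `/api/v2/status` endpoint (a `cluster` object with a `peers` list of `name`/`address` entries); `missing_peers`, the sample payload, and the addresses are hypothetical and only illustrate the idea.

```python
def missing_peers(status, expected_addresses):
    """Return expected peer addresses absent from the status payload.

    Assumes `status` is the parsed JSON of a status response with a
    `cluster.peers` list; this helper is illustrative, not part of Alertmanager.
    """
    cluster = status.get("cluster", {})
    seen = {peer.get("address") for peer in cluster.get("peers", [])}
    return sorted(set(expected_addresses) - seen)


# Made-up payload: only one of the two expected peers is listed.
sample_status = {
    "cluster": {
        "status": "ready",
        "peers": [
            {"name": "01A", "address": "10.0.0.1:9094"},
        ],
    }
}

expected = ["10.0.0.1:9094", "10.0.0.2:9094"]
missing = missing_peers(sample_status, expected)
```

A non-empty result would suggest that some instances have not joined the cluster, which is the symptom this section describes.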

ABHISHEK-DBZ and others added 2 commits November 7, 2025 22:00

ABHISHEK-DBZ force-pushed the docs/add-troubleshooting-section branch from 5f4d4ab to 4fbc391 November 7, 2025 16:31

Merge branch 'main' into docs/add-troubleshooting-section

e077669

ultrotter approved these changes Nov 7, 2025


siavashs added the kind/documentation label Nov 11, 2025

TheMeier reviewed Nov 11, 2025


Merge branch 'main' into docs/add-troubleshooting-section

c1e1392


4 participants

