Fix panic in acceptance tests #3592

Merged

grobinson-grafana
Contributor

This commit attempts to address a panic that occurs in acceptance tests if a server in the cluster fails to start:

panic: runtime error: invalid memory address or nil pointer dereference
	panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0xa0 pc=0x101785858]

goroutine 27 [running]:
testing.tRunner.func1.2({0x101cc73e0, 0x1023c09c0})
	/usr/local/go/src/testing/testing.go:1545 +0x1c8
testing.tRunner.func1()
	/usr/local/go/src/testing/testing.go:1548 +0x360
panic({0x101cc73e0?, 0x1023c09c0?})
	/usr/local/go/src/runtime/panic.go:914 +0x218
github.com/prometheus/alertmanager/test/with_api_v2.(*Alertmanager).Terminate(0x14000a50300)
	/Users/grobinson/go/src/github.com/prometheus/alertmanager/test/with_api_v2/acceptance.go:386 +0x38
github.com/prometheus/alertmanager/test/with_api_v2.(*AcceptanceTest).Run.func1(0x14000a50300)
	/Users/grobinson/go/src/github.com/prometheus/alertmanager/test/with_api_v2/acceptance.go:172 +0x28
panic({0x101cc73e0?, 0x1023c09c0?})
	/usr/local/go/src/runtime/panic.go:920 +0x26c
github.com/prometheus/alertmanager/test/with_api_v2.(*Alertmanager).Terminate(0x14000a505a0)
	/Users/grobinson/go/src/github.com/prometheus/alertmanager/test/with_api_v2/acceptance.go:386 +0x38
github.com/prometheus/alertmanager/test/with_api_v2.(*AcceptanceTest).Run.func1(0x14000a505a0)
	/Users/grobinson/go/src/github.com/prometheus/alertmanager/test/with_api_v2/acceptance.go:172 +0x28
runtime.Goexit()
	/usr/local/go/src/runtime/panic.go:541 +0x18c
testing.(*common).FailNow(0x14000103ba0)
	/usr/local/go/src/testing/testing.go:999 +0x48
testing.(*common).Fatal(0x14000103ba0, {0x1400092bdb8?, 0x0?, 0x14000a30330?})
	/usr/local/go/src/testing/testing.go:1076 +0x58
github.com/prometheus/alertmanager/test/with_api_v2.(*AcceptanceTest).Run(0x14000792240)
	/Users/grobinson/go/src/github.com/prometheus/alertmanager/test/with_api_v2/acceptance.go:181 +0x18c
github.com/prometheus/alertmanager/test/with_api_v2/acceptance.TestClusterDeduplication(0x14000103ba0)
	/Users/grobinson/go/src/github.com/prometheus/alertmanager/test/with_api_v2/acceptance/cluster_test.go:56 +0x680
testing.tRunner(0x14000103ba0, 0x101db24f8)
	/usr/local/go/src/testing/testing.go:1595 +0xe8
created by testing.(*T).Run in goroutine 1
	/usr/local/go/src/testing/testing.go:1648 +0x33c
FAIL	github.com/prometheus/alertmanager/test/with_api_v2/acceptance	0.454s
FAIL

It's not a perfect fix, but I'd like to see whether it helps and by how much. If it works, we can look at improving it further.

 	}

 	err := t.amc.Start()
 	if err != nil {
-		t.T.Fatal(err)
+		t.T.Log(err)
grobinson-grafana (Contributor, Author)

The panic occurred because the test called Fatal, which calls runtime.Goexit. Goexit runs the goroutine's deferred functions before it exits, so the deferred Terminate call ran. At that point the server had failed to start, so am.cmd.Process was nil.
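
To illustrate the mechanism described in this comment, here is a minimal standalone sketch. It is not the acceptance test code: `cleanupRanAfterGoexit` and `processNilBeforeStart` are hypothetical helpers, and the deferred closure only stands in for the deferred `Terminate` call. It shows two facts from the standard library: deferred functions still run after `runtime.Goexit`, and an `exec.Cmd` that was never started has a nil `Process` field.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
)

// cleanupRanAfterGoexit starts a goroutine that calls runtime.Goexit and
// reports whether the goroutine's deferred cleanup still executed.
// Hypothetical helper, for illustration only.
func cleanupRanAfterGoexit() bool {
	ran := false
	done := make(chan struct{})
	go func() {
		// Deferred calls run in LIFO order: the cleanup first, then close(done).
		defer close(done)
		defer func() { ran = true }() // stands in for the deferred Terminate call
		runtime.Goexit()              // testing's FailNow (used by Fatal) stops the test this way
	}()
	<-done
	return ran
}

// processNilBeforeStart reports whether a Cmd that was never started has a
// nil Process field. The command name is arbitrary because it is never run.
func processNilBeforeStart() bool {
	cmd := exec.Command("true")
	return cmd.Process == nil
}

func main() {
	fmt.Println("deferred cleanup ran after Goexit:", cleanupRanAfterGoexit())
	fmt.Println("Process is nil before Start:", processNilBeforeStart())
}
```

Taken together, a deferred cleanup that dereferences `cmd.Process` without a nil check can panic whenever the test aborts before the process has started, which matches the stack trace above.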

@grobinson-grafana grobinson-grafana force-pushed the grobinson/fix-panic-acceptance-tests branch from 040187a to 34bb89f Compare November 2, 2023 20:41
This commit attempts to address a panic that occurs in acceptance
tests if a server in the cluster fails to start.

Signed-off-by: George Robinson <george.robinson@grafana.com>
@grobinson-grafana grobinson-grafana force-pushed the grobinson/fix-panic-acceptance-tests branch from 34bb89f to 2e7e1e4 Compare November 2, 2023 21:07
Signed-off-by: George Robinson <george.robinson@grafana.com>
@gotjosh gotjosh merged commit 4d6ddd2 into prometheus:main Feb 13, 2024
11 checks passed
gotjosh pushed a commit to grafana/mimir that referenced this pull request Feb 15, 2024
th0th pushed a commit to th0th/alertmanager that referenced this pull request Mar 23, 2024
* Fix panic in acceptance tests

This commit attempts to address a panic that occurs in acceptance
tests if a server in the cluster fails to start.

Signed-off-by: George Robinson <george.robinson@grafana.com>

* Remove started and check am.cmd.Process != nil

Signed-off-by: George Robinson <george.robinson@grafana.com>

---------

Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: Gokhan Sari <gokhan@sari.me>
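
The second commit in the message above mentions checking `am.cmd.Process != nil`. A minimal sketch of that kind of guard follows, under stated assumptions: the `server` type, `newUnstartedServer`, and `terminate` are hypothetical stand-ins rather than the harness's actual code, and `Kill` is used only to keep the example short.

```go
package main

import (
	"fmt"
	"os/exec"
)

// server is a hypothetical stand-in for the test harness's Alertmanager
// type; only the cmd field matters for this sketch.
type server struct {
	cmd *exec.Cmd
}

// newUnstartedServer returns a server whose command was built but never
// started, which is the state a deferred cleanup sees if startup fails.
func newUnstartedServer() *server {
	return &server{cmd: exec.Command("hypothetical-binary")}
}

// terminate stops the child process if one exists. It returns false when
// there was nothing to stop, so calling it from a defer is safe even when
// the process was never started.
func (s *server) terminate() (bool, error) {
	if s.cmd == nil || s.cmd.Process == nil {
		return false, nil
	}
	return true, s.cmd.Process.Kill()
}

func main() {
	stopped, err := newUnstartedServer().terminate()
	fmt.Println("stopped:", stopped, "err:", err)
}
```

Checking the field directly avoids keeping a separate flag in sync with the process state, which appears to be the motivation for dropping `started` in that commit.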
@grobinson-grafana grobinson-grafana deleted the grobinson/fix-panic-acceptance-tests branch April 16, 2024 14:45
SuperQ added a commit that referenced this pull request Oct 16, 2024
* [CHANGE] Deprecate and remove api/v1/ #2970
* [CHANGE] Remove unused feature flags #3676
* [CHANGE] Newlines in smtp password file are now ignored #3681
* [CHANGE] Change compat metrics to counters #3686
* [CHANGE] Do not register compat metrics in amtool #3713
* [CHANGE] Remove metrics from compat package #3714
* [CHANGE] Mark muted alerts #3793
* [FEATURE] Add metric for inhibit rules #3681
* [FEATURE] Support UTF-8 label matchers #3453, #3507, #3523, #3483, #3567, #3568, #3569, #3571, #3595, #3604, #3619, #3658, #3659, #3662, #3668, 3572
* [FEATURE] Add counter to track alerts dropped outside of time_intervals #3565
* [FEATURE] Add date and tz functions to templates #3812
* [FEATURE] Add limits for silences #3852
* [FEATURE] Add time helpers for templates #3863
* [FEATURE] Add auto GOMAXPROCS #3837
* [FEATURE] Add auto GOMEMLIMIT #3895
* [FEATURE] Add Jira receiver integration #3590
* [ENHANCEMENT] Add the receiver name to notification metrics #3045
* [ENHANCEMENT] Add the route ID to uuid #3372
* [ENHANCEMENT] Add duration to the notify success message #3559
* [ENHANCEMENT] Implement webhook_url_file for discord and msteams #3555
* [ENHANCEMENT] Add debug logs for muted alerts #3558
* [ENHANCEMENT] API: Allow the Silences API to use their own 400 response #3610
* [ENHANCEMENT] Add summary to msteams notification #3616
* [ENHANCEMENT] Add context reasons to notifications failed counter #3631
* [ENHANCEMENT] Add optional native histogram support to latency metrics #3737
* [ENHANCEMENT] Enable setting ThreadId for Telegram notifications #3638
* [ENHANCEMENT] Allow webex roomID from template #3801
* [BUGFIX] Add missing integrations to notify metrics #3480
* [BUGFIX] Add missing ttl in pushhover #3474
* [BUGFIX] Fix scheme required for webhook url in amtool #3409
* [BUGFIX] Remove duplicate integration from metrics #3516
* [BUGFIX] Reflect Discord's max length message limits #3597
* [BUGFIX] Fix nil error in warn logs about incompatible matchers #3683
* [BUGFIX] Fix a small number of inconsistencies in compat package logging #3718
* [BUGFIX] Fix log line in featurecontrol #3719
* [BUGFIX] Fix panic in acceptance tests #3592
* [BUGFIX] Fix flaky test TestClusterJoinAndReconnect/TestTLSConnection #3722
* [BUGFIX] Fix crash on errors when url_file is used #3800
* [BUGFIX] Fix race condition in dispatch.go #3826
* [BUGFIX] Fix race conditions in the memory alerts store #3648
* [BUGFIX] Hide config.SecretURL when the URL is incorrect. #3887
* [BUGFIX] Fix invalid silence causes incomplete updates #3898
* [BUGFIX] Fix leaking of Silences matcherCache entries #3930
* [BUGFIX] Close SMTP submission correctly to handle errors #4006

Signed-off-by: SuperQ <superq@gmail.com>
@SuperQ SuperQ mentioned this pull request Oct 16, 2024