[cmd/opampsupervisor] Flaky TestSupervisorConfiguresCapabilities #30355

Closed
djaglowski opened this issue Jan 9, 2024 · 6 comments

@djaglowski
Member

Component(s)

cmd/opampsupervisor

Describe the issue you're reporting

Possible flaky test observed here:

=== RUN   TestSupervisorConfiguresCapabilities
    e2e_test.go:44: Cannot send message to WebSocket: write tcp 127.0.0.1:41777->127.0.0.1:34682: write: broken pipe
    e2e_test.go:44: Cannot send message to WebSocket: write tcp 127.0.0.1:41777->127.0.0.1:34682: write: broken pipe
    ...
    e2e_test.go:44: Cannot send message to WebSocket: write tcp 127.0.0.1:41777->127.0.0.1:34682: write: broken pipe
    e2e_test.go:44: Cannot send message to WebSocket: write tcp 127.0.0.1:41777->127.0.0.1:34682: write: broken pipe
    e2e_test.go:40: Agent disconnected: websocket: close 1006 (abnormal closure): unexpected EOF
panic: test timed out after 10m0s
running tests:
	TestSupervisorConfiguresCapabilities (9m57s)

goroutine 151 [running]:
testing.(*M).startAlarm.func1()
	/opt/hostedtoolcache/go/1.20.12/x64/src/testing/testing.go:2241 +0x3c5
created by time.goFunc
	/opt/hostedtoolcache/go/1.20.12/x64/src/time/sleep.go:176 +0x32

goroutine 1 [chan receive, 9 minutes]:
testing.(*T).Run(0xc000138820, {0xa1c052?, 0x5260c5?}, 0xa49678)
	/opt/hostedtoolcache/go/1.20.12/x64/src/testing/testing.go:1630 +0x405
testing.runTests.func1(0xe421e0?)
	/opt/hostedtoolcache/go/1.20.12/x64/src/testing/testing.go:2036 +0x45
testing.tRunner(0xc000138820, 0xc000155c88)
	/opt/hostedtoolcache/go/1.20.12/x64/src/testing/testing.go:1576 +0x10b
testing.runTests(0xc000115400?, {0xe37e40, 0x3, 0x3}, {0x0?, 0x100c000111448?, 0xe41560?})
	/opt/hostedtoolcache/go/1.20.12/x64/src/testing/testing.go:2034 +0x489
testing.(*M).Run(0xc000115400)
	/opt/hostedtoolcache/go/1.20.12/x64/src/testing/testing.go:1906 +0x63a
main.main()
	_testmain.go:51 +0x1aa

goroutine 108 [select, 9 minutes]:
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).Stop(0xc000342420, {0xad9848, 0xc000032110})
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/client/internal/clientcommon.go:160 +0xda
github.com/open-telemetry/opamp-go/client.(*wsClient).Stop(0xc000342420, {0xad9848, 0xc000032110})
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/client/wsclient.go:92 +0xc8
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).Shutdown(0xc00017c620)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:690 +0x374
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor.TestSupervisorConfiguresCapabilities(0xc0000f17a0?)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/e2e_test.go:324 +0x244
testing.tRunner(0xc00008aea0, 0xa49678)
	/opt/hostedtoolcache/go/1.20.12/x64/src/testing/testing.go:1576 +0x10b
created by testing.(*T).Run
	/opt/hostedtoolcache/go/1.20.12/x64/src/testing/testing.go:1629 +0x3ea

goroutine 109 [IO wait, 9 minutes]:
internal/poll.runtime_pollWait(0x7f6f90cb0e00, 0x72)
	/opt/hostedtoolcache/go/1.20.12/x64/src/runtime/netpoll.go:306 +0x89
internal/poll.(*pollDesc).wait(0xc00025da80?, 0x4?, 0x0)
	/opt/hostedtoolcache/go/1.20.12/x64/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	/opt/hostedtoolcache/go/1.20.12/x64/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc00025da80)
	/opt/hostedtoolcache/go/1.20.12/x64/src/internal/poll/fd_unix.go:614 +0x2bd
net.(*netFD).accept(0xc00025da80)
	/opt/hostedtoolcache/go/1.20.12/x64/src/net/fd_unix.go:172 +0x35
net.(*TCPListener).accept(0xc000012b10)
	/opt/hostedtoolcache/go/1.20.12/x64/src/net/tcpsock_posix.go:148 +0x25
net.(*TCPListener).Accept(0xc000012b10)
	/opt/hostedtoolcache/go/1.20.12/x64/src/net/tcpsock.go:297 +0x3d
net/http.(*Server).Serve(0xc000194000, {0xad91e0, 0xc000012b10})
	/opt/hostedtoolcache/go/1.20.12/x64/src/net/http/server.go:3059 +0x385
net/http/httptest.(*Server).goServe.func1()
	/opt/hostedtoolcache/go/1.20.12/x64/src/net/http/httptest/server.go:310 +0x6a
created by net/http/httptest.(*Server).goServe
	/opt/hostedtoolcache/go/1.20.12/x64/src/net/http/httptest/server.go:308 +0x6a

goroutine 11 [runnable]:
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).runAgentProcess(0xc00017c1c0)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:604 +0x113
created by github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.NewSupervisor
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:143 +0x825

goroutine 51 [select]:
github.com/cenkalti/backoff/v4.(*Ticker).run(0xc00032c600)
	/home/runner/go/pkg/mod/github.com/cenkalti/backoff/v4@v4.2.1/ticker.go:70 +0x15e
created by github.com/cenkalti/backoff/v4.NewTickerWithTimer
	/home/runner/go/pkg/mod/github.com/cenkalti/backoff/v4@v4.2.1/ticker.go:49 +0x1ca

goroutine 74 [select]:
github.com/cenkalti/backoff/v4.(*Ticker).run(0xc0000c0240)
	/home/runner/go/pkg/mod/github.com/cenkalti/backoff/v4@v4.2.1/ticker.go:70 +0x15e
created by github.com/cenkalti/backoff/v4.NewTickerWithTimer
	/home/runner/go/pkg/mod/github.com/cenkalti/backoff/v4@v4.2.1/ticker.go:49 +0x1ca

goroutine 111 [IO wait, 9 minutes]:
internal/poll.runtime_pollWait(0x7f6f90cb0ef0, 0x72)
	/opt/hostedtoolcache/go/1.20.12/x64/src/runtime/netpoll.go:306 +0x89
internal/poll.(*pollDesc).wait(0xc0005d6600?, 0xc0000aa000?, 0x0)
	/opt/hostedtoolcache/go/1.20.12/x64/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	/opt/hostedtoolcache/go/1.20.12/x64/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc0005d6600, {0xc0000aa000, 0x1000, 0x1000})
	/opt/hostedtoolcache/go/1.20.12/x64/src/internal/poll/fd_unix.go:167 +0x299
net.(*netFD).Read(0xc0005d6600, {0xc0000aa000?, 0x0?, 0xc00009e958?})
	/opt/hostedtoolcache/go/1.20.12/x64/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc00009acc0, {0xc0000aa000?, 0x4078fd?, 0xc00009e900?})
	/opt/hostedtoolcache/go/1.20.12/x64/src/net/net.go:183 +0x45
bufio.(*Reader).fill(0xc000288ea0)
	/opt/hostedtoolcache/go/1.20.12/x64/src/bufio/bufio.go:106 +0xff
bufio.(*Reader).Peek(0xc000288ea0, 0x2)
	/opt/hostedtoolcache/go/1.20.12/x64/src/bufio/bufio.go:144 +0x5d
github.com/gorilla/websocket.(*Conn).read(0xc000342000, 0x8bd3a0?)
	/home/runner/go/pkg/mod/github.com/gorilla/websocket@v1.5.0/conn.go:371 +0x2c
github.com/gorilla/websocket.(*Conn).advanceFrame(0xc000342000)
	/home/runner/go/pkg/mod/github.com/gorilla/websocket@v1.5.0/conn.go:809 +0x7b
github.com/gorilla/websocket.(*Conn).NextReader(0xc000342000)
	/home/runner/go/pkg/mod/github.com/gorilla/websocket@v1.5.0/conn.go:1009 +0xcc
github.com/gorilla/websocket.(*Conn).ReadMessage(0xe421e0?)
	/home/runner/go/pkg/mod/github.com/gorilla/websocket@v1.5.0/conn.go:1093 +0x19
github.com/open-telemetry/opamp-go/client/internal.(*wsReceiver).receiveMessage(0xad9810?, 0xc0005e2c80?)
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/client/internal/wsreceiver.go:65 +0x25
github.com/open-telemetry/opamp-go/client/internal.(*wsReceiver).ReceiverLoop(0xc0000c9ec0, {0xad9810, 0xc0005e2c80})
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/client/internal/wsreceiver.go:51 +0x86
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle(0xc000342420, {0xad9810, 0xc0005e2c80})
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/client/wsclient.go:244 +0x370
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped(0xc000342420, {0xad9810, 0xc0005e2c80})
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/client/wsclient.go:266 +0x39
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1()
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/client/internal/clientcommon.go:199 +0x67
created by github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/client/internal/clientcommon.go:192 +0x165

goroutine 49 [select]:
github.com/cenkalti/backoff/v4.(*Ticker).run(0xc000218a80)
	/home/runner/go/pkg/mod/github.com/cenkalti/backoff/v4@v4.2.1/ticker.go:70 +0x15e
created by github.com/cenkalti/backoff/v4.NewTickerWithTimer
	/home/runner/go/pkg/mod/github.com/cenkalti/backoff/v4@v4.2.1/ticker.go:49 +0x1ca

goroutine 65 [runnable]:
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).runAgentProcess(0xc0003a69a0)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:604 +0x113
created by github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.NewSupervisor
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:143 +0x825

goroutine 30 [chan send, 9 minutes]:
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor.newOpAMPServer.func2({0xad9570, 0xc00029ff10})
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/e2e_test.go:85 +0x45
github.com/open-telemetry/opamp-go/server.ConnectionCallbacksStruct.OnConnectionClose(...)
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/server/callbacks.go:50
github.com/open-telemetry/opamp-go/server.(*server).handleWSConnection.func1()
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/server/serverimpl.go:195 +0xac
github.com/open-telemetry/opamp-go/server.(*server).handleWSConnection(0xc00041a870, 0xc000156c60, {0xad9510, 0xc0000dc2e8})
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/server/serverimpl.go:240 +0x507
created by github.com/open-telemetry/opamp-go/server.(*server).httpHandler
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/server/serverimpl.go:179 +0x2b7

goroutine 113 [runnable]:
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).runAgentProcess(0xc00017c620)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:604 +0x113
created by github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.NewSupervisor
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:143 +0x825

goroutine 149 [chan send, 9 minutes]:
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor.newOpAMPServer.func1({0xad9570?, 0xc00029fff0})
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/cmd/opampsupervisor/e2e_test.go:77 +0x85
github.com/open-telemetry/opamp-go/server.ConnectionCallbacksStruct.OnConnected(...)
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/server/callbacks.go:33
github.com/open-telemetry/opamp-go/server.(*server).handleWSConnection(0xc00041a870, 0xc0003422c0, {0xad9510, 0xc0000dc060})
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/server/serverimpl.go:200 +0x168
created by github.com/open-telemetry/opamp-go/server.(*server).httpHandler
	/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.10.0/server/serverimpl.go:179 +0x2b7
exit status 2
FAIL	github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor	600.012s
@djaglowski added the "needs triage" label on Jan 9, 2024
Contributor

github-actions bot commented Jan 9, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1 added the "flaky test" label on Jan 9, 2024
@evan-bradley
Contributor

I can't reproduce this, but I'm aware of it. I'll try to take a look soon.

@evan-bradley removed the "needs triage" label on Feb 7, 2024
Contributor

github-actions bot commented Apr 8, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@evan-bradley
Contributor

@djaglowski Have you seen this again recently? I was never able to determine what caused this, and I can't find any instances of it in the repo's workflow runs. I know we've made a lot of changes to CI, but I'm not sure whether any of them affected it.

@djaglowski
Member Author

I haven't seen it again so let's close it and reopen if we see more occurrences.
