-
-
Notifications
You must be signed in to change notification settings - Fork 8.6k
Description
What happened?
Everything was working fine with Selenium Grid 3, but when we switched over to Selenium Grid 4 we started experiencing steps in our test cases sometimes taking ~18 minutes longer to complete. Looking at the logs, it seems that a request gets stuck in a retry loop. We have observed it for several different types of requests (e.g. /execute/sync, /frame/, /se/log).
We are running Selenium Grid with Docker/Podman using the following compose file:
hub:
image: selenium/hub:4.2.0-20220527
container_name: hub
ports:
- "4442:4442"
- "4443:4443"
- "4444:4444"
restart: always
environment:
- SE_SESSION_REQUEST_TIMEOUT=2400
node:
image: selenium/node-chrome:4.2.0-20220527
shm_size: 2gb
container_name: node
depends_on:
- hub
restart: always
environment:
- SE_EVENT_BUS_HOST=hub
- SE_EVENT_BUS_PUBLISH_PORT=4442
- SE_EVENT_BUS_SUBSCRIBE_PORT=4443
- SE_NODE_SESSION_TIMEOUT=300
- SE_NODE_OVERRIDE_MAX_SESSIONS=true
- SE_NODE_MAX_SESSIONS=4
- SCREEN_WIDTH=1920
- SCREEN_HEIGHT=1080
- SCREEN_DEPTH=24
volumes:
- /dev/shm:/dev/shm
ports:
- "9002:5900"
links:
- hub
I'll try to explain the scenario we have observed in the logs, starting from the request before the one that gets stuck.
1. Hub: Sends previous request.
13:36:23.034 DEBUG [NettyRequestSender.sendRequestWithOpenChannel] - Using open Channel [id: 0x7d412036, L:/10.88.0.31:51890 - R:/10.88.0.31:5555] for GET '/session/20135a43e0062e63b125ccdf0a75ab23/element/5e717040-395e-4053-b7b4-07910d640306/enabled'
2. Node: Previous request gets a channel.
13:36:23.036 DEBUG [NettyRequestSender.pollPooledChannel] - Using pooled Channel '[id: 0x34b2bf61, L:/127.0.0.1:39380 - R:localhost/127.0.0.1:60380]' for 'GET' to 'http://localhost:60380/session/20135a43e0062e63b125ccdf0a75ab23/element/5e717040-395e-4053-b7b4-07910d640306/enabled'
13:36:23.036 DEBUG [NettyRequestSender.sendRequestWithOpenChannel] - Using open Channel [id: 0x34b2bf61, L:/127.0.0.1:39380 - R:localhost/127.0.0.1:60380] for GET '/session/20135a43e0062e63b125ccdf0a75ab23/element/5e717040-395e-4053-b7b4-07910d640306/enabled'
3. Node: Previous request is sent and receives response.
13:36:23.043 DEBUG [HttpHandler.handleHttpResponse] -
Request DefaultFullHttpRequest(decodeResult: success, version: HTTP/1.1, content: EmptyByteBufBE)
GET /session/20135a43e0062e63b125ccdf0a75ab23/element/5e717040-395e-4053-b7b4-07910d640306/enabled HTTP/1.1
Accept: application/json, image/png
User-Agent: selenium/4.0.0 (.net linux)
Connection: keep-alive
Cache-Control: no-cache
traceparent: 00-f1d0e00c8e477eda9572995e0b7da776-629612b9dedff6a5-01
host: localhost:60380
Content-Type: application/json; charset=utf-8Response DefaultHttpResponse(decodeResult: success, version: HTTP/1.1) HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 cache-control: no-cache content-length: 14
4. Node: Channel for previous request is offered back to pool.
13:36:23.044 DEBUG [ChannelManager.tryToOfferChannelToPool] - Adding key: http://localhost:60380 for channel [id: 0x34b2bf61, L:/127.0.0.1:39380 - R:localhost/127.0.0.1:60380]
5. Hub: Sends new request.
13:36:23.055 DEBUG [NettyRequestSender.sendRequestWithOpenChannel] - Using open Channel [id: 0x7d412036, L:/10.88.0.31:51890 - R:/10.88.0.31:5555] for POST '/session/20135a43e0062e63b125ccdf0a75ab23/execute/sync'
6. Node: New request is processed, but it freezes after "injecting" log (where is normally would say "Using __ channel").
13:36:23.059 DEBUG [HttpTracing.inject] - Injecting (POST) /session/20135a43e0062e63b125ccdf0a75ab23/execute/sync into OpenTelemetrySpan{traceId=a780c6e2ef7b2b647a0c64e1a3a0a9da,spanId=31fc6d6d7e2fbf62} at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler:84
7. Node: About a minuter later the idleChannelDetector finds that the channel used by the previous request is idle, and closes it.
13:37:23.557 DEBUG [DefaultChannelPool$IdleChannelDetector.expiredChannels] - Adding Candidate expired Channel [id: 0x34b2bf61, L:/127.0.0.1:39380 - R:localhost/127.0.0.1:60380] isIdleTimeoutExpired=true isRemotelyClosed=false isTtlExpired=false
13:37:23.557 DEBUG [DefaultChannelPool$IdleChannelDetector.closeChannels] - Closing Idle Channel [id: 0x34b2bf61, L:/127.0.0.1:39380 - R:localhost/127.0.0.1:60380]
8. Hub: About two minuter after that, the hub gets server error and retries (180 s default timeout).
13:39:23.076 DEBUG [RetryRequest.lambda$static$7] - Failure due to server error #1. Retrying.
13:39:23.082 DEBUG [NettyConnectListener.writeRequest] - Using new Channel '[id: 0xc39d1690, L:/10.88.0.31:52094 - R:/10.88.0.31:5555]' for 'POST' to '/session/20135a43e0062e63b125ccdf0a75ab23/execute/sync'
9. Node: the frozen request starts running and gets a channel, then gets stuck again.
13:39:23.087 DEBUG [NettyConnectListener.writeRequest] - Using new Channel '[id: 0x3d99f31f, L:/127.0.0.1:39510 - R:localhost/127.0.0.1:60380]' for 'POST' to '/session/20135a43e0062e63b125ccdf0a75ab23/execute/sync'
10. Node: The new retry request gets frozen after the "injecting" log.
13:39:23.089 DEBUG [HttpTracing.inject] - Injecting (POST) /session/20135a43e0062e63b125ccdf0a75ab23/execute/sync into OpenTelemetrySpan{traceId=a780c6e2ef7b2b647a0c64e1a3a0a9da,spanId=656ae3b1f9818e50} at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler:84
11. Five minutes after the call in step (9) the node gets a timeout exception for the first frozen request.
Build info: version: '4.2.0', revision: '86eb611648'
System info: host: '7cc6f0360350', ip: 'Unknown', os.name: 'Linux', os.arch: 'amd64', os.version: '4.18.0-372.9.1.el8.x86_64', java.version: '11.0.15'
Driver info: driver.version: unknown
org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
Build info: version: '4.2.0', revision: '86eb611648'
System info: host: '7cc6f0360350', ip: 'Unknown', os.name: 'Linux', os.arch: 'amd64', os.version: '4.18.0-372.9.1.el8.x86_64', java.version: '11.0.15'
Driver info: driver.version: unknown
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)
at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:49)
at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
at org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:98)
at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)
at org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:92)
at org.openqa.selenium.grid.node.ProtocolConvertingSession.execute(ProtocolConvertingSession.java:75)
at org.openqa.selenium.grid.node.local.SessionSlot.execute(SessionSlot.java:125)
at org.openqa.selenium.grid.node.local.LocalNode.executeWebDriverCommand(LocalNode.java:447)
at org.openqa.selenium.grid.node.ForwardWebDriverCommand.execute(ForwardWebDriverCommand.java:35)
at org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)
at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute(SpanWrappedHttpHandler.java:86)
at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
at org.openqa.selenium.grid.node.Node.execute(Node.java:240)
at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
at org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)
at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
at org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.TimeoutException
at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
at org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:59)
... 32 more
13:44:23.082 DEBUG [SeleniumSpanExporter$1.lambda$export$4] - SpanData{spanContext=ImmutableSpanContext{traceId=a780c6e2ef7b2b647a0c64e1a3a0a9da, spanId=31fc6d6d7e2fbf62, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, parentSpanContext=ImmutableSpanContext{traceId=a780c6e2ef7b2b647a0c64e1a3a0a9da, spanId=6ee9490e9202f7ef, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=true, valid=true}, resource=Resource{schemaUrl=https://opentelemetry.io/schemas/1.9.0, attributes={service.name="unknown_service:java", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.14.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=default, version=null, schemaUrl=null}, name=node.forward_command, kind=INTERNAL, startEpochNanos=1655904983059769000, endEpochNanos=1655905463082764947, attributes=AttributesMap{data={http.method=POST, random.key=e65233dd-8b7d-4030-89f9-c26c7e17877e, span.kind=server, http.target=/session/20135a43e0062e63b125ccdf0a75ab23/execute/sync, error=true}, capacity=128, totalAddedValues=5}, totalAttributeCount=5, events=[ImmutableEventData{name=exception, attributes={exception.message="Unable to execute request: java.util.concurrent.TimeoutException
Build info: version: '4.2.0', revision: '86eb611648'
System info: host: '7cc6f0360350', ip: 'Unknown', os.name: 'Linux', os.arch: 'amd64', os.version: '4.18.0-372.9.1.el8.x86_64', java.version: '11.0.15'
Driver info: driver.version: unknown", exception.stacktrace="org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
Build info: version: '4.2.0', revision: '86eb611648'
System info: host: '7cc6f0360350', ip: 'Unknown', os.name: 'Linux', os.arch: 'amd64', os.version: '4.18.0-372.9.1.el8.x86_64', java.version: '11.0.15'
Driver info: driver.version: unknown
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)
at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:49)
at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
at org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:98)
at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)
at org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:92)
at org.openqa.selenium.grid.node.ProtocolConvertingSession.execute(ProtocolConvertingSession.java:75)
at org.openqa.selenium.grid.node.local.SessionSlot.execute(SessionSlot.java:125)
at org.openqa.selenium.grid.node.local.LocalNode.executeWebDriverCommand(LocalNode.java:447)
at org.openqa.selenium.grid.node.ForwardWebDriverCommand.execute(ForwardWebDriverCommand.java:35)
at org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)
at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute(SpanWrappedHttpHandler.java:86)
at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
at org.openqa.selenium.grid.node.Node.execute(Node.java:240)
at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
at org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)
at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
at org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.TimeoutException
at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
at org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:59)
... 32 more
", exception.type="org.openqa.selenium.TimeoutException", http.flavor=1, http.handler_class="org.openqa.selenium.remote.http.Route$PredicatedRoute", http.host="hub:4444", http.method="POST", http.request_content_length="99", http.scheme="HTTP", http.target="/session/20135a43e0062e63b125ccdf0a75ab23/execute/sync", http.user_agent="selenium/4.0.0 (.net linux)"}, epochNanos=1655905463082460855, totalAttributeCount=11}], totalRecordedEvents=1, links=[], totalRecordedLinks=0, status=ImmutableStatusData{statusCode=ERROR, description=Kind: UNKNOWN Description:}, hasEnded=true}
13:44:23.082 WARN [SeleniumSpanExporter$1.lambda$export$1] - Unable to execute request: java.util.concurrent.TimeoutException
Build info: version: '4.2.0', revision: '86eb611648'
System info: host: '7cc6f0360350', ip: 'Unknown', os.name: 'Linux', os.arch: 'amd64', os.version: '4.18.0-372.9.1.el8.x86_64', java.version: '11.0.15'
Driver info: driver.version: unknown
13:44:23.082 WARN [SeleniumSpanExporter$1.lambda$export$1] - org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
Build info: version: '4.2.0', revision: '86eb611648'
System info: host: '7cc6f0360350', ip: 'Unknown', os.name: 'Linux', os.arch: 'amd64', os.version: '4.18.0-372.9.1.el8.x86_64', java.version: '11.0.15'
Driver info: driver.version: unknown
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)
at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:49)
at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
at org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:98)
at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)
at org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:92)
at org.openqa.selenium.grid.node.ProtocolConvertingSession.execute(ProtocolConvertingSession.java:75)
at org.openqa.selenium.grid.node.local.SessionSlot.execute(SessionSlot.java:125)
at org.openqa.selenium.grid.node.local.LocalNode.executeWebDriverCommand(LocalNode.java:447)
at org.openqa.selenium.grid.node.ForwardWebDriverCommand.execute(ForwardWebDriverCommand.java:35)
at org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)
at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute(SpanWrappedHttpHandler.java:86)
at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
at org.openqa.selenium.grid.node.Node.execute(Node.java:240)
at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
at org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)
at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
at org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.TimeoutException
at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
at org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:59)
... 32 more
13:44:23.083 WARN [SeleniumSpanExporter$1.lambda$export$3] - {"traceId": "a780c6e2ef7b2b647a0c64e1a3a0a9da","eventTime": 1655905463082460855,"eventName": "exception","attributes": {"exception.message": "Unable to execute request: java.util.concurrent.TimeoutException\nBuild info: version: '4.2.0', revision: '86eb611648'\nSystem info: host: '7cc6f0360350', ip: 'Unknown', os.name: 'Linux', os.arch: 'amd64', os.version: '4.18.0-372.9.1.el8.x86_64', java.version: '11.0.15'\nDriver info: driver.version: unknown","exception.stacktrace": "org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException\nBuild info: version: '4.2.0', revision: '86eb611648'\nSystem info: host: '7cc6f0360350', ip: 'Unknown', os.name: 'Linux', os.arch: 'amd64', os.version: '4.18.0-372.9.1.el8.x86_64', java.version: '11.0.15'\nDriver info: driver.version: unknown\n\tat org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)\n\tat org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)\n\tat org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)\n\tat org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:49)\n\tat org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)\n\tat org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)\n\tat org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:98)\n\tat org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)\n\tat org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:92)\n\tat org.openqa.selenium.grid.node.ProtocolConvertingSession.execute(ProtocolConvertingSession.java:75)\n\tat org.openqa.selenium.grid.node.local.SessionSlot.execute(SessionSlot.java:125)\n\tat org.openqa.selenium.grid.node.local.LocalNode.executeWebDriverCommand(LocalNode.java:447)\n\tat org.openqa.selenium.grid.node.ForwardWebDriverCommand.execute(ForwardWebDriverCommand.java:35)\n\tat org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute(SpanWrappedHttpHandler.java:86)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.grid.node.Node.execute(Node.java:240)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)\n\tat org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)\n\tat java.base\u002fjava.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)\n\tat java.base\u002fjava.util.concurrent.FutureTask.run(FutureTask.java:264)\n\tat java.base\u002fjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base\u002fjava.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base\u002fjava.lang.Thread.run(Thread.java:829)\nCaused by: java.util.concurrent.TimeoutException\n\tat java.base\u002fjava.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)\n\tat java.base\u002fjava.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)\n\tat org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)\n\tat org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:59)\n\t... 32 more\n","exception.type": "org.openqa.selenium.TimeoutException","http.flavor": 1,"http.handler_class": "org.openqa.selenium.remote.http.Route$PredicatedRoute","http.host": "hub:4444","http.method": "POST","http.request_content_length": "99","http.scheme": "HTTP","http.target": "\u002fsession\u002f20135a43e0062e63b125ccdf0a75ab23\u002fexecute\u002fsync","http.user_agent": "selenium\u002f4.0.0 (.net linux)"}}
13:44:23.157 DEBUG [TimeoutTimerTask.expire] - Request timeout to localhost/127.0.0.1:60380 after 300000 ms for NettyResponseFuture{currentRetry=0,
isDone=0,
isCancelled=0,
asyncHandler=org.asynchttpclient.AsyncCompletionHandlerBase@144965c7,
nettyRequest=org.asynchttpclient.netty.request.NettyRequest@4595be47,
future=java.util.concurrent.CompletableFuture@43b022a9[Not completed],
uri=http://localhost:60380/session/20135a43e0062e63b125ccdf0a75ab23/execute/sync,
keepAlive=true,
redirectCount=0,
timeoutsHolder=org.asynchttpclient.netty.timeout.TimeoutsHolder@4f3f3e4a,
inAuth=0,
touch=1655905163087} after 300079 ms
13:44:23.157 DEBUG [ChannelManager.closeChannel] - Closing Channel [id: 0x3d99f31f, L:/127.0.0.1:39510 - R:localhost/127.0.0.1:60380]
13:44:23.158 DEBUG [NettyRequestSender.abort] - Aborting Future NettyResponseFuture{currentRetry=0,
isDone=0,
isCancelled=0,
asyncHandler=org.asynchttpclient.AsyncCompletionHandlerBase@144965c7,
nettyRequest=org.asynchttpclient.netty.request.NettyRequest@4595be47,
future=java.util.concurrent.CompletableFuture@43b022a9[Not completed],
uri=http://localhost:60380/session/20135a43e0062e63b125ccdf0a75ab23/execute/sync,
keepAlive=true,
redirectCount=0,
timeoutsHolder=org.asynchttpclient.netty.timeout.TimeoutsHolder@4f3f3e4a,
inAuth=0,
touch=1655905163087}
13:44:23.158 DEBUG [NettyRequestSender.abort] - Request timeout to localhost/127.0.0.1:60380 after 300000 ms
java.util.concurrent.TimeoutException: Request timeout to localhost/127.0.0.1:60380 after 300000 ms
at org.asynchttpclient.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:43)
at org.asynchttpclient.netty.timeout.RequestTimeoutTimerTask.run(RequestTimeoutTimerTask.java:50)
at io.netty.util.HashedWheelTimer$HashedWheelTimeout.run(HashedWheelTimer.java:715)
at io.netty.util.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:34)
at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:703)
at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:790)
at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:503)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
13:44:23.158 DEBUG [AsyncCompletionHandler.onThrowable] - Request timeout to localhost/127.0.0.1:60380 after 300000 ms
java.util.concurrent.TimeoutException: Request timeout to localhost/127.0.0.1:60380 after 300000 ms
at org.asynchttpclient.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:43)
at org.asynchttpclient.netty.timeout.RequestTimeoutTimerTask.run(RequestTimeoutTimerTask.java:50)
at io.netty.util.HashedWheelTimer$HashedWheelTimeout.run(HashedWheelTimer.java:715)
at io.netty.util.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:34)
at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:703)
at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:790)
at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:503)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
-
And it continues like this. The hub retrying every 3 minutes, the request getting stuck in the node after "injecting" log and being released when a new retry request is made. The requests that are started in the node get a timeout exceptions after 5 minutes. Meanwhile other requests for other sessions are running in parallel.
-
After the hub has retried 5 times, the client starts making other calls to the hub with the same session. After that everything seems to work again.
How can we reproduce the issue?
-Relevant log output
Logs in description above.Operating System
Docker
Selenium version
C# 4.2.0
What are the browser(s) and version(s) where you see this issue?
selenium/node-chrome:4.2.0-20220527 but also earlier versions of Selenium Grid 4 image
What are the browser driver(s) and version(s) where you see this issue?
selenium/node-chrome:4.2.0-20220527 but also earlier versions of Selenium Grid 4 image
Are you using Selenium Grid?
4.2.0 (but issue also occurred with earlier versions of Selenium Grid 4)