I recently created a GCP cluster with adaptive scaling and 0 workers. After the cluster had been idle for about 2 hours, I tried to run a task and new workers failed to launch. This is the error that was thrown:
Creating worker instance
Task exception was never retrieved
future: <Task finished name='Task-533' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=BrokenPipeError(32, 'Broken pipe')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 321, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 325, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 188, in create_vm
    self.cluster.compute.instances()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/googleapiclient/http.py", line 900, in execute
    resp, content = _retry_request(
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/googleapiclient/http.py", line 204, in _retry_request
    raise exception
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/googleapiclient/http.py", line 177, in _retry_request
    resp, content = http.request(uri, method, *args, **kwargs)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/google_auth_httplib2.py", line 189, in request
    self.credentials.before_request(
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/google/auth/credentials.py", line 133, in before_request
    self.refresh(request)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/google/oauth2/credentials.py", line 200, in refresh
    access_token, refresh_token, expiry, grant_response = _client.refresh_grant(
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/google/oauth2/_client.py", line 248, in refresh_grant
    response_data = _token_endpoint_request(request, token_uri, body)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/google/oauth2/_client.py", line 105, in _token_endpoint_request
    response = request(method="POST", url=token_uri, headers=headers, body=body)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/google_auth_httplib2.py", line 116, in __call__
    response, data = self.http.request(
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/httplib2/__init__.py", line 1985, in request
    (response, content) = self._request(
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/httplib2/__init__.py", line 1650, in _request
    (response, content) = self._conn_request(
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/httplib2/__init__.py", line 1558, in _conn_request
    conn.request(method, request_uri, body, headers)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/http/client.py", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/http/client.py", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/http/client.py", line 1049, in _send_output
    self.send(chunk)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/http/client.py", line 971, in send
    self.sock.sendall(data)
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/ssl.py", line 1204, in sendall
    v = self.send(byte_view[count:])
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/ssl.py", line 1173, in send
    return self._sslobj.write(data)
BrokenPipeError: [Errno 32] Broken pipe
I haven't had any issues like this with adaptive clusters when I don't let them idle for as long.
It looks like the auth tokens expired during those two hours and need to be refreshed. I'm surprised the googleapiclient library doesn't handle this for you, but apparently it doesn't.
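The traceback shows the `BrokenPipeError` being raised while the refresh request is written to a socket, so one possible workaround is to rebuild the API client and retry once when that error appears. Below is a minimal sketch of that idea, not the dask-cloudprovider fix. The names `call_with_reconnect`, `make_client`, and `action` are hypothetical; in practice `make_client` might wrap something like `googleapiclient.discovery.build("compute", "v1")`, which is an assumption about how the client is constructed here.

```python
from typing import Callable, Tuple, TypeVar

C = TypeVar("C")  # client type (e.g. a googleapiclient service object)
R = TypeVar("R")  # result type of the API call


def call_with_reconnect(
    client: C,
    make_client: Callable[[], C],
    action: Callable[[C], R],
    retries: int = 1,
) -> Tuple[C, R]:
    """Run action(client); on a dropped connection, rebuild the client and retry.

    Hypothetical helper: returns the client that finally succeeded, so the
    caller can keep using the fresh one, together with the action's result.
    """
    attempt = 0
    while True:
        try:
            return client, action(client)
        except (BrokenPipeError, ConnectionResetError):
            # The idle connection was probably closed by the remote end.
            if attempt >= retries:
                raise  # give up and surface the original error
            attempt += 1
            client = make_client()  # fresh client, fresh underlying socket
```

A caller would keep the returned client, for example `compute, resp = call_with_reconnect(compute, build_compute, insert_vm)`, where `build_compute` and `insert_vm` are placeholders for the real construction and request code.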
I'm working off of commit https://github.com/dask/dask-cloudprovider/tree/35deeb415e061ca90973fd24e56b1b7a6f54bc16.