GCP broken pipe error #219

Open
eric-czech opened this issue Dec 11, 2020 · 5 comments
Labels: bug (Something isn't working), provider/gcp/vm (Cluster provider for GCP Instances)


@eric-czech

I noticed this error when running a 50-node cluster:

... # After issuing command to scale up to 50 nodes (not adaptive)
dask-454ca9f7-worker-3eacd15f
      Internal IP: 10.142.15.192
      External IP: None
dask-454ca9f7-worker-05101041
      Internal IP: 10.142.15.193
      External IP: None
dask-454ca9f7-worker-59a6ca27
      Internal IP: 10.142.15.194
      External IP: None
dask-454ca9f7-worker-ea64bc32
      Internal IP: 10.142.15.195
      External IP: None
dask-454ca9f7-worker-f514eaaf
      Internal IP: 10.142.15.196
      External IP: None
dask-454ca9f7-worker-674c4e80
      Internal IP: 10.142.15.197
      External IP: None
Task exception was never retrieved
future: <Task finished name='Task-760' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=BrokenPipeError(32, 'Broken pipe')>
Traceback (most recent call last):
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
  return (yield from awaitable.__await__())
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
  await self.start()
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
  await self.start_worker()
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
  self.internal_ip, self.external_ip = await self.create_vm()
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 193, in create_vm
  self.cluster.compute.instances()
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
  return wrapped(*args, **kwargs)
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/googleapiclient/http.py", line 900, in execute
  resp, content = _retry_request(
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/googleapiclient/http.py", line 204, in _retry_request
  raise exception
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/googleapiclient/http.py", line 177, in _retry_request
  resp, content = http.request(uri, method, *args, **kwargs)
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/google_auth_httplib2.py", line 200, in request
  response, content = self.http.request(
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/httplib2/__init__.py", line 1985, in request
  (response, content) = self._request(
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/httplib2/__init__.py", line 1650, in _request
  (response, content) = self._conn_request(
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/httplib2/__init__.py", line 1558, in _conn_request
  conn.request(method, request_uri, body, headers)
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/http/client.py", line 1255, in request
  self._send_request(method, url, body, headers, encode_chunked)
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/http/client.py", line 1301, in _send_request
  self.endheaders(body, encode_chunked=encode_chunked)
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/http/client.py", line 1250, in endheaders
  self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/http/client.py", line 1049, in _send_output
  self.send(chunk)
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/http/client.py", line 971, in send
  self.sock.sendall(data)
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/ssl.py", line 1204, in sendall
  v = self.send(byte_view[count:])
File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/ssl.py", line 1173, in send
  return self._sslobj.write(data)
BrokenPipeError: [Errno 32] Broken pipe

I ended up with only 49 workers (and 49 worker VMs), so I believe this error prevented one of them from launching.

This same error occurs in #218.
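
For anyone hitting the same thing, here is a minimal sketch of one possible workaround: retrying the failing call with exponential backoff, on the assumption that the broken pipe is a transient connection problem. Nothing here is taken from `dask_cloudprovider`; `call_with_retries` and `FlakyInsert` are hypothetical names, and `FlakyInsert` only simulates a request that fails a few times before succeeding. Separately, `googleapiclient`'s `execute()` accepts a `num_retries` argument, which may be worth trying, though I have not confirmed that it covers this error.

```python
import random
import time


def call_with_retries(fn, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn() and retry with exponential backoff on dropped connections.

    Hypothetical helper: re-raises the last error once `attempts` is exhausted.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except (BrokenPipeError, ConnectionResetError):
            if attempt == attempts - 1:
                raise
            # Back off 0.5s, 1s, 2s, ... plus a little jitter so that many
            # workers starting at once do not all retry at the same moment.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


class FlakyInsert:
    """Hypothetical stand-in for a request that fails `failures` times."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise BrokenPipeError(32, "Broken pipe")
        return {"status": "RUNNING"}


if __name__ == "__main__":
    request = FlakyInsert(failures=2)
    print(call_with_retries(request, base_delay=0.01), "after", request.calls, "calls")
```

Note that `time.sleep` blocks the event loop, so inside a coroutine such as `create_vm` the same loop would want `await asyncio.sleep(...)` instead.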

@eric-czech

eric-czech commented Dec 11, 2020

FWIW, I tried adding 50 more nodes to this cluster and it managed to add 22 more nodes before failing like this:

Logs
In [6]: scale(100)
Cluster rescaled

Creating worker instance
... # Line above repeated 50 times in total
dask-454ca9f7-worker-69a1cac8
        Internal IP: 10.142.15.198
        External IP: None
dask-454ca9f7-worker-e26c9406
        Internal IP: 10.142.15.219
        External IP: None
dask-454ca9f7-worker-a032fa29
        Internal IP: 10.142.15.220
        External IP: None
dask-454ca9f7-worker-444f21bb
        Internal IP: 10.142.15.221
        External IP: None
dask-454ca9f7-worker-09d70c73
        Internal IP: 10.142.15.222
        External IP: None
dask-454ca9f7-worker-84f6cff5
        Internal IP: 10.142.15.223
        External IP: None
dask-454ca9f7-worker-10d1f971
        Internal IP: 10.142.15.224
        External IP: None
dask-454ca9f7-worker-ad17c266
        Internal IP: 10.142.15.225
        External IP: None
dask-454ca9f7-worker-3c3b2787
        Internal IP: 10.142.15.226
        External IP: None
dask-454ca9f7-worker-2baaacfc
        Internal IP: 10.142.15.227
        External IP: None
dask-454ca9f7-worker-9ede13de
        Internal IP: 10.142.15.228
        External IP: None
dask-454ca9f7-worker-76533b2b
        Internal IP: 10.142.15.229
        External IP: None
dask-454ca9f7-worker-06938cac
        Internal IP: 10.142.15.230
        External IP: None
dask-454ca9f7-worker-46e00bd1
        Internal IP: 10.142.15.231
        External IP: None
dask-454ca9f7-worker-aff679e2
        Internal IP: 10.142.15.232
        External IP: None
dask-454ca9f7-worker-bff91d99
        Internal IP: 10.142.15.233
        External IP: None
dask-454ca9f7-worker-ae562433
        Internal IP: 10.142.15.234
        External IP: None
dask-454ca9f7-worker-aa6fd84d
        Internal IP: 10.142.15.235
        External IP: None
Failed to find running VMI...
{'id': 'projects/uk-biobank-279813/zones/us-east1-c/instances', 'selfLink': 'https://www.googleapis.com/compute/v1/projects/uk-biobank-279813/zones/us-east1-c/instances', 'kind': 'compute#instanceList'}
... # Two lines above repeated 32 times in total
Task exception was never retrieved
future: <Task finished name='Task-1482' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-7b68de3a')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-7b68de3a
Task exception was never retrieved
future: <Task finished name='Task-1508' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-38a39620')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-38a39620
Task exception was never retrieved
future: <Task finished name='Task-1502' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-731502ff')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-731502ff
Task exception was never retrieved
future: <Task finished name='Task-1486' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-b71b73c8')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-b71b73c8
Task exception was never retrieved
future: <Task finished name='Task-1509' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-38b35747')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-38b35747
Task exception was never retrieved
future: <Task finished name='Task-1483' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-ee90e292')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-ee90e292
Task exception was never retrieved
future: <Task finished name='Task-1513' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-44faf8fa')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-44faf8fa
Task exception was never retrieved
future: <Task finished name='Task-1503' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-4540b31e')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-4540b31e
Task exception was never retrieved
future: <Task finished name='Task-1501' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-c08cdc0c')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-c08cdc0c
Task exception was never retrieved
future: <Task finished name='Task-1484' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-b21a1788')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-b21a1788
Task exception was never retrieved
future: <Task finished name='Task-1490' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-88f39488')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-88f39488
Task exception was never retrieved
future: <Task finished name='Task-1485' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-18ff5658')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-18ff5658
Task exception was never retrieved
future: <Task finished name='Task-1498' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-8172f4bf')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-8172f4bf
Task exception was never retrieved
future: <Task finished name='Task-1504' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-15ac56c3')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-15ac56c3
Task exception was never retrieved
future: <Task finished name='Task-1505' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-24915a8d')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-24915a8d
Task exception was never retrieved
future: <Task finished name='Task-1496' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-a6c0ea3c')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-a6c0ea3c
Task exception was never retrieved
future: <Task finished name='Task-1491' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-699bd0fd')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-699bd0fd
Task exception was never retrieved
future: <Task finished name='Task-1488' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-53f55cb2')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-53f55cb2
Task exception was never retrieved
future: <Task finished name='Task-1497' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-aae513d3')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-aae513d3
Task exception was never retrieved
future: <Task finished name='Task-1499' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-a1f4bc5b')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-a1f4bc5b
Task exception was never retrieved
future: <Task finished name='Task-1510' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-0d95e147')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-0d95e147
Task exception was never retrieved
future: <Task finished name='Task-1493' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-1eba49f1')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-1eba49f1
Task exception was never retrieved
future: <Task finished name='Task-1487' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-dc29918a')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-dc29918a
Task exception was never retrieved
future: <Task finished name='Task-1494' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-b198daf8')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-b198daf8
Task exception was never retrieved
future: <Task finished name='Task-1511' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-e54e479b')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-e54e479b
Task exception was never retrieved
future: <Task finished name='Task-1495' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-392109fc')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-392109fc
Task exception was never retrieved
future: <Task finished name='Task-1500' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-26b0136b')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-26b0136b
Task exception was never retrieved
future: <Task finished name='Task-1506' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-41ad62c7')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-41ad62c7
Task exception was never retrieved
future: <Task finished name='Task-1507' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-5318f76e')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-5318f76e
Task exception was never retrieved
future: <Task finished name='Task-1492' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-202482cd')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-202482cd
Task exception was never retrieved
future: <Task finished name='Task-1512' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-caca75a6')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-caca75a6
Task exception was never retrieved
future: <Task finished name='Task-1489' coro=<_wrap_awaitable() done, defined at /home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py:685> exception=Exception('Missing Instance dask-454ca9f7-worker-80e4b999')>
Traceback (most recent call last):
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/asyncio/tasks.py", line 692, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 335, in start
    await self.start_worker()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 339, in start_worker
    self.internal_ip, self.external_ip = await self.create_vm()
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 203, in create_vm
    while self.update_status() != "RUNNING":
  File "/home/eczech/miniconda3/envs/cloudprovider/lib/python3.8/site-packages/dask_cloudprovider/gcp/instances.py", line 241, in update_status
    raise Exception(f"Missing Instance {self.name}")
Exception: Missing Instance dask-454ca9f7-worker-80e4b999

In the end, I have 67 workers (also 67 VMs).

Update

This second set of errors can be ignored. This is happening because I'm hitting IP quota limits at 67 nodes. The problem in the OP remains though.

@quasiben
Member

Do you think this is due to a timeout issue?

@eric-czech
Contributor Author

It could be, since there was a delay between when I created the cluster and when I tried to scale it up, somewhat like in #179. In this case, it worked 49/50 times, whereas in #179 no new workers were created at all. The stack traces look about the same, so I'm not sure why the fix for that issue didn't work here if it's the same problem.

@eric-czech
Contributor Author

I got this again today after creating an adaptive cluster and then almost immediately (within 5 minutes) having it scale up by running a large number of tasks. It's making adaptive clusters pretty much unusable because after this error, they stop scaling up or down.

Perhaps _retry_request in googleapiclient.http is not actually retrying this (it's just a first attempt wrapped in retriable logic with a default of 0 retries)?
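
For reference, `HttpRequest.execute` in googleapiclient accepts a `num_retries` argument that defaults to 0, so a bare `.execute()` makes a single attempt. Below is a minimal sketch of a wrapper that retries connection-level failures such as `BrokenPipeError` with exponential backoff. The names `execute_with_retry` and `make_request`, and the default values, are hypothetical and only for illustration; none of this is existing dask-cloudprovider or googleapiclient code.

```python
import time


def execute_with_retry(make_request, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Build and execute a request, retrying on connection-level errors.

    make_request: zero-argument callable returning an object with .execute()
                  (hypothetical; a fresh request is built for every attempt).
    attempts:     total number of tries before the error is re-raised.
    base_delay:   first backoff delay in seconds, doubled after each failure.
    """
    for attempt in range(attempts):
        try:
            return make_request().execute()
        except ConnectionError:
            # BrokenPipeError is a subclass of ConnectionError.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Something like this could wrap the `.insert(...)` and `.list(...)` calls, assuming the broken pipe comes from a stale connection that a fresh request would not hit.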

@jacobtomlinson
Member

I expect this is due to a race condition. We create an instance:

inst = (
    self.cluster.compute.instances()
    .insert(project=self.projectid, zone=self.zone, body=self.gcp_config)
    .execute()
)

Then we immediately list all instances to find out info about it:

d = (
    self.cluster.compute.instances()
    .list(project=self.projectid, zone=self.zone, filter=f"name={self.name}")
    .execute()
)

When creating a small number of instances this is probably fine. However, when requesting a larger number of instances, GCP may not be as quick to fulfil the requests, so the new instance may not yet appear in the list.

From a quick scan of the code I think we need to implement a retry in update_status and also change get_internal_ip and get_external_ip to use update_status and then get their values from self.gcp_inst.

The dask_cloudprovider.utils.timeout.Timeout utility is designed to make this easier to implement.

class Timeout:
    """A timeout object for use in ``while True`` loops instead of ``True``.
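
The retry suggested above could look roughly like the following. This is a sketch only: it uses a plain deadline from `time.monotonic` rather than the `Timeout` utility, whose interface is not shown here. `wait_for_instance`, `list_instances`, `MissingInstance`, and the default timings are hypothetical names and values chosen for illustration.

```python
import time


class MissingInstance(Exception):
    """Raised when the instance never shows up before the deadline."""


def wait_for_instance(list_instances, name, timeout=60.0, interval=1.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll until an instance called `name` is listed, or give up.

    list_instances: callable taking a name and returning a list of matching
                    instance dicts (hypothetical stand-in for the
                    instances().list(...).execute() call and its "items").
    """
    deadline = clock() + timeout
    while True:
        items = list_instances(name)
        if items:
            # The instance is visible; the caller can read status and IPs from it.
            return items[0]
        if clock() >= deadline:
            raise MissingInstance(f"Missing Instance {name}")
        # Not listed yet: GCP may still be processing the insert request.
        sleep(interval)
```

With this shape, `update_status` would only raise the "Missing Instance" error after the instance has stayed absent for the whole timeout, instead of on the first empty listing.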

@jacobtomlinson jacobtomlinson added bug Something isn't working provider/gcp/vm Cluster provider for GCP Instances labels Dec 14, 2020