Connection reset by peer (httplib2 issue) #2028

Closed
znorris opened this issue Jul 26, 2016 · 20 comments
@znorris

znorris commented Jul 26, 2016

I've been attempting to batch upload just over a million entities (split into groups of 100-500 at a time) into Datastore, but I keep running into a socket.error: [Errno 104] Connection reset by peer error. I've also been hitting this when attempting to query and pull down all entities of a single kind.

Is there some throttling that should be happening that's not? Any suggestions on how to better utilize the gcloud lib?

@daspecster added the "api: datastore" label (Issues related to the Datastore API) on Jul 26, 2016
@daspecster
Contributor

Hello @znorris!
Is the datastore client being run multi-threaded?

@znorris
Author

znorris commented Jul 26, 2016

@daspecster No it's not.

@daspecster
Contributor

In that case I'm not exactly sure what is causing this. @dhermes or @tseaver might have some ideas.

If you retry after the socket error does it work? Could it possibly be a connectivity issue?
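A minimal sketch of the kind of retry being asked about here, assuming the failure surfaces as a raw socket error carrying errno 104 (ECONNRESET), as in the original report. The helper names and the `max_attempts`, `base_delay`, and `sleep` parameters are hypothetical and not part of the gcloud library.

```python
import errno
import time


def is_connection_reset(exc):
    """Return True if exc looks like 'Connection reset by peer' (ECONNRESET)."""
    if isinstance(exc, ConnectionResetError):
        return True
    return isinstance(exc, OSError) and exc.errno == errno.ECONNRESET


def call_with_retry(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a connection reset, back off and try again.

    All parameters are illustrative; tune them for the workload.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except OSError as exc:
            # Re-raise anything that is not a reset, or the final failure.
            if not is_connection_reset(exc) or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With a Datastore client in hand, a batch write could then be wrapped as `call_with_retry(lambda: client.put_multi(batch))`, where `client` and `batch` are whatever the calling code already has. If the retry succeeds on a fresh connection, that points at a stale connection rather than a backend limit.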

@dhermes
Contributor

dhermes commented Jul 26, 2016

Is there a long timeout before the socket error? Connection reset by peer usually means the remote end has dropped the connection (often after a long delay between requests). If the backend were unhappy with your requests, you'd likely get a status code with a rate-limit error.

@znorris
Author

znorris commented Jul 27, 2016

@dhermes that's what I would have thought, some kind of msg telling me I'd hit a limit.

There are no long pauses between batch uploads, so I assume it would not time out there. Perhaps I'm hitting a limit on connection duration?

Export to Datastore stats:
Approx. entities: 1.08 Million
Properties per entity: 2 (1 string, 1 boolean)
Approx. size: 100 MB
Entities per put_multi(): 400
Upload duration: 3 hours 10 minutes
Local Bandwidth: 100 Mbps (upload & download)

It is taking much longer than expected to batch upload entities. The library is so simple to use that I can't think of anything I could be doing differently on my end.
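For a rough sense of scale, the figures above can be turned into per-batch numbers. This is a back-of-the-envelope sketch only: it assumes the batches were sent one after another, that the quoted values are exact, and that 1 MB means 10**6 bytes.

```python
# Figures quoted in the stats above (all approximate).
entities = 1_080_000
entities_per_batch = 400
total_seconds = 3 * 3600 + 10 * 60   # 3 hours 10 minutes
total_bytes = 100 * 10**6            # assumption: 1 MB = 10**6 bytes

batches = entities / entities_per_batch
seconds_per_batch = total_seconds / batches
entities_per_second = entities / total_seconds
megabits_per_second = total_bytes * 8 / total_seconds / 10**6

print(f"batches:             {batches:.0f}")
print(f"seconds per batch:   {seconds_per_batch:.2f}")
print(f"entities per second: {entities_per_second:.1f}")
print(f"payload throughput:  {megabits_per_second:.3f} Mbps")
```

Under those assumptions this comes to 2700 batches at roughly 4.2 seconds each, or about 95 entities per second. The payload rate is around 0.07 Mbps, far below the 100 Mbps link, which suggests per-request latency rather than bandwidth is what dominates the upload time.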

@dhermes
Contributor

dhermes commented Aug 15, 2016

@znorris Are you still having this issue? When you say "Upload duration: 3 hours 10 minutes", does that mean for a single request or for a batch of them?

@znorris
Author

znorris commented Aug 22, 2016

@dhermes That time was for all batch operations to complete (all 1.08 Million entities).

@dhermes
Contributor

dhermes commented Aug 22, 2016

Gotcha. Any idea how long each request is taking? Can you share some code?

@dhermes changed the title from "[Datastore] Connection reset by peer" to "Connection reset by peer (httplib2 issue)" on Aug 23, 2016
@dhermes
Contributor

dhermes commented Aug 23, 2016

Also reported in #2166 with logging. In particular, @devashishshankar says:

"I'm not totally sure, but this could be because of the connection pool being maintained by the underlying httplib2 library. httplib2 is holding connection pools, but doesn't seem to be refreshing them."

I am fairly certain this is the issue.

@dhermes
Contributor

dhermes commented Aug 23, 2016

The main problem is likely that httplib2.Http keeps a cache

@devashishshankar

Yes, that seems to be the issue. Until this gets fixed, is there a temporary workaround I could use? Currently most of my long-running jobs fail, and they are also unable to save their state (checkpoint) in Datastore because of the same issue. Wrapping every logging statement in a try/except also doesn't seem feasible.

One hacky solution I can think of is hitting some dummy API call every minute or so in a parallel thread, so the connection stays alive. Perhaps there is a better way?
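The keepalive idea could be sketched roughly as below. This is an illustration of the workaround as described, not a recommendation: `start_keepalive`, `ping`, and `interval_seconds` are hypothetical names, and `ping` stands in for whatever cheap API call the job can make.

```python
import threading


def start_keepalive(ping, interval_seconds=60.0):
    """Call ping() every interval_seconds on a daemon thread.

    Returns a threading.Event; set it to stop the loop.
    """
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_seconds):
            try:
                ping()
            except Exception:
                # A failed ping should not kill the keepalive thread.
                pass

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

One caveat: if the HTTP object underneath the client is not safe to share across threads, the ping and the main job would need to be serialized with a lock, which is part of why this is only a stopgap.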

@dhermes
Contributor

dhermes commented Aug 23, 2016

See httplib2shim and the discussion in #1214 to avoid using httplib2 altogether.
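For reference, httplib2shim is applied by calling its `patch()` function once, before any clients are created. The guarded import and the `patch_httplib2` wrapper below are additions for illustration, so the snippet still runs where the package is not installed.

```python
try:
    import httplib2shim  # third-party: pip install httplib2shim
except ImportError:
    httplib2shim = None


def patch_httplib2():
    """Replace httplib2's transport with httplib2shim's urllib3-backed one.

    Returns True if the patch was applied, False if the package is missing.
    """
    if httplib2shim is None:
        return False
    httplib2shim.patch()
    return True


# Run this before constructing any API clients.
patched = patch_httplib2()
```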

@devashishshankar

So, I'm passing the gcloud_requests client for logging - and it seems to be working fine for now.

@dhermes
Contributor

dhermes commented Aug 24, 2016

Glad that helped. I am close to ripping out httplib2 and making it easier to swap out transports. We hope to have requests as the default transport.

@devashishshankar and @znorris Should we leave this issue open or consider it "fixed" (with a band-aid, but still fixed)?

@znorris
Author

znorris commented Aug 24, 2016

@dhermes I haven't had time to test it myself, but from the conversation it sounds fixed to me. I vote for closing. Thanks for the help everyone.

@mauliksoneji

@dhermes @znorris can you please mention from which version onwards this issue has been fixed?

@tseaver
Contributor

tseaver commented Aug 13, 2019

@mauliksoneji httplib2 was removed in PR #3674, more than two years ago. google-cloud-datastore 1.3.0 and later use requests instead.
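To check an environment against that threshold, something like the following could be used. It is a sketch: the helper names are hypothetical, the 1.3.0 cutoff is taken from the comment above, and the version parsing is deliberately simple (leading digits of the first three components only).

```python
import re
from importlib import metadata  # Python 3.8+


def version_tuple(version):
    """Turn '1.3.0' into (1, 3, 0), padding short versions with zeros."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def uses_requests_transport(version):
    """True for google-cloud-datastore versions at or above 1.3.0."""
    return version_tuple(version) >= (1, 3, 0)


def installed_datastore_version():
    """Return the installed google-cloud-datastore version, or None."""
    try:
        return metadata.version("google-cloud-datastore")
    except metadata.PackageNotFoundError:
        return None
```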

@fucker007

fucker007 commented Jul 19, 2020

Can anyone help me?

Traceback (most recent call last):
  File "test_google_translate_translate.py", line 6, in <module>
    word = prepare.translate("I am fucker, we all known something about this word,thers is no fool in the word,but someone get other way to make them believe in it")
  File "/home/nathan/workspace/dmomb/kaggle_dowload_data/google_translate.py", line 16, in translate
    result = self.client.translate(text,target_language='zh')
  File "/home/nathan/.local/lib/python3.6/site-packages/google/cloud/translate_v2/client.py", line 268, in translate
    response = self._connection.api_request(method="POST", path="", data=data)
  File "/home/nathan/.local/lib/python3.6/site-packages/google/cloud/_http.py", line 419, in api_request
    timeout=timeout,
  File "/home/nathan/.local/lib/python3.6/site-packages/google/cloud/_http.py", line 277, in _make_request
    method, url, headers, data, target_object, timeout=timeout
  File "/home/nathan/.local/lib/python3.6/site-packages/google/cloud/_http.py", line 315, in _do_request
    url=url, method=method, headers=headers, data=data, timeout=timeout
  File "/home/nathan/.local/lib/python3.6/site-packages/google/auth/transport/requests.py", line 444, in request
    self.credentials.before_request(auth_request, method, url, request_headers)
  File "/home/nathan/.local/lib/python3.6/site-packages/google/auth/credentials.py", line 133, in before_request
    self.refresh(request)
  File "/home/nathan/.local/lib/python3.6/site-packages/google/oauth2/service_account.py", line 359, in refresh
    access_token, expiry, _ = _client.jwt_grant(request, self._token_uri, assertion)
  File "/home/nathan/.local/lib/python3.6/site-packages/google/oauth2/_client.py", line 153, in jwt_grant
    response_data = _token_endpoint_request(request, token_uri, body)
  File "/home/nathan/.local/lib/python3.6/site-packages/google/oauth2/_client.py", line 105, in _token_endpoint_request
    response = request(method="POST", url=token_uri, headers=headers, body=body)
  File "/home/nathan/.local/lib/python3.6/site-packages/google/auth/transport/requests.py", line 186, in __call__
    six.raise_from(new_exc, caught_exc)
  File "<string>", line 3, in raise_from
google.auth.exceptions.TransportError: ('Connection aborted.', OSError(0, 'Error'))

@tseaver
Contributor

tseaver commented Jul 21, 2020

Your traceback is unrelated to this issue. Please use the support links to get support.

@FraserTooth

I occasionally hit this when running long CREATE OR REPLACE MODEL queries using google.cloud.bigquery.Client(). It appears to be fixed by installing httplib2shim.

I made sure I had the latest version of the google libs, so I dunno why httplib2 is still in there 🤷. Judging by the trace, it looked like it was in for Python 2.7 support.
