Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add gateway timeout to pubsub pull #2576

Merged

Conversation

daspecster
Copy link
Contributor

@daspecster daspecster commented Oct 20, 2016

Based from #2575.

Implement exception for pubsub subscription pull.

Fixes: #2574

@daspecster daspecster added the api: pubsub Issues related to the Pub/Sub API. label Oct 20, 2016
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Oct 20, 2016
@dhermes
Copy link
Contributor

dhermes commented Oct 20, 2016

Ref: https://github.com/googleapis/googleapis/blob/37cc0e5acae50ee91f00827a7010c3b07dfa5311/google/rpc/code.proto#L69-L70

@dhermes
Copy link
Contributor

dhermes commented Oct 20, 2016

Relates to my comment on #2496

Copy link
Contributor

@dhermes dhermes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending Travis

@tseaver
Copy link
Contributor

tseaver commented Oct 20, 2016

Rather than raising a 504, wouldn't we rather match the semantics of the JSON API and return an empty list?

@daspecster
Copy link
Contributor Author

daspecster commented Oct 20, 2016

@tseaver could do that. How would the user know to retry in that case?

@dhermes
Copy link
Contributor

dhermes commented Oct 20, 2016

@tseaver I am 👎 on swallowing exceptions. It's up to the API implementers to make sure the JSON and gRPC surface is sufficiently similar.

@tseaver
Copy link
Contributor

tseaver commented Oct 20, 2016

@daspecster

How would the user know to retry in that case?

A timeout exception due to no messages isn't really "exceptional." Long-running subscriber processes are going to be retrying anyway, inside a while True sort of loop. Short-lived subscribers just iterate over the (possibly empty) list of messages and exit.

@daspecster
Copy link
Contributor Author

daspecster commented Oct 20, 2016

@tseaver, wouldn't the while True: continue after the DEADLINE_EXCEEDED in that case?

Or I guess I would think that if the deadline was exceeded then the user would need to take action on that information?

@tseaver
Copy link
Contributor

tseaver commented Oct 20, 2016

@dhermes

It's up to the API implementers to make sure the JSON and gRPC surface is sufficiently similar.

We paper over lots of differences between the different API surfaces already: it has been an explicit goal that the user doesn't notice or care about them.

@daspecster I'm arguing that users shouldn't need to care about this gRPC-only timeout exception at all: if we propagate it, then they have to catch it, but it doesn't "mean" anything to them beyond "there are no messages to process," which is exactly what returning an empty sequence tells them.

@daspecster
Copy link
Contributor Author

A timeout exception due to no messages isn't really "exceptional."

@tseaver I'm not seeing where this is the same or different from the ackDeadline or other normal operations.

It sounds like if there were no messages after the deadline then it would just return nothing and not throw an error?

Received Pub/Sub messages. The Pub/Sub system will return zero messages if there are no more available in the backlog. The Pub/Sub system may return fewer than the maxMessages requested even if there are more messages available in the backlog.

https://cloud.google.com/pubsub/docs/reference/rpc/google.pubsub.v1#pullresponse

@dhermes
Copy link
Contributor

dhermes commented Oct 20, 2016

We paper over lots of differences between the different API surfaces already

@tseaver I'm not disputing that but this is an error being converted into a successful (empty) response. We don't even have a repro yet confirming this is intended / deterministic behavior. We should at least hear from someone on the API team / see official docs that confirm a gRPC 504 is equivalent to an empty result.

@tseaver
Copy link
Contributor

tseaver commented Oct 20, 2016

I'm not seeing where this is the same or different from the ackDeadline or other normal operations.

ackDeadline is a configuration option for the Subscription: pull-mode subscribers are required to ack pulled messages within that interval, or else they get queued for redelivery. It can't be relevant to #2574, as the DEADLINE_EXCEEED message is being raised inside subscription.pull().

@tseaver
Copy link
Contributor

tseaver commented Oct 20, 2016

We should at least hear from someone on the API team / see official docs that confirm a gRPC 504 is equivalent to an empty result.

@tmatsuo Can you chime in here?

@daspecster
Copy link
Contributor Author

@tmatsuo could you check this out when you have a second?

@daspecster daspecster force-pushed the add-gateway-timeout-to-pubsub-pull branch from bd83972 to 2a078e0 Compare October 30, 2016 20:11
@daspecster
Copy link
Contributor Author

daspecster commented Nov 3, 2016

@dhermes @tseaver, what shall I do with this?

@daspecster daspecster force-pushed the add-gateway-timeout-to-pubsub-pull branch from 2a078e0 to ddc98d4 Compare November 14, 2016 21:05
@daspecster
Copy link
Contributor Author

Bump

@dhermes
Copy link
Contributor

dhermes commented Nov 18, 2016

@daspecster Can you give snippets / stacktraces of what happens for the exact same client code, but one with gRPC disabled and one with it enabled?

@daspecster
Copy link
Contributor Author

@dhermes @tseaver.

Here is my test file.

from google.cloud import pubsub


client = pubsub.Client()

topic = client.topic('test-padding')
if not topic.exists():
    topic.create()

subscription = topic.subscription('test-padding')

if not subscription.exists():
    subscription.create()

messages = subscription.pull()
print(messages)

With gRPC:

$ python test_pubsub_padding.py
Traceback (most recent call last):
  File "test_pubsub_padding.py", line 18, in <module>
    messages = subscription.pull()
  File "/Users/daspecster/.virtualenvs/gct/lib/python2.7/site-packages/google/cloud/pubsub/subscription.py", line 328, in pull
    self.full_name, return_immediately, max_messages)
  File "/Users/daspecster/.virtualenvs/gct/lib/python2.7/site-packages/google/cloud/pubsub/_gax.py", line 414, in subscription_pull
    return_immediately=return_immediately)
  File "/Users/daspecster/.virtualenvs/gct/lib/python2.7/site-packages/google/cloud/gapic/pubsub/v1/subscriber_api.py", line 545, in pull
    return self._pull(request, options)
  File "/Users/daspecster/.virtualenvs/gct/lib/python2.7/site-packages/google/gax/api_callable.py", line 481, in inner
    return api_caller(api_call, this_settings, request)
  File "/Users/daspecster/.virtualenvs/gct/lib/python2.7/site-packages/google/gax/api_callable.py", line 469, in base_caller
    return api_call(*args)
  File "/Users/daspecster/.virtualenvs/gct/lib/python2.7/site-packages/google/gax/api_callable.py", line 434, in inner
    errors.create_error('RPC failed', cause=exception))
  File "/Users/daspecster/.virtualenvs/gct/lib/python2.7/site-packages/google/gax/api_callable.py", line 430, in inner
    return a_func(*args, **kwargs)
  File "/Users/daspecster/.virtualenvs/gct/lib/python2.7/site-packages/google/gax/api_callable.py", line 64, in inner
    return a_func(*updated_args, **kwargs)
  File "/Users/daspecster/.virtualenvs/gct/lib/python2.7/site-packages/grpc/_channel.py", line 481, in __call__
    return _end_unary_response_blocking(state, False, deadline)
  File "/Users/daspecster/.virtualenvs/gct/lib/python2.7/site-packages/grpc/_channel.py", line 432, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
google.gax.errors.GaxError: GaxError(RPC failed, caused by <_Rendezvous of RPC that terminated with (StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded)>)
E1121 11:08:01.032859000 140736417121216 network_status_tracker.c:48] Memory leaked as all network endpoints were not shut down

With gRPC disabled:

$ GOOGLE_CLOUD_DISABLE_GRPC=true python test_pubsub_padding.py
[]

@dhermes
Copy link
Contributor

dhermes commented Nov 21, 2016

I didn't want it to be true, so I stripped out our library and just made low(ish)-level requests with each transport: https://gist.github.com/dhermes/ac47b7b6eebfd46febbd9ee3d824436d

Unfortunately it IS true. The HTTP surface returns a 200 and the gRPC surface returns with an error set to DEADLINE_EXCEEDED.

After confirming this sad "contradiction", I now agree with @tseaver that we should wrap the DEADLINE_EXCEEDED and just return whatever the equivalent to an empty result set is.

@dhermes
Copy link
Contributor

dhermes commented Nov 21, 2016

60s to wait for a timeout is a long time, but I feel like if we do make this assumption (i.e. that the surfaces differ), then we should bake that assumption into the system tests, so we can quickly fix it if the assumption ceases to be true.

@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request) and all commit authors, but as best as we can tell these commits were authored by someone else. If that's the case, please add them to this pull request and have them confirm that they're okay with these commits being contributed to Google. If we're mistaken and you did author these commits, just reply here to confirm.

@googlebot googlebot removed the cla: yes This human has signed the Contributor License Agreement. label Nov 22, 2016
@googlebot googlebot added the cla: no This human has *not* signed the Contributor License Agreement. label Nov 22, 2016
@daspecster daspecster force-pushed the add-gateway-timeout-to-pubsub-pull branch 2 times, most recently from 84c04a4 to 724c428 Compare November 22, 2016 19:25
@googlebot
Copy link

CLAs look good, thanks!

@googlebot googlebot added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Nov 22, 2016
Copy link
Contributor

@dhermes dhermes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need an extra unit test to check the branch I added (i.e. we only suppress the DEADLINE_EXCEEDED if return immediately was false)

@@ -415,6 +415,8 @@ def subscription_pull(self, subscription_path, return_immediately=False,
except GaxError as exc:
if exc_to_code(exc.cause) == StatusCode.NOT_FOUND:
raise NotFound(subscription_path)
if exc_to_code(exc.cause) == StatusCode.DEADLINE_EXCEEDED:

This comment was marked as spam.

This comment was marked as spam.

@daspecster daspecster force-pushed the add-gateway-timeout-to-pubsub-pull branch from f18e454 to c9c00b7 Compare November 23, 2016 22:18
@daspecster
Copy link
Contributor Author

Squashed!

raise NotFound(subscription_path)
elif exc_to_code(exc.cause) == StatusCode.DEADLINE_EXCEEDED:

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@daspecster daspecster force-pushed the add-gateway-timeout-to-pubsub-pull branch from c9c00b7 to 892e34e Compare November 23, 2016 22:23
@daspecster daspecster merged commit 8001d45 into googleapis:master Nov 24, 2016
@daspecster daspecster deleted the add-gateway-timeout-to-pubsub-pull branch November 24, 2016 01:45
richkadel pushed a commit to richkadel/google-cloud-python that referenced this pull request May 6, 2017
…t-to-pubsub-pull

Add gateway timeout to pubsub pull
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: pubsub Issues related to the Pub/Sub API. backend cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants