Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Provide AsyncIO support for generated code #365

Merged
merged 4 commits into from
Jun 17, 2020

Conversation

lidizheng
Copy link
Contributor

Let's see how the CI react to the change.
The test might not pass until changes in googleapis/python-api-core#22 got released.

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Apr 9, 2020
Copy link
Contributor

@software-dov software-dov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is WIP, but can you also file an issue to track adding async client support to the ads templates?

Comment on lines 596 to 576
@utils.cached_property
def client_output_async(self):
"""Return the output from the client layer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there's a fair bit of duplication between client_output and client_output_async. Would it be worth combining and parameterizing them? It's not obvious to me just by eyeballing how they differ.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! I will create a function to hide common logic. In next pass, I will try to improve readability using similar pattern.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. If it turns out to be tricky then no need to twist the logic too much.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find a way to unify this specific method without regression.

  1. The client_output is cached, so it can't accept parameters;
  2. The output of client_output is frozen, so we can't modify the content in it (we can create a new one but it is worse than what we have now).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep client_output and client_output_async as the cached layer, then move the guts of the logic into an internal method that takes a parameter. The internal method should not be cached and should not be a property.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created the common internal method in 4ecf34d.

@lidizheng
Copy link
Contributor Author

lidizheng commented Apr 14, 2020

This PR may need a smarter way to reuse existing logic. The newly generated class should not be a burden for incoming important features like #359.

@lidizheng
Copy link
Contributor Author

To improve the maintainability of the microgenerator project, we might want to reduce the duplicated logic. Otherwise, new features will have to apply same changes in multiple places, which is quite error-prone. I want to list the options we have to compare:

  1. Composition: all AsyncIO class contains an instance of existing class;
  2. Helper functions: promote duplicated logic between the existing class and AsyncIO class to external functions;
  3. Inheritance: create a common base class for both existing class and AsyncIO class (13fba42);
  4. Duplication: duplicate the logic (a51bcd3).

Options 1, 2, 3 solve the maintainability issue, but have different drawbacks. I can change to either one, if you see fit. WDYT? @software-dov @busunkim96

For option 1, proxying methods from one class to another requires quite some boilerplates. For example, the existing Client class has not only instance method, but also class variables and class methods. The code looks like class_method_a = _comp_class.method_a, method_b = _comp_instance.method_b. Metaclass pattern makes this option even worse which requires its own composition logic. So, the implementation might look hacky.

For option 2, the issue of helper functions is that the duplicated logic usually contains modifying member variables. Passing self out, and allow external function to update it is less ideal. Also, helper functions doesn't solve the member variable initialization issue. The member variable still needs to be defined explicitly.

For option 3, inheritance increases code complexity, and challenges the scope of the interface abstraction. For example, it is unclear that should MTLS feature apply to all transport or only gRPC transport. It also makes it harder for users to understand the logic, since they need to read multiple class instead of one.

@software-dov
Copy link
Contributor

My preferences would be 2, then 1, then 3, then 4. @arithmetic1728 has confirmed that mTLS logic should eventually apply to http transports, and it presumably will also apply to asynchronous clients. I could be persuaded (and please do, if experiments indicate it!) that composition is a less gross hack than helper functions.
With helpers we need to be careful not to wind up duplicating the downsides of inheritance: if a function takes a Client OR an AsyncClient, with explicit type checking and separate logic paths, then that's arguably worse. If we tack on too many accessor methods to facilitate the helper functions, we've just split the implementation between 2 and 3.

I think, in light of the above, I'm leaning more towards composition as a general strategy.

@lidizheng
Copy link
Contributor Author

lidizheng commented Apr 20, 2020

@software-dov I have experiment with the composition pattern to unify logic between Client and AsyncClient, PTAL at commit 3cc730c.

The composition pattern doesn't work cleanly with Transport classes due to function signature difference. AsyncTransport stubs returns Awaitable[Response](coroutine) while gRPC Transport returns Response type directly. Also, the channel object has different type (grpc.Channel vs. aio.Channel). The options are:

  1. Have a common parent class to make types more generic (see 3cc730c);
  2. Similar to option 1, but instead of using inheritance we can use composition to weave the common class and the two public transport classes;
  3. Loosen the type restriction on Transport classes, so it enables GrpcTransport to composite AsyncTransport class;
  4. Let the logic being copied twice (see e176020).

@software-dov WDYT?

@plamut
Copy link
Contributor

plamut commented Apr 21, 2020

@lidizheng If I understand correctly, we want type compatibility between the transport classes grpc.Channel and aio.Channel? Would it help if we cheat by creating an abstract Channel base class and register the two existing Channel classes as its "virtual subclasses"?

@lidizheng
Copy link
Contributor Author

For channels, if we loosen the type check, it is equivalent to option 3. Abstract Channel base class works. Another option is type it with Union[grpc.Channel, aio.Channel]. It stops linter from complaining, but it might be error-prone if user pass-in an AsyncIO channel into a normal gRPC Transport.

@software-dov
Copy link
Contributor

Possibly bad idea: is there any way we can rewrite the synchronous Transport in terms of the async and just force explicit awaits?

I'm a little lost reading through the proposed diffs and the existing aio and regular grpc code. If I understand correctly, the main problem is the return type/semantics of the transport stubs.
If we can come up with a reasonable solution for that, it looks like everything else is a workable problem.

@lidizheng
Copy link
Contributor Author

I'm a little lost reading through the proposed diffs and the existing aio and regular grpc code. If I understand correctly, the main problem is the return type/semantics of the transport stubs.
If we can come up with a reasonable solution for that, it looks like everything else is a workable problem.

Yes, the composition pattern works except the types. The complain can be suppressed by typing.cast if you don't think it is hacky.

Possibly bad idea: is there any way we can rewrite the synchronous Transport in terms of the async and just force explicit awaits?

When I'm writing gRPC AsyncIO, I found even though AsyncIO is the official asynchronous library. People still appreciate the ability to use normal synchronous Python.

@codecov
Copy link

codecov bot commented Jun 4, 2020

Codecov Report

Merging #365 into master will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##            master      #365   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           26        26           
  Lines         1453      1466   +13     
  Branches       300       300           
=========================================
+ Hits          1453      1466   +13     
Impacted Files Coverage Δ
gapic/schema/wrappers.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6a1263c...52336f8. Read the comment docs.

@@ -789,6 +785,12 @@ def test_transport_instance():
client = {{ service.client_name }}(transport=transport)
assert client._transport is transport

def test_transport_instance_2():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this test verifying? That None credentials default to anonymous?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to test if transport can initialize with no arguments. Just learnt that to use insecure channel, we need anonymous credential. So, this case is removed.

Comment on lines 129 to 130
if scopes is None:
scopes = cls.AUTH_SCOPES
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal preference/style nit:
How about

scopes = cls.AUTH_SCOPES if scopes is None else scopes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to scopes = scopes or cls.AUTH_SCOPES, same below.

@lidizheng
Copy link
Contributor Author

lidizheng commented Jun 10, 2020

Original version of MTLS setup in Client.__init__:

        if isinstance(client_options, dict):
            client_options = ClientOptions.from_dict(client_options)
        if client_options is None:
            client_options = ClientOptions.ClientOptions()

        if transport is None and client_options.api_endpoint is None:
            use_mtls_env = os.getenv("GOOGLE_API_USE_MTLS", "never")
            if use_mtls_env == "never":
                client_options.api_endpoint = self.DEFAULT_ENDPOINT
            elif use_mtls_env == "always":
                client_options.api_endpoint = self.DEFAULT_MTLS_ENDPOINT
            elif use_mtls_env == "auto":
                has_client_cert_source = (
                    client_options.client_cert_source is not None
                    or mtls.has_default_client_cert_source()
                )
                client_options.api_endpoint = (
                    self.DEFAULT_MTLS_ENDPOINT if has_client_cert_source else self.DEFAULT_ENDPOINT
                )
            else:
                raise MutualTLSChannelError(
                    "Unsupported GOOGLE_API_USE_MTLS value. Accepted values: Never, Auto, Always"
                )

        # Save or instantiate the transport.
        # Ordinarily, we provide the transport, but allowing a custom transport
        # instance provides an extensibility point for unusual situations.
        if isinstance(transport, {{ service.name }}Transport):
            # transport is a {{ service.name }}Transport instance.
            if credentials:
                raise ValueError('When providing a transport instance, '
                                 'provide its credentials directly.')
            self._transport = transport
        elif isinstance(transport, str):
            Transport = type(self).get_transport_class(transport)
            self._transport = Transport(
                credentials=credentials, host=self.DEFAULT_ENDPOINT
            )
        else:
            self._transport = {{ service.name }}GrpcTransport(
                credentials=credentials,
                host=client_options.api_endpoint,
                api_mtls_endpoint=client_options.api_endpoint,
                client_cert_source=client_options.client_cert_source,
            )

The behavior of MTLS is kind of strange in some cases.

  1. It assumed MTLS only works with transport=None. It behaves differently when transport=None and transport="grpc", even if they are using the same underlying transport.
  2. It uses a unnecessary hard-coded transport class. get_transport_class(None) returns gRPC transport by default.
  3. Even with USE_MTLS="Never", the test case expects the transport being initialized with a none-empty MTLS endpoint. This confused me that what parameter defines the on-and-off for the MTLS feature. But I kept this behavior untouched.

In this PR, the MTLS logic is slightly updated to resolve above cases. Now, the magical difference between transport=None and transport="grpc" is removed. All MTLS behaviors react consistently to the input client_options and GOOGLE_API_USE_MTLS.

@busunkim96
Copy link
Contributor

@arithmetic1728 Could you take a look at the mTLS changes?

@arithmetic1728
Copy link
Collaborator

Original version of MTLS setup in Client.__init__:

The behavior of MTLS is kind of strange. It took me several hours to make it straight.

  1. It assumed uses won't pass transport="grpc" when using MTLS. It behaves differently when transport=None and transport="grpc", even if they are using the same underlying transport.

Yes, the behavior is supposed to be different when transport is None and transport is given. If user provides transport, mTLS related logic won't be triggered, and client_options won't be used.

  1. It uses a unnecessary hard-coded transport class. get_transport_class(None) returns gRPC transport by default.

I think the library assumes user could provide their own implementation of Transport class, so transport is not necessarily grpc, it could some user defined transport class name.

  1. Even with USE_MTLS="Never", the test case expects the transport being initialized with a none-empty MTLS endpoint. This really confused me that what parameter defines the on-and-off for the MTLS feature.

Which test is it?

@lidizheng
Copy link
Contributor Author

lidizheng commented Jun 10, 2020

@arithmetic1728 Thanks for the quick response. The test I mentioned is this.

    os.environ["GOOGLE_API_USE_MTLS"] = "never"
    with mock.patch('{{ (api.naming.module_namespace + (api.naming.versioned_module_name,) + service.meta.address.subpackage)|join(".") }}.services.{{ service.name|snake_case }}.transports.{{ service.name }}GrpcTransport.__init__') as grpc_transport:
        grpc_transport.return_value = None
        client = {{ service.client_name }}()
        grpc_transport.assert_called_once_with(
            api_mtls_endpoint=client.DEFAULT_ENDPOINT,
            client_cert_source=None,
            credentials=None,
            host=client.DEFAULT_ENDPOINT,
        )

This is the test case that expects api_mtls_endpoint under "never".

Yes, the behavior is supposed to be different when transport is None and transport is given.

I can see benefits in both ways with treating them different or in the same way. By differentiating transport=None and transport="grpc", we can prevent mTLS feature from being a standard feature for transport classes. On the other hand, we can delegate the decision whether to implement mTLS to our users.

If we treat both cases in the same way, the transport class will always receive mTLS arguments. Then, users can decide whether to implement mTLS feature in their customized transport. If they want to ignore those arguments, they should be free to do so.

WDYT about this approach?

@arithmetic1728
Copy link
Collaborator

The logic for mTLS is as follows:

Transport side: mTLS is controlled by the api_mtls_endpoint argument (None=> no mTLS)

Client side:
(1) if transport is provided, no mTLS logic is used.
(2) if transport is None, mTLS logic is triggered as follows:
a) figure out the client certificate to use.
cert_to_use = client_options.client_cert_source or adc_default_cert or None
b) figure out the api endpoint to use. If client_options.api_endpoint is given, it will use the given one; otherwise, api_endpoint is determined based on GOOGLE_API_USE_MTLS env value and cert_to_use.
Note that even if GOOGLE_API_USE_MTLS=never, we still apply the client cert if it exists (The client cert will be ignored by the server). Since mTLS logic in transport constructor is controlled by the api_mtls_endpoint parameter, in this case, the regular endpoint will be passed to the transport constructor via api_mtls_endpoint parameter to make sure client cert is used if exists.

Please don't change the mTLS logic since all languages implement the same logic. Please let me know if you have any questions regarding mTLS.

@arithmetic1728
Copy link
Collaborator

@arithmetic1728 Thanks for the quick response. The test I mentioned is this.

    os.environ["GOOGLE_API_USE_MTLS"] = "never"
    with mock.patch('{{ (api.naming.module_namespace + (api.naming.versioned_module_name,) + service.meta.address.subpackage)|join(".") }}.services.{{ service.name|snake_case }}.transports.{{ service.name }}GrpcTransport.__init__') as grpc_transport:
        grpc_transport.return_value = None
        client = {{ service.client_name }}()
        grpc_transport.assert_called_once_with(
            api_mtls_endpoint=client.DEFAULT_ENDPOINT,
            client_cert_source=None,
            credentials=None,
            host=client.DEFAULT_ENDPOINT,
        )

This is the test case that expects api_mtls_endpoint under "never".

Yes, please see my previous comments.

Let's schedule a meeting to discuss this, it might be easier that way.

@lidizheng
Copy link
Contributor Author

@arithmetic1728 Thanks! See you offline (online).

@arithmetic1728
Copy link
Collaborator

After discussing with Lidi, I think the mTLS change looks good to me. LGTM for the mTLS change.

Copy link
Collaborator

@arithmetic1728 arithmetic1728 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for the mTLS change.

@lidizheng
Copy link
Contributor Author

@software-dov @busunkim96 PTALAA.

@lidizheng lidizheng marked this pull request as ready for review June 12, 2020 18:48
@lidizheng lidizheng changed the title WIP: Provide AsyncIO support for generated code feat: Provide AsyncIO support for generated code Jun 12, 2020
Copy link
Contributor

@software-dov software-dov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, nothing stands out to me. I'll file an issue for cross patching to alternative templates.
I'm taking it on faith that all the asynchronicity is correct; even after reading multiple blag posts and running local experiments, I still haven't fully internalized it.

tests/system/conftest.py Outdated Show resolved Hide resolved
@lidizheng lidizheng force-pushed the asyncio-support branch 2 times, most recently from bc5bde8 to 833b567 Compare June 16, 2020 17:38
@lidizheng
Copy link
Contributor Author

@software-dov Thanks for the review! I updated conftest.py to use the generalized factory function. About alternative templates, I wonder if Jinja2's include command helps? It can import text files (including another template file).

Copy link
Contributor

@busunkim96 busunkim96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as well. Some questions/nits on docstrings and tests.

I think the google-api-core version need to be bumped in the template setup.py.j2.

@lidizheng
Copy link
Contributor Author

@busunkim96 Thanks for reviewing. I'm shamed that I forgot to remove the debug logging. All comments addressed. Do I need to squash all commits?

@busunkim96
Copy link
Contributor

busunkim96 commented Jun 16, 2020

@software-dov What merge style is preferred in this repo?

EDIT:
(Squashed and merged as that's seemed to be what's preferred from the repo commit history 😄 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants