
Adding generated Bigtable classes. #22

Closed
wants to merge 3 commits into from

Conversation

saicheems

No description provided.

@saicheems
Author

@anthmgoogle

@dhermes
Owner

dhermes commented Nov 30, 2015

@saicheems Can you explain what these are and what generated them and why they are needed given the existence of gcloud_bigtable._generated?

@geigerj

geigerj commented Nov 30, 2015

These are intended to replace the gRPC generated code that currently is in gcloud_bigtable._generated, for some common use cases. They hide some gRPC features, like explicit timeouts and request objects for every call, and are backed by the GAX library -- not included in this PR -- which performs things like retrying, page-streaming, and possibly eventually other features that might be desired of the callables. We're hoping that this will reduce the amount of code that needs to be handwritten on top of the gRPC generated code. They were generated by our code generator, which is not yet on github.
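Since GAX itself is not in this PR, here is a rough sketch of what "backed by a library that performs retrying" could mean in practice: a wrapper that hides explicit timeouts and transient failures behind a plain callable. All names and defaults here are illustrative, not the actual GAX API.

```python
import time

def with_retry(rpc, max_attempts=3, delay=0.01, retryable=(ConnectionError,)):
    """Wrap an RPC-like callable so transient errors are retried.

    `rpc`, `max_attempts`, and `retryable` are hypothetical names; a real
    implementation would also use exponential backoff and a deadline.
    """
    def call(*args, **kwargs):
        for attempt in range(1, max_attempts + 1):
            try:
                return rpc(*args, **kwargs)
            except retryable:
                if attempt == max_attempts:
                    raise  # out of attempts; surface the error
                time.sleep(delay)
    return call
```

The caller then invokes `with_retry(stub_method)(request)` without thinking about per-call timeouts or transient errors.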

@dhermes
Owner

dhermes commented Nov 30, 2015

Before continuing I should note that we are in the process of merging gcloud-python-bigtable into the main gcloud-python; however, this one remains the upstream master until they are at parity.


  • Is there a link to the GAX library?
  • Does importing this work? The generated code assumed the google namespace package would be available, but it was more trouble than it was worth, so I re-wrote it. Right now the gcloud_bigtable.google package looks public to users of this library, but we'd rather hide it as an implementation detail. How hard will it be to put it in a different place?
  • Do you still depend on the protobuf PyPI package?
  • Can you give some sample code explaining how to use this?
  • Do you have any documentation? I'm fairly uncomfortable depending on something undocumented.

/cc @jgeewax @tseaver

@geigerj

geigerj commented Nov 30, 2015

  • GAX link will come later. I think this is intended only to get feedback on the public surface of the generated code, not to actually run or be merged; maybe Sai can comment?
  • I think we'll reorganize the package names before sending a working pull request. Most likely we'll use a package named "generated.google". For non-manually veneered APIs (unlike this one), the generated code will be the public surface with nothing sitting on top -- not sure how this will play into package naming if we need manually veneered codegen to look private.
  • Yes, we depend on the protobuf PyPI package.
  • @saicheems / @anthmgoogle -- should we be including sample code at this point?
  • The documentation story is still in the works, I think.

@saicheems
Author

Right, GAX is still in the works -- we'll update when it's available, but it's not intended to be a part of this review.

@geigerj @anthmgoogle Maybe we can include the end-to-end example for demonstration purposes?

@geigerj

geigerj commented Nov 30, 2015

I think that makes sense -- it's not necessarily intended to be the final sample code, but it should give some sense of how it might look.

@dhermes
Owner

dhermes commented Nov 30, 2015

I don't expect GAX to be part of this review. I'd expect it to be installable as a PyPI package.

@saicheems
Author

@dhermes Right, our intention is to eventually have it available as a package.

I'll include our end-to-end example.

@geigerj

geigerj commented Nov 30, 2015

@dhermes @saicheems Yes, although note that as a temporary solution, we may have to put it in gcloud-python, similar to what happened with Java GAX (googleapis/google-cloud-java#320). We haven't yet worked out a permanent home for GAX.

@dhermes
Owner

dhermes commented Nov 30, 2015

GitHub and PyPI are a great start to be part of the Python community

@dhermes
Owner

dhermes commented Nov 30, 2015

@geigerj So the current most pressing need is just to review the API surface? (And the forthcoming example usage docs)

@geigerj

geigerj commented Nov 30, 2015

Yup, that's right.

@anthmgoogle

For the owners of gcloud-python, this PR is not intended for merging, but we are looking for feedback and sign-off on the API surface, as well as details of the documentation and source code. In parallel we are working on a PR that will be mergeable, with all dependency and layout issues resolved, and it will include these files. If these files need to be changed, we would like to iterate on any changes in this PR now.

For reference, there was a lot of good feedback on the Java equivalent of this that took a while to iterate through, so we would like to get going on the Python equivalent now to avoid a long delay when the mergeable version is ready.

* New tables can be created in the cluster.
* The cluster's allocated resource levels will be readable via the API.
The embedded operation's "metadata" field type is
[CreateClusterMetadata][google.bigtable.admin.cluster.v1.CreateClusterMetadata] The embedded operation's "response" field type is


Could we strip out the link headers so that this is ".. field type is google.bigtable.admin.cluster.v1.CreateClusterMetadata. The ..."

Owner


👍 It may be a pain unless you have something that can easily parse it from the comments in the .proto definition.

If you can parse it, you should reformat in Sphinx / reST format to link within the generated classes.

Author


We're working on improving the docs as we can right now, and have an issue tracking it.

@dhermes
Owner

dhermes commented Dec 1, 2015

I haven't read the code yet, but have gone through e2e.py. It is essentially identical to writing an application with the standard code generated by protoc (using grpc_python_plugin).

Maybe I'm misunderstanding what the goal was?

# EDITING INSTRUCTIONS
# This file was generated from google/bigtable/admin/cluster/v1/bigtable_cluster_service.proto,
# and updates to that file get reflected here through a regular refresh
# process. However, manual and updates to that file get reflected here
Owner


manual and updates to that file doesn't make sense?

Author


Done.

# Properties
@property
def channel(self):
return self.channel
Owner


This is circular.

>>> import sys
>>> sys.setrecursionlimit(4)
>>> class A(object):
...     @property
...     def b(self):
...         return self.b
... 
>>> a = A()
>>> a.b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in b
RuntimeError: maximum recursion depth exceeded
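The standard fix is to have the property read a differently named private backing attribute. A minimal sketch (class and attribute names are illustrative, not the actual generated code):

```python
class ClusterServiceApi(object):
    """Sketch of the non-circular form of the property."""

    def __init__(self, channel):
        self._channel = channel  # leading underscore: distinct from the property

    @property
    def channel(self):
        return self._channel  # reads the backing attribute, so no recursion
```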

Author


Removed.

@anthmgoogle

"It is essentially identical to writing an application with the standard code generated by protoc (using grpc_python_plugin). Maybe I'm misunderstanding what the goal was?"

The code generator can take advantage of patterns and conventions in Google APIs like page streaming. Unfortunately, Bigtable is an atypical API that uses few standard patterns and is mostly custom methods. So the generator does not save as much user code as it does for other APIs.

There should still be some benefits even in the absence of pattern use:

  • There are still savings to initialization code.
  • The callable pattern makes it more concise to use the same advanced gRPC patterns.
  • There is an editable doc site for each method (unlike gRPC/stubby) where we can get proto-doc by default, but we can add usage examples.
  • As we detect client-side features in use by multiple APIs, new features can be added over time that can be auto-generated across languages.

"""Service for reading from and writing to existing Bigtables."""

# The default address of the logging service.
_SERVICE_ADDRESS = "bigtable.googleapis.com"
Owner


I just noticed this uses double quotes while the scopes below use single quotes. Why mix and match? ISTM that using %r or {!r} or repr() would be what you'd want to do, and all of those would produce single quotes for the string above.

@dhermes
Owner

dhermes commented Dec 1, 2015

The code generator can take advantage of patterns and conventions in Google APIs like page streaming

There are two (a, b) methods that use streaming, but they seem to be identical.


There are still savings to initialization code.

Where? It seems to be identical, and calling many methods on stubs requires the same level of knowledge about the generated message classes (seems to occur when a property is itself a message).


The callable pattern makes it more concise to use same advanced gRPC patterns.

I'm not clear what you mean. Side-by-side:

# Veneer generated
cluster_pb = cluster_api.create_cluster(
    name=zone_name, cluster_id=disp_name, cluster=cluster)
# Standard protoc generated (with gRPC plugin)
cluster_pb = cluster_stub.CreateCluster(
    request_pb, timeout_seconds)

The only distinction is that the Veneer call constructs the request_pb for the user.
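To make that distinction concrete, here is roughly what the veneer does for the user: build the request message from flattened keyword arguments, then call the stub. The message and function names below are stand-ins for the real generated classes, not the actual PR code.

```python
class CreateClusterRequest(object):
    """Stand-in for the protoc-generated request message."""

    def __init__(self, name, cluster_id, cluster):
        self.name = name
        self.cluster_id = cluster_id
        self.cluster = cluster

def create_cluster(stub, name, cluster_id, cluster, timeout=10):
    """Veneer-style flattened call: constructs request_pb for the user."""
    request_pb = CreateClusterRequest(name, cluster_id, cluster)
    return stub(request_pb, timeout)
```

With the standard protoc output, the caller would construct `CreateClusterRequest` by hand before invoking the stub; the veneer only saves that one step for this method.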


There is an editable doc site for each method (unlike gRPC/stubby) where we can get proto-doc by default, but we can add usage examples.

I'm not sure what you mean by this.


As we detect client-side features in use by multiple APIs, new features can be added over time that can be auto-generated across languages.

But the exact same can be said of the existing toolchain.

@anthmgoogle

Some clarifications related to my earlier comment.

Page streaming is what we call the pattern of List methods returning a page at a time and turning this into an idiomatic collection. Most APIs have some of these, but Bigtable does not.
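The pattern being described could be sketched as a generator that hides the page-token loop; the dict-shaped request/response and `next_page_token` field here are stand-ins for a generated stub method and its protobuf messages.

```python
def iterate_pages(list_method, request):
    """Flatten a paged List RPC into a single iterator (page streaming)."""
    while True:
        response = list_method(request)
        for item in response['items']:
            yield item  # caller sees one flat stream of items
        token = response.get('next_page_token')
        if not token:
            return  # no more pages
        # re-issue the request with the continuation token
        request = dict(request, page_token=token)
```

The caller just writes `for item in iterate_pages(stub.List, request)` instead of managing tokens.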

The initialization savings are things like the scopes relative to raw gRPC, not the per-method stuff.

By advanced calling patterns I meant things like async, chaining, batching, etc.
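As one illustration of the async case: a single callable object could expose both a blocking call and a future-returning variant. The class and method names are hypothetical, not the GAX surface.

```python
from concurrent.futures import ThreadPoolExecutor

class ApiCallable(object):
    """Sketch: one object supporting both sync and async invocation."""

    def __init__(self, func, max_workers=2):
        self._func = func
        self._executor = ThreadPoolExecutor(max_workers=max_workers)

    def __call__(self, *args, **kwargs):
        # Blocking call, same as invoking the wrapped function directly.
        return self._func(*args, **kwargs)

    def future(self, *args, **kwargs):
        # Async variant: returns a concurrent.futures.Future immediately.
        return self._executor.submit(self._func, *args, **kwargs)
```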

Regarding docs: gRPC and Stubby methods cannot be extended in source form, so you can't go and add a language-specific usage example to their doc comments. That is possible with these wrappers.

Regarding future features, I'm not sure what you mean by existing toolchain. But if you mean gRPC, there are some features that cannot be added, or are unlikely to be added, because they would depend on One Platform patterns or would need to depend on configuration outside the proto.

But overall it is acknowledged that the before and after on this particular API is not very different because of its atypical nature. It was much more evident in APIs like PubSub and Logging. We should discuss if that changes our strategy here.

@dhermes
Owner

dhermes commented Dec 1, 2015

Page streaming is what we call the pattern of List methods

Gotcha. I get it now.


The initialization savings are things like the scopes relative to raw gRPC, not the per-method stuff.

There is no reason that the gRPC generator couldn't do this. Though (I asked elsewhere in this PR) I don't actually know where the scopes and the API endpoints come from.


By advanced calling patterns I meant things like async, chaining, batching, etc.

Where does that come into play? Why wouldn't the gRPC / protobuf code support it?


Regarding docs: gRPC and Stubby methods cannot be extended in source form, so you can't go and add a language-specific usage example to their doc comments. That is possible with these wrappers.

That is not very relevant. Most library devs are going to wrap the generated code into something else anyway.

Even if this code was easier to use, the fact that there are 3 distinct services is a reason enough to wrap the Bigtable APIs into something higher-level to make it easier on users. (In our implementation we just concentrate on the nouns and verbs relevant to the API. It may help to take a peek at the docs.)


Regarding future features, I'm not sure what you mean by existing toolchain. But if you mean gRPC, there are some features that cannot be added or unlikely to be added because they would depend on One Platform patterns or would need to depend on configuration outside the proto.

I mean the protoc code generator with the grpc_python_plugin. I don't know what "One Platform" is, and generally am only concerned with publicly documented things users can see / already know about.

It seems this may be a hint at an answer to "where do the API scopes and URIs come from". If that's the case, why not provide those configuration values in a public way just like the .proto files so external developers can build their own toolchains or extend this one as they see fit?


It was much more evident in APIs like PubSub and Logging

We can run the generator on Pub/Sub and then get an idea of the differences there?


Overall, it seems like we are just adding another abstraction layer. For Bigtable at least, the current layers are

  • Happybase library (to emulate a popular HBase/thrift library)
  • Our low-level API mapping
  • The gRPC generated code
  • The Python protobuf library
  • The layer to just run Python (much slower than native)
  • The gRPC core (C++) code

It just doesn't seem worth it.

@anthmgoogle

The feedback provided on the code-gen is very valuable. It would be much appreciated if we could keep iterating until these point issues are either addressed or captured as work items. For the purposes of this review it would be good to separate out these higher-level questions.

For the higher level questions we should cover these in a different forum, but this boils down to 2 categories:

  1. This specific API will see much less code-gen savings than others because it has atypical patterns. As such we should reevaluate whether it should go first. We might, for example, apply all the feedback on the details but select a different API to focus on.
  2. Broader questions, such as whether code-gen features go in gRPC or another layer, are complicated and best discussed elsewhere.

@dhermes
Owner

dhermes commented Dec 1, 2015

OK. Let me know what else I can do?

@anthmgoogle

We will process the feedback you gave us. If you can review any updates and comment on whether they resolve the issues, that would be much appreciated.

@dhermes
Owner

dhermes commented Dec 1, 2015

Sure thing. Ping me when the review is needed.

@dhermes
Owner

dhermes commented Dec 4, 2015

@anthmgoogle I'm realizing in porting some of this over that a nice utility your library could provide would be to automatically convert Any messages into the correct type (based on the type_url).

Feature request?
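For illustration, the requested utility amounts to dispatching on the `type_url` (which looks like `type.googleapis.com/package.MessageName`). The real protobuf library would use `symbol_database.Default()` and `Any.Unpack()`; the stdlib-only sketch below fakes the registry and serialization, and all names are hypothetical.

```python
# Registry mapping full proto names to message classes (hypothetical;
# protobuf's symbol_database plays this role in practice).
_TYPE_REGISTRY = {}

def register_message(cls):
    """Class decorator: register a message under its full proto name."""
    _TYPE_REGISTRY[cls.FULL_NAME] = cls
    return cls

def convert_any(type_url, serialized):
    """Resolve an Any payload to a concrete message via its type_url."""
    full_name = type_url.split('/')[-1]  # strip the "type.googleapis.com/" prefix
    cls = _TYPE_REGISTRY[full_name]
    return cls.deserialize(serialized)
```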


PS You should really start a GitHub project, even if there is no code and just an issue tracker for feature requests.

@saicheems
Author

Thanks for the feedback, we made changes to address the immediately resolvable ones. We have issues tracking some of the more difficult problems - mainly:

  • Clearer, idiomatic (as much as possible) documentation
  • Better formatting
  • Parameter flattening

# Manual additions are allowed because the refresh process performs
# a 3-way merge in order to preserve those manual additions. In order to not
# break the refresh process, only certain types of modifications are
# allowed.
Owner


This just seems like a waste of time trying to implement this feature.

@dhermes
Owner

dhermes commented Dec 14, 2015

I don't see the value in the *_callable methods. Do you expect a user to actually think it's a good thing to have two methods with similar names, one of which they'll never use?

@saicheems saicheems closed this Feb 3, 2016