TensorFlow Transformer Part-3 #10
Conversation
Force-pushed from e1593c4 to cd3aa8d (Compare)
Codecov Report
@@ Coverage Diff @@
## tf-transformer-part2 #10 +/- ##
========================================================
+ Coverage 83.13% 84.01% +0.87%
========================================================
Files 24 25 +1
Lines 1293 1376 +83
Branches 5 5
========================================================
+ Hits 1075 1156 +81
- Misses 218 220 +2
Continue to review full report at Codecov.
we'll need to iterate a bit here to simplify the structure and logic - let's start with the comments here.
python/sparkdl/graph/input.py
Outdated
@classmethod
def fromGraph(cls, graph, sess, feed_names, fetch_names):
    """
    Construct a TFInputGraphBuilder from a in memory tf.Graph object
nit: an in-memory
also, this returns a TFInputGraph not TFInputGraphBuilder right?
updated
python/sparkdl/graph/input.py
Outdated
    Construct a TFInputGraphBuilder from a in memory tf.Graph object
    """
    assert isinstance(graph, tf.Graph), \
        ('expect tf.Graph type but got', type(graph))
nit: expected
python/sparkdl/graph/input.py
Outdated
                 feed_names=None, fetch_names=None)

@classmethod
def _from_checkpoint_impl(cls,
nit: really don't need `_impl` at the end. `_from_checkpoint` is clear enough. same for `_from_saved_model_impl` below.
names are changed as part of the refactoring
python/sparkdl/graph/input.py
Outdated
                      feed_names=None,
                      fetch_names=None):
    """
    Construct a TFInputGraphBuilder from a model checkpoint
TFInputGraph
Changed doc and checked spelling with pylint + pyenchant
python/sparkdl/graph/input.py
Outdated
                      feed_names=None,
                      fetch_names=None):
    """
    Construct a TFInputGraphBuilder from a SavedModel
TFInputGraph
Changed doc and checked spelling with pylint + pyenchant
python/sparkdl/graph/input.py
Outdated
@classmethod
def fromCheckpointWithSignature(cls, checkpoint_dir, signature_def_key):
    assert signature_def_key is not None
    return cls._from_checkpoint_impl(checkpoint_dir,
style: the first two arguments should fit on the first line
YAPF'ed
python/sparkdl/graph/input.py
Outdated
        self.fetch_mapping[sigdef_key] = tnsr_name
        self.fetch_names.append(tnsr_name)

class _GinBuilder(object):
this object is never used twice in our use cases. let's make it a single function instead. then we don't have to worry about cleaning up state, e.g. the session.
Removed all classes and builder objects.
python/sparkdl/graph/input.py
Outdated
    return _GinBuilder(import_graph_fn).build(feed_names, fetch_names)

class _GinBuilderInfo(object):
this seems to be really more like _SigDefInfo - let's name it more clearly.
Removed all classes and builder objects.
python/sparkdl/graph/input.py
Outdated
# pylint: disable=protected-access,attribute-defined-outside-init
gin = TFInputGraph._new_obj_internal()
assert (feed_names is None) == (fetch_names is None)
must_have_sig_def = fetch_names is None
don't need this variable since it's only used once. just use `if fetch_names is None` where we use this variable below.
I think it is easier to reason with when assigned to a variable. For the time being `fetch_names is None` is the only requirement, but it might change in the future. And if it does change, one would only have to modify it here.
if it changes in the future we can use a variable.
Removed the boolean variable
python/sparkdl/graph/input.py
Outdated
        "Please do NOT construct TFInputGraph directly. Instead, use one of the helper functions")

@classmethod
def _new_obj_internal(cls):
safer to just have this take in all three member variables to set without None default values. the way we use it, we can definitely construct this way.
updated
python/sparkdl/graph/input.py
Outdated
import sparkdl.graph.utils as tfx

__all__ = ["TFInputGraph"]
I think this whole file has much more complexity than it really needs, given the aim of its content.
We do not need a builder pattern, based on looking at the other PRs. You also happen to expose these builder classes implicitly because you return them, but this is not transparent from `__all__`. This is a limitation of python that does not help for software engineering.
Once we drop that, all this code really needs to expose is one immutable data structure and six functions that are stateless. As we discussed, there is no need to create additional classes, static methods or other python features. Everything here can be done with regular functions. Here is what I think it should look like:
TFInputGraph = namedtuple("TFInputGraph", ["graph_def", "inputs", "outputs"])
"""
A frozen representation of a tensorflow graph, and some extra information to map spark columns to
the graph's input and output tensors. Users should not have to peek into its content, or make
any assumption about the content.
graph_def: a tensorflow GraphDef object
inputs: a dictionary of {string: string} where the key is XXX and the value is XXX
outputs: XXX
"""

def fromGraph(graph, sess, feed_names, fetch_names):
    """
    Builds an internal representation of a tensorflow graph, based on a tf.Graph object
    :param graph: XXX
    :param sess: XXX
    :param feed_names: XXX <- put the details here!
    :param fetch_names: XXX <- put the details here!
    :return: a TFInputGraph object
    """
    raise
... other public functions with full documentation
... all the other private functions. They can be lighter on documentation, but they should not require any sophisticated python features. No extra classes are required.
You should also take a look at named tuples for building them, but you should absolutely not need any of their more advanced features (for the time being).
https://docs.python.org/2/library/collections.html#collections.namedtuple
This is a significant rewrite, but it is going to dramatically improve the readability and the correctness in the face of future changes.
Using `classmethod` to create class instances is a well-accepted approach. It is commonly used in the CPython implementation of datetime. Classes created from `collections.namedtuple` are self-documenting, resembling Scala's case classes to some degree.
>>> ABC = namedtuple('ABC', 'a, b, c', verbose=True)
class ABC(tuple):
    'ABC(a, b, c)'
    __slots__ = ()
    _fields = ('a', 'b', 'c')
    def __new__(_cls, a, b, c):
        'Create new instance of ABC(a, b, c)'
        return _tuple.__new__(_cls, (a, b, c))
    @classmethod
    def _make(cls, iterable, new=tuple.__new__, len=len):
        'Make a new ABC object from a sequence or iterable'
        result = new(cls, iterable)
        if len(result) != 3:
            raise TypeError('Expected 3 arguments, got %d' % len(result))
        return result
    def __repr__(self):
        'Return a nicely formatted representation string'
        return 'ABC(a=%r, b=%r, c=%r)' % self
    def _asdict(self):
        'Return a new OrderedDict which maps field names to their values'
        return OrderedDict(zip(self._fields, self))
    def _replace(_self, **kwds):
        'Return a new ABC object replacing specified fields with new values'
        result = _self._make(map(kwds.pop, ('a', 'b', 'c'), _self))
        if kwds:
            raise ValueError('Got unexpected field names: %r' % kwds.keys())
        return result
    def __getnewargs__(self):
        'Return self as a plain tuple. Used by copy and pickle.'
        return tuple(self)
    __dict__ = _property(_asdict)
    def __getstate__(self):
        'Exclude the OrderedDict from pickling'
        pass
    a = _property(_itemgetter(0), doc='Alias for field number 0')
    b = _property(_itemgetter(1), doc='Alias for field number 1')
    c = _property(_itemgetter(2), doc='Alias for field number 2')
You can see that they provide a lot more functions than we actually need.
In addition, since the class inherits from `tuple`, the following should be noted.
>>> ABC = namedtuple('ABC', 'a, b, c'); isinstance(ABC(a=1, b=2, c=3), tuple)
True
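For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a `classmethod` used as an alternative constructor; `Point` and `from_pair` are hypothetical names for illustration, not project code.

```python
# Hypothetical example of the classmethod-constructor pattern discussed above;
# `Point` and `from_pair` are illustrative names, not part of sparkdl.
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_pair(cls, pair):
        # Alternative constructor, in the spirit of datetime.date.fromtimestamp:
        # normalize the input, then delegate to the regular constructor.
        x, y = pair
        return cls(x, y)

p = Point.from_pair((3, 4))
print(p.x, p.y)  # 3 4
```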
here are some initial comments on the test code flow
ref_feed = tfx.get_tensor(graph, self.input_op_name)
ref_fetch = tfx.get_tensor(graph, self.output_op_name)

def check_input_graph(tgt_gdef, test_idx):
just put the internals of the function inside the for-loop below since the function is not used anywhere else
Well, it is another layer of indentation and it is hard to track which is aligned to which outside an editor.
_ = tf.reduce_mean(x, axis=1, name=self.output_op_name)

gin = TFInputGraph.fromGraph(sess.graph, sess, self.feed_names, self.fetch_names)
self.input_graphs.append(gin)
let's just repurpose `_run_test_in_tf_session` to run the test directly inside of these functions without having to save the various graphs inside the test class. that way it's more clear which tests fail.
The testing function can be refactored in the same way that we did for PR-1. Let's see if we like that first and then we can apply the same changes here.
builder = tf.saved_model.builder.SavedModelBuilder(saved_model_dir)

with self._run_test_in_tf_session() as sess:
    # Model definition: begin
can all the tests here use the same graph built by calling a helper function? e.g.
def _build_graph():
    g = tf.Graph()
    with g.as_default():
        x = tf.placeholder(tf.float64, shape=[None, self.vec_size], name=self.input_op_name)
        w = tf.Variable(tf.random_normal([self.vec_size], dtype=tf.float64),
                        dtype=tf.float64, name='varW')
        z = tf.reduce_mean(x * w, axis=1, name=self.output_op_name)
    return g

...

def test_build_...():
    graph = _build_graph()
    with tf.Session(graph=graph) as sess:
        ...
Well, this is essentially what we are doing here, isn't it?
Testing functions also serve as example/documentation. Here having the whole workflow in the same place makes it easy for users to understand how to use our library.
Force-pushed from 342aab4 to e963d11 (Compare)
python/sparkdl/graph/input.py
Outdated
.. warning: This class should not be called by any user code.
"""

def __init__(self):
def __init__(self, graph_def, ...):
    self.graph_def = graph_def
    ...
python/sparkdl/graph/input.py
Outdated
An opaque object containing TensorFlow graph.
This object can be serialized.

.. warning: This class should not be called by any user code.
document fields here (mentioning they are implementation details that should not be relied on by users).
Expanded documentation.
This looks much more readable! I haven't reviewed the large docstring yet but have reviewed the code so that part is ready for you. Could you put a screenshot of the docs (at least for TFInputGraph) here for easier review (of the formatting etc).
"""

def __init__(self, graph_def, input_tensor_name_from_signature,
i wish we had better names for these maps... naming is so hard. sig_key_to_input_tensor_names, sig_key_to_output_tensor_names ? the problem is, "_from_" generally doesn't hint that it's a map.
I agree. Naming things is difficult, especially in this case. TF does not seem to have a proper name for "well_known_input_sig" yet. And `signature_def_key` refers to "well_known_prediction_signature" in our example. Maybe `input_signature_to_tensor_name` and `output_signature_to_tensor_name`?
Yeah, I think those are actually good, since they convey the meaning that it's a mapping from some signature thing to a tensor, and really we don't need the user to know exactly what the keys are -- users aren't expected to use these directly.
As long as the conversion methods (in part 4) make it clear what their inputs are.
Since this is an internal variable that users should not access directly, let's keep it as it is.
python/sparkdl/graph/input.py
Outdated
with tf.Session() as sess:
    graph = import_my_tensorflow_graph(...)
    TFInputGraph.fromGraph(graph, sess, ...)
input = TFInputGraph.fromGraph(graph, sess, ...)
python/sparkdl/graph/input.py
Outdated
graph = tf.Graph()
with tf.Session(graph=graph) as sess:
    tf.import_graph_def(graph_def, name='')
    gin = _build_with_feeds_fetches(sess=sess, graph=graph, feed_names=feed_names,
what's different if you return here instead of outside the session block? are there implications for the session? if it's the same, it'd be better to return here for simplicity.
Outside the `with tf.Session(graph=graph) as sess` context manager block, the session will be closed. In this case `_build_with_feeds_fetches` will fail.
Although if we return inside the block, it is unclear to me if the session will be closed after the return statement.
Looks like `with tf.Session(graph=graph) as sess: ... return` is equivalent to something like

try:
    sess = tf.Session(graph=graph).__enter__()
    ...
    return ...
finally:
    tf.Session.__exit__()

(not 100% sure about the exact syntax for the context manager calling `__enter__` and `__exit__`) so it should be okay to return inside. Essentially all the bookkeeping done by the with statement still happens.
+1, return inside.
I am fine with returning from inside. But I don't think there are any substantial differences for returning inside v.s. outside.
Let's change that, this is the style of this project.
From readability's perspective, returning outside signals that the returned value does not need the resource held by the context manager to operate. For a user who is unaware of the way context managers work, "return-outside" makes our intention clear to him/her.
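The behavior debated above can be checked without TensorFlow at all. Here is a minimal sketch with a toy context manager (all names are made up for illustration) showing that `__exit__` runs even when the function returns from inside the `with` block:

```python
# Toy context manager demonstrating that returning from inside a `with`
# block still triggers __exit__ (i.e. the session would be closed either way).
class FakeSession(object):
    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.closed = True
        return False  # do not swallow exceptions

def build_inside(sess_obj):
    with sess_obj as sess:
        return sess  # __exit__ fires before the caller receives the value

sess = FakeSession()
result = build_inside(sess)
print(result is sess, sess.closed)  # True True
```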
python/sparkdl/graph/input.py
Outdated
def _build_with_feeds_fetches(sess, graph, feed_names, fetch_names):
    # pylint: disable=protected-access,attribute-defined-outside-init
    assert (feed_names is not None) and (fetch_names is not None), \
I don't think this check is necessary, but if you must, I'd do it one by one so it's more obvious when the error is thrown:
assert (feed_names is not None), "must provide feed_names"
assert (fetch_names is not None), "must provide fetch names"
also, is it bad if they are empty? if so, should check for that and modify the error msg above to say "must provide non-empty ..."
python/sparkdl/graph/input.py
Outdated
def fromCheckpointWithSignature(cls, checkpoint_dir, signature_def_key):
    """
    Construct a TFInputGraph object from a checkpoint, using the embedded
    signature_def. Throw error if we cannot find an entry with the `signature_def_key`
nit: an error
    Notice that one should either provide the `signature_def_key` or provide both
    `feed_names` and `fetch_names`. Please set the unprovided values to None.

    :param signature_def_key: str, name of the mapping contained inside the `signature_def`
same here
def _from_checkpoint_impl(checkpoint_dir, signature_def_key, feed_names, fetch_names):
    """
honestly given the docs above we don't really need the docs for this function and `_from_saved..._impl` below. do these appear in the generated docs? not sure what we do for docs with "private" methods.
Methods whose name starts with '_' do not appear in the docs.
Ah, good to know, thanks for the clarification.
    :param fetch_names: list, names of the output tensors.
    """
    assert (feed_names is None) == (fetch_names is None), \
        'feed_names and fetch_names, if provided must appear together'
"must appear together" -> "must be both non-None"
@phi-dbq did you miss this comment?
python/sparkdl/graph/input.py
Outdated
def _build_with_sig_def(sess, graph, sig_def):
    # pylint: disable=protected-access,attribute-defined-outside-init
    assert sig_def, \
this assumes how sig_def came about, which this function shouldn't care about. you can just say "sig_def must not be None." and people can debug from there.
        feed_mapping[sigdef_key] = tnsr_name
        feed_names.append(tnsr_name)

    fetch_mapping = {}
we should add some tests that specifically check that these mappings are created correctly.
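One way such a test could look. This is a hedged sketch: the dict-based `sig_def_inputs` argument and the `build_feed_mapping` helper below are stand-ins for the real signature_def protobuf and the mapping loop in the diff, not actual sparkdl code.

```python
# Illustrative stand-in for the feed-mapping construction loop in the diff;
# a real test would read these (signature key, tensor name) pairs from a
# signature_def protobuf instead of a plain dict.
def build_feed_mapping(sig_def_inputs):
    feed_mapping = {}
    feed_names = []
    for sigdef_key, tnsr_name in sorted(sig_def_inputs.items()):
        feed_mapping[sigdef_key] = tnsr_name
        feed_names.append(tnsr_name)
    return feed_mapping, feed_names

mapping, names = build_feed_mapping({'input_col': 'x:0'})
assert mapping == {'input_col': 'x:0'}
assert names == ['x:0']
```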
Force-pushed from 202e7ea to ead1ed6 (Compare)
This will make things easier when we want to extend other base class functions.
Signed-off-by: Philip Yang <philip.yang@databricks.com>
Mostly done, apart from adding tests for mappings.
Force-pushed from 26c8f24 to d729528 (Compare)
@phi-dbq there are still some changes that would be good to do. I am happy to talk about them in person.
python/sparkdl/graph/input.py
Outdated
- :py:meth:`fromSavedModelWithSignature`

When the graph contains serving signatures in which a set of well-known names are associtated
typo
    :param signature_def_key: str, key (name) of the signature_def to use. It should be in
        the list of `signature_def` structures saved with the checkpoint.
    """
    assert signature_def_key is not None
why are you checking for some parameters but not others? This is python, there is only so much you can do.
We want to be vocal that `signature_def_key` must not be empty.
Ok.
python/sparkdl/graph/input.py
Outdated
if signature_def_key is not None:
    sig_def = meta_graph_def.signature_def[signature_def_key]
    gin = _build_with_sig_def(sess=sess, graph=graph, sig_def=sig_def)
return
python/sparkdl/graph/input.py
Outdated
    sig_def = meta_graph_def.signature_def[signature_def_key]
    gin = _build_with_sig_def(sess=sess, graph=graph, sig_def=sig_def)
else:
    gin = _build_with_feeds_fetches(sess=sess, graph=graph, feed_names=feed_names,
return
python/sparkdl/graph/input.py
Outdated
        feed_mapping[sigdef_key] = tnsr_name
        feed_names.append(tnsr_name)

    # TODO: IN-THIS-PR, test if these mappings are constructed correctly.
noting the TODO.
there is a (partial) test for that, let's remove the TODO.
@@ -167,7 +167,14 @@ def _check_is_tensor_name(_maybe_tnsr_name):
    raise TypeError(err_msg.format(type(_maybe_tnsr_name)))

# The check is taken from TensorFlow's NodeDef protocol buffer.
# https://github.com/tensorflow/tensorflow/blob/r1.3/tensorflow/core/framework/node_def.proto#L21-L25
# Each input is "node:src_output" with "node" being a string name and
yes, good point. In practice, most nodes have only one output, so I have not been too concerned about multiple outputs.
Ah, that was copied and pasted from the comments in TensorFlow's node_def.
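The `"node:src_output"` convention quoted above can be illustrated with a small parser. This is a sketch only; the regex below is not the project's actual `_check_is_tensor_name` implementation.

```python
import re

# Per the node_def.proto comment: each input is "node:src_output" where
# "node" is a string name and "src_output" is an output index; ":0" may
# be omitted. Illustrative helper, not the project's validation code.
_TENSOR_NAME_RE = re.compile(r'^([^:]+)(?::(\d+))?$')

def split_tensor_name(name):
    match = _TENSOR_NAME_RE.match(name)
    if match is None:
        raise ValueError('invalid tensor name: %r' % (name,))
    node, idx = match.groups()
    return node, int(idx) if idx is not None else 0

print(split_tensor_name('dense/BiasAdd:0'))  # ('dense/BiasAdd', 0)
print(split_tensor_name('x'))                # ('x', 0)
```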
#========================================================================
# Don't have to modify the content below

_TEST_CASES_GENERATORS = []
this is too complicated, and I do not see any reason that would warrant such complexity. Please write some normal tests, with `parametrized` for example.
self.output_mapping = {self.output_op_name: self.output_col}
self.fetch_names = [self.output_op_name + ':0']

@contextmanager
this is too complicated and if something goes wrong, you will be hard pressed to understand what is going on.
You should really generate some data, transform this data and compare it against some expected output, and clean up some stuff after that if necessary.
In fact, that is exactly what this class does. I checked randomly killing the tests and made sure that the stack traces are meaningful.
class TestGenBase(object):
    def __init__(self, vec_size=17, test_batch_size=231):
        # Testing data spec
        self.vec_size = vec_size
this class has an enormous amount of state and moving parts, I cannot keep track what is happening. Please encapsulate what you need to pass to the relevant pieces.
@thunterdb Let's walk through the testing code once you are back.
The initial version of the testing code was written to directly construct examples and run tests. But the complexity went up quickly once more testing examples were needed.
This version strives to keep graph construction and numerical-result comparison separate, making the code more concise while still allowing new example graphs to be added easily.
In the next PR, we show how one could inherit this class and change the testing logic, while still being able to use existing examples.
python/sparkdl/graph/input.py
Outdated
graph = tf.Graph()
with tf.Session(graph=graph) as sess:
    tf.import_graph_def(graph_def, name='')
    gin = _build_with_feeds_fetches(sess=sess, graph=graph, feed_names=feed_names,
I am fine with returning from inside. But I don't think there is any substantial difference between returning inside vs. outside.
:param signature_def_key: str, key (name) of the signature_def to use. It should be in
    the list of `signature_def` structures saved with the checkpoint.
"""
assert signature_def_key is not None
We want to be vocal that `signature_def_key` must not be empty.
@thunterdb @sueann I think the refactoring of the testing infrastructure is outside the scope of this task/PR. I would recommend moving that to another task/PR.
* profiling
* tests
* renamed test
* removed original tests
* removed the profiler utils
* fixes indents
* imports
* added some tests
* added test
* fix test
* one more test
@phi-dbq still a few small comments from the previous review.
"""


def __init__(self, graph_def, input_tensor_name_from_signature,
Since this is an internal variable that users should not access directly, let's keep it as it is.
python/sparkdl/graph/input.py
Outdated
graph = tf.Graph()
with tf.Session(graph=graph) as sess:
    tf.import_graph_def(graph_def, name='')
    gin = _build_with_feeds_fetches(sess=sess, graph=graph, feed_names=feed_names,
Let's change that, this is the style of this project.


def _from_checkpoint_impl(checkpoint_dir, signature_def_key, feed_names, fetch_names):
    """
Ah, good to know, thanks for the clarification.
:param signature_def_key: str, key (name) of the signature_def to use. It should be in
    the list of `signature_def` structures saved with the checkpoint.
"""
assert signature_def_key is not None
Ok.
@thunterdb I think I addressed your comments. Would you like to take another look? Thanks!
Looks good to me
* flat param API impl
* support input graph scenarios
* (WIP) new interface implementation
* docs and cleanup
* using tensorflow API instead of our utilities
* automatic type conversion
* cleanup
* PR comments
  1. Move `InputGraph` to its module.
* (WIP) address comments
* (WIP) respond to PR comments
* test refactor
* (wip) consolidating params
* rebase upstream
* import params fix
* (wip) TFInputGraph impl
* (wip) moving to new API
* (wip) enable saved_model tests
* (wip) enable checkpoint test
* (wip) enable multiple tensor tests
* enable all tests
* optimize graph for inference
* allows setting TFInputGraph
* utilize test_input_graph for transformer tests
* enable all tests

Signed-off-by: Philip Yang <philip.yang@databricks.com>

* input graph
* docs
* tensor tests
* tensor test update
* TFTransformer Part-4 Test Refactor (#15)
  * adding new tests
  * remove original test design
  * cleanup
  * deleting original testing ideas
  * PR comments
* update utils
* tests
* fix style

  Using the following YAPF style:

      based_on_style = pep8
      ALIGN_CLOSING_BRACKET_WITH_VISUAL_INDENT=True
      BLANK_LINE_BEFORE_NESTED_CLASS_OR_DEF=False
      COLUMN_LIMIT=100
      SPACE_BETWEEN_ENDING_COMMA_AND_CLOSING_BRACKET=False
      SPLIT_ARGUMENTS_WHEN_COMMA_TERMINATED=True
      SPLIT_BEFORE_FIRST_ARGUMENT=False
      SPLIT_BEFORE_NAMED_ASSIGNS=False
      SPLIT_PENALTY_AFTER_OPENING_BRACKET=30
      USE_TABS=False

* refactoring tfx API
* test refactoring
* PR comments
  1. docs in graph/utils.py
* (wip) utils test
* a few more tests for utils
* test update cont'd
* PR comments
* PR comments
* PR comments
* TensorFlow Transformer Part-3 (#10)
  * intro: TFInputGraph
  * tests
  * Merge branch 'tf-transformer-part1' into tf-transformer-part3
  * and so there is no helper classes
  * and into more pieces
  * class & docs
  * update docs
  * refactoring tfx API
  * update tfx utils usage
  * one way to build these tests
  * tests refactored
  * test cases in a single class
    This will make things easier when we want to extend other base class functions.
  * shuffle things around

  Signed-off-by: Philip Yang <philip.yang@databricks.com>

  * docs mostly
  * yapf'd
  * consolidate tempdir creation
  * (wip) PR comments
  * more tests
  * change test generator module name
* TFTransformer Part-3 Test Refactor (#14)
  * profiling
  * tests
  * renamed test
  * removed original tests
  * removed the profiler utils
  * fixes indents
  * imports
  * added some tests
  * added test
  * fix test
  * one more test
  * PR comments
* TensorFlow Transformer Part-4 (#11)
  * flat param API impl
  * support input graph scenarios
  * (WIP) new interface implementation
  * docs and cleanup
  * using tensorflow API instead of our utilities
  * automatic type conversion
  * cleanup
  * PR comments
    1. Move `InputGraph` to its module.
  * (WIP) address comments
  * (WIP) respond to PR comments
  * test refactor
  * (wip) consolidating params
  * rebase upstream
  * import params fix
  * (wip) TFInputGraph impl
  * (wip) moving to new API
  * (wip) enable saved_model tests
  * (wip) enable checkpoint test
  * (wip) enable multiple tensor tests
  * enable all tests
  * optimize graph for inference
  * allows setting TFInputGraph
  * utilize test_input_graph for transformer tests
  * enable all tests

  Signed-off-by: Philip Yang <philip.yang@databricks.com>

  * input graph
  * docs
  * tensor tests
  * tensor test update
  * TFTransformer Part-4 Test Refactor (#15)
    * adding new tests
    * remove original test design
    * cleanup
    * deleting original testing ideas
    * PR comments
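For reference, the YAPF options quoted in the commit message above would typically live in a `.style.yapf` file. This is a sketch, not a file from the repository; YAPF's config format expects a `[style]` section with lowercase option names:

```ini
[style]
based_on_style = pep8
align_closing_bracket_with_visual_indent = true
blank_line_before_nested_class_or_def = false
column_limit = 100
space_between_ending_comma_and_closing_bracket = false
split_arguments_when_comma_terminated = true
split_before_first_argument = false
split_before_named_assigns = false
split_penalty_after_opening_bracket = 30
use_tabs = false
```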
Introduces `TFInputGraph` and tests. `TFInputGraph` is used as the internal storage of the TensorFlow graph for the `TFTransformer`.