TFTransformer Part-3 Test Refactor #14
Conversation
Codecov Report
@@             Coverage Diff              @@
##   tf-transformer-part3     #14   +/- ##
==========================================
- Coverage         84.19%   84.04%  -0.15%
==========================================
  Files                25       25
  Lines              1379     1379
  Branches              5        5
==========================================
- Hits               1161     1159      -2
- Misses              218      220      +2

Continue to review full report at Codecov.
Thanks, @thunterdb! I haven't had the opportunity to take a deep look yet.
@phi-dbq good point about assets, I did not know about these. I agree that they should be covered eventually. For now, though, the models we have seen in action do not have them, and we can eventually add tests specific to asset-laden models.
Some initial comments.
from sparkdl.graph.input import TFInputGraph


class TestGraphImport(object):
Shall we use unittest.TestCase, so as to be compatible with the rest of the code base? Thanks!
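For illustration, a minimal sketch of what the suggestion amounts to (the class and test names here are stand-ins, not the actual test file):

```python
import unittest

class TestGraphImport(unittest.TestCase):
    """Same tests, but derived from unittest.TestCase so the class is
    picked up by the standard runner like the rest of the code base."""

    def test_placeholder_example(self):
        # Stand-in assertion; the real tests would build a TensorFlow
        # graph and check TFInputGraph construction.
        self.assertEqual(1 + 1, 2)
```

A `unittest.TestCase` subclass also gets `setUp`/`tearDown` hooks and the `assert*` helpers for free, which matters once resource cleanup is involved.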
It takes a vector input, with vec_size columns and returns an int32 scalar.
"""
x = tf.placeholder(tf.int32, shape=[_tensor_size], name=_tensor_input_name)
(Context: test requirements)
As we discussed earlier in a TensorFrames PR, we need to test graphs with float Tensors. This is because Spark uses double for vector types while TensorFlow is very strict about Tensor typing; mismatched types typically surface as pages of Catalyst error output that are hard to debug.
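The dtype discipline at issue can be shown with numpy alone (TensorFlow is not needed to make the point; the variable names are illustrative):

```python
import numpy as np

# Spark ML vectors are double precision (float64), while TensorFlow
# graphs are commonly built around float32 tensors. The cast has to be
# made explicit somewhere, or the mismatch surfaces far downstream.
spark_style = np.array([1.0, 2.0, 3.0], dtype=np.float64)  # what Spark hands over
graph_dtype = np.float32                                   # what the graph expects

assert spark_style.dtype != graph_dtype    # the silent-mismatch risk
fed = spark_style.astype(graph_dtype)      # the explicit, safe cast
assert fed.dtype == np.float32
```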
It takes a vector input, with vec_size columns and returns an int32 scalar.
"""
x = tf.placeholder(tf.int32, shape=[_tensor_size], name=_tensor_input_name)
_ = tf.reduce_max(x, name=_tensor_output_name)
We may also need to test more complex graphs to make sure the floating point computation does not differ too much. There have been cases where people spent a lot of time figuring out why different frameworks give different results for the same deep learning model.
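One way such a test could compare results: evaluate the same op at two precisions (standing in for two frameworks) and assert closeness with a tolerance rather than bitwise equality, which is too strict across implementations:

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 7, dtype=np.float32)
ref = np.tanh(x.astype(np.float64))   # "reference" framework, float64
got = np.tanh(x).astype(np.float64)   # "other" framework, float32

# Bitwise equality would be brittle; tolerances catch real divergence
# while tolerating last-ulp differences.
assert np.allclose(got, ref, rtol=1e-5, atol=1e-6)
```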
_serving_tag = "serving_tag"
_serving_sigdef_key = 'prediction_signature'
# The name of the input tensor
_tensor_input_name = "input_tensor"
(Context: test requirements)
As Andy Feng has commented before, we need to test support for graphs with multiple inputs and multiple outputs.
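A plain-Python sketch (no TensorFlow) of the shape such a test would take: a graph with two named inputs and two named outputs, exercised by feeding all inputs and fetching all outputs in one call. All tensor names here are hypothetical:

```python
def run_graph(feeds):
    """Stand-in for session.run with multiple feeds and fetches."""
    a, b = feeds["input_a:0"], feeds["input_b:0"]
    return {"sum:0": a + b, "diff:0": a - b}

# A multi-input/multi-output case must check every output in one pass,
# not one input/output pair at a time.
fetched = run_graph({"input_a:0": 5.0, "input_b:0": 2.0})
assert fetched == {"sum:0": 7.0, "diff:0": 3.0}
```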
Here is the original comment:
databricks#39 (comment)
We ended up implementing support for multi-input-multi-output graphs in TFInputGraph.
builder.save()


@contextlib.contextmanager
(Context: testing infra)
In general, we should use fixtures for handling resources.
python/tests/graph/test_import.py
Outdated
finally:
    shutil.rmtree(temp_dir)

def _build_graph_input(gin_function):
(Context: design)
I am wondering, would we call this pattern dependency injection?
(sync'ed offline)
This pattern is used both in the proposed change and in the original test framework, though in the latter case it was implemented as a context manager. We both agreed that this pattern allows a sufficient amount of code to be shared, which is good.
_tensor_size = 3


def _build_checkpointed_model(session, tmp_dir):
(Context: design)
We actually modify the graph in the session by assigning signatures to it.
I am just trying to see if this is okay in terms of immutability.
(sync'ed offline)
Mutating the TensorFlow session/graph is acceptable, since we cannot modify the library we are using. Our goal is to limit the amount of mutability on our side.
gin = _build_graph_input_var(gin_fun)
_check_input_novar(gin)

def test_checkpoint_nosig_var(self):
(Context: test requirements)
This is not needed. Checkpoints are only useful for training, where the model contains variables.
python/tests/graph/test_import.py
Outdated
_check_input_novar(gin)

# TODO: we probably do not need this test
def test_saved_graph_novar(self):
(Context: test requirements)
This is actually useful. SavedModels are designed for inference/serving. It should be named test_saved_graph_novar.
Some additional comments on the design.
gin = TFInputGraph.fromGraph(sess.graph, sess, self.feed_names, self.fetch_names)
self.register(gin=gin, description='saved model with graph')

def build_from_checkpoint(self):
(From offline discussion)
@thunterdb commented that the tests are trying to satisfy more requirements than are needed.
To be specific, I was trying to design the test framework to test multiple different TensorFlow graphs, which is an unnecessary requirement.
# The basic stage contains the opaque :py:obj:`TFInputGraph` objects.
# Any derived class that overrides the :py:obj:`build_input_graphs`
# method will populate this field.
self.input_graphs = []
(sync'ed offline)
@thunterdb mentioned that the main complaint was that the fields input_graphs and test_cases are modified during test case construction, which might hurt the code's readability. This is, unfortunately, a workaround forced by limitations of Python's unittest framework.
[_tensor_input_name], [_tensor_output_name]))
_check_input_novar(gin)

def test_saved_model_novar(self):
(Context: design)
One issue with the current design is that, in order to test two types of graphs, we essentially doubled the amount of code (and actually missed test_graphdef_var, test_graph_var, etc.).
python/tests/graph/test_import.py
Outdated
_ = tf.reduce_max(x, name=_tensor_output_name)


def _build_graph_var(session):
(Context: design)
Readability-wise, a reader has to understand the workflow:
=> _build_graph_xxx (builds the original TensorFlow graph)
=> _build_graph_input_xxx (sets the graph up under an active TensorFlow session)
   => gin_function (specified in each individual test case), where _build_checkpointed_model or _build_saved_model injects additional data into the graph
=> _check_input_xxx (provides input and expected output for the TensorFlow graph)
=> _check_output_xxx (executes the TensorFlow graph and checks the result)
As core library developers, we understand this (it took some effort for me). I am not sure it is easy to understand for other contributors.
(Context: design)
In addition, each graph type represented by _build_graph_xxx needs every function ending in _xxx replicated for it. Although, as we have discussed, given the time constraints these concerns are deliberately set aside as non-essential.
As @thunterdb and I have discussed, I will merge this into my branch.
* intro: TFInputGraph
* tests
* Merge branch 'tf-transformer-part1' into tf-transformer-part3
* and so there is no helper classes
* and into more pieces
* class & docs
* update docs
* refactoring tfx API
* update tfx utils usage
* one way to build these tests
* tests refactored
* test cases in a single class
  This will make things easier when we want to extend other base class functions.
* shuffle things around
  Signed-off-by: Philip Yang <philip.yang@databricks.com>
* docs mostly
* yapf'd
* consolidate tempdir creation
* (wip) PR comments
* more tests
* change test generator module name
* TFTransformer Part-3 Test Refactor (#14)
* profiling
* tests
* renamed test
* removed original tests
* removed the profiler utils
* fixes indents
* imports
* added some tests
* added test
* fix test
* one more test
* PR comments
* TensorFlow Transformer Part-4 (#11)
* flat param API impl
* support input graph scenarios
* (WIP) new interface implementation
* docs and cleanup
* using tensorflow API instead of our utilities
* automatic type conversion
* cleanup
* PR comments
  1. Move `InputGraph` to its module.
* (WIP) address comments
* (WIP) respond to PR comments
* test refactor
* (wip) consolidating params
* rebase upstream
* import params fix
* (wip) TFInputGraph impl
* (wip) moving to new API
* (wip) enable saved_model tests
* (wip) enable checkpoint test
* (wip) enable multiple tensor tests
* enable all tests
* optimize graph for inference
* allows setting TFInputGraph
* utilize test_input_graph for transformer tests
* enable all tests
  Signed-off-by: Philip Yang <philip.yang@databricks.com>
* input graph
* docs
* tensor tests
* tensor test update
* TFTransformer Part-4 Test Refactor (#15)
* adding new tests
* remove original test design
* cleanup
* deleting original testing ideas
* PR comments
* update utils
* tests
* fix style
  Using the following YAPF style:
  ========================================================
  based_on_style = pep8
  ALIGN_CLOSING_BRACKET_WITH_VISUAL_INDENT=True
  BLANK_LINE_BEFORE_NESTED_CLASS_OR_DEF=False
  COLUMN_LIMIT=100
  SPACE_BETWEEN_ENDING_COMMA_AND_CLOSING_BRACKET=False
  SPLIT_ARGUMENTS_WHEN_COMMA_TERMINATED=True
  SPLIT_BEFORE_FIRST_ARGUMENT=False
  SPLIT_BEFORE_NAMED_ASSIGNS=False
  SPLIT_PENALTY_AFTER_OPENING_BRACKET=30
  USE_TABS=False
  ========================================================
* refactoring tfx API
* test refactoring
* PR comments
  1. docs in graph/utils.py
* (wip) utils test
* a few more tests for utils
* test update cont'd
* PR comments
* PR comments
* PR comments
* TensorFlow Transformer Part-3 (#10)
* intro: TFInputGraph
* tests
* Merge branch 'tf-transformer-part1' into tf-transformer-part3
* and so there is no helper classes
* and into more pieces
* class & docs
* update docs
* refactoring tfx API
* update tfx utils usage
* one way to build these tests
* tests refactored
* test cases in a single class
  This will make things easier when we want to extend other base class functions.
* shuffle things around
  Signed-off-by: Philip Yang <philip.yang@databricks.com>
* docs mostly
* yapf'd
* consolidate tempdir creation
* (wip) PR comments
* more tests
* change test generator module name
* TFTransformer Part-3 Test Refactor (#14)
* profiling
* tests
* renamed test
* removed original tests
* removed the profiler utils
* fixes indents
* imports
* added some tests
* added test
* fix test
* one more test
* PR comments
* TensorFlow Transformer Part-4 (#11)
* flat param API impl
* support input graph scenarios
* (WIP) new interface implementation
* docs and cleanup
* using tensorflow API instead of our utilities
* automatic type conversion
* cleanup
* PR comments
  1. Move `InputGraph` to its module.
* (WIP) address comments
* (WIP) respond to PR comments
* test refactor
* (wip) consolidating params
* rebase upstream
* import params fix
* (wip) TFInputGraph impl
* (wip) moving to new API
* (wip) enable saved_model tests
* (wip) enable checkpoint test
* (wip) enable multiple tensor tests
* enable all tests
* optimize graph for inference
* allows setting TFInputGraph
* utilize test_input_graph for transformer tests
* enable all tests
  Signed-off-by: Philip Yang <philip.yang@databricks.com>
* input graph
* docs
* tensor tests
* tensor test update
* TFTransformer Part-4 Test Refactor (#15)
* adding new tests
* remove original test design
* cleanup
* deleting original testing ideas
* PR comments
@phi-dbq here is an alternative architecture for the tests. It does not exactly match all the tests that you wrote, but it should cover all the important use cases that you outlined.
Some points to note:
Please comment on the PR, so that we can merge it into your branch.