
@phi-dbq phi-dbq commented Sep 18, 2017

This is the first of a series of PRs split out from the TF Transformer API design POC PR.

It introduces parameters needed for the TFTransformer (minus those for TFInputGraph)
and corresponding type converters.

  • Introducing MLlib model Params for TFTransformer.
  • Type conversion utilities for the introduced MLlib Params used in Spark Deep Learning Pipelines.
    • We follow the MLlib convention of naming these utilities "converters", but most of them act as type checkers: they return the argument if it is of the desired type and raise TypeError otherwise.
    • We follow the practice that a type converter only returns a single type.
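The "checker" convention described above can be sketched in a few lines (hypothetical function name and messages, not the PR's actual code):

```python
# Hypothetical sketch of a checker-style "converter": it never coerces,
# it only returns the argument when it already has the desired type.
def asStrToStrMap(value):
    """Return `value` if it is a dict mapping str to str; raise TypeError otherwise."""
    if not isinstance(value, dict):
        raise TypeError("Could not convert %s to str-to-str mapping" % type(value))
    for key, val in value.items():
        if not (isinstance(key, str) and isinstance(val, str)):
            raise TypeError("Could not convert %s to str-to-str mapping" % type(value))
    return value
```

A converter written this way also honors the single-return-type rule: every successful call yields the same type.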

@phi-dbq phi-dbq changed the title TF Transformer Part-1: Params and Converters TensorFlow Graph Transformer Part-1: Params and Converters Sep 18, 2017
@phi-dbq phi-dbq force-pushed the tf-transformer-part1 branch from 6b2bc80 to 7f82e2d Compare September 18, 2017 22:22
@codecov-io

codecov-io commented Sep 18, 2017

Codecov Report

Merging #49 into master will increase coverage by 2.81%.
The diff coverage is 93.82%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #49      +/-   ##
==========================================
+ Coverage   82.82%   85.64%   +2.81%     
==========================================
  Files          23       29       +6     
  Lines        1217     1651     +434     
  Branches        5       17      +12     
==========================================
+ Hits         1008     1414     +406     
- Misses        209      237      +28
Impacted Files Coverage Δ
python/sparkdl/param/image_params.py 74.28% <ø> (ø) ⬆️
python/sparkdl/transformers/keras_applications.py 85.71% <ø> (-6.13%) ⬇️
python/sparkdl/transformers/named_image.py 93.51% <ø> (ø) ⬆️
python/sparkdl/graph/utils.py 100% <100%> (+4.93%) ⬆️
python/sparkdl/transformers/keras_image.py 100% <100%> (ø) ⬆️
python/sparkdl/param/__init__.py 100% <100%> (ø) ⬆️
python/sparkdl/transformers/tf_image.py 95.49% <100%> (-0.08%) ⬇️
python/sparkdl/graph/builder.py 93.75% <100%> (+0.05%) ⬆️
python/sparkdl/transformers/tf_tensor.py 100% <100%> (ø)
python/sparkdl/graph/tensorframes_udf.py 90% <100%> (ø) ⬆️
... and 11 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3f668d9...5cd03ad. Read the comment docs.

@phi-dbq phi-dbq force-pushed the tf-transformer-part1 branch from 7f82e2d to 323939a Compare September 18, 2017 22:27
@phi-dbq phi-dbq requested a review from sueann September 18, 2017 22:28
@sueann sueann left a comment

Some comments below. Let me know if you have any questions. Thanks!

conv.asColumnToTensorNameMap(invalid)
conv.asTensorNameToColumnMap(invalid)

with self.assertRaises(TypeError):
Collaborator

create a separate test for this section. tests should be as small as possible so it's easy to identify what is failing without having to rerun the tests with prints etc.

Collaborator

actually, isn't this covered by the case [('a', 1), ('b', 2)] above already?

Contributor Author

Did this as well. See above

def test_invalid_input_mapping(self):
    for invalid in [['a1', 'b2'], ('c3', 'd4'), [('a', 1), ('b', 2)],
                    {1: 'a', 2.0: 'b'}, {'a': 1, 'b': 2.0}]:
        with self.assertRaises(TypeError):
Collaborator

does the test pass if just one of asColumnToTensorNameMap and asTensorNameToColumnMap fail? there are many cases being tested here and it's hard to figure out what exactly we're expecting. if both of them should fail, we should split them into two with self.assertRaises sections.
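The suggested split reads roughly as follows (a sketch with hypothetical stand-in converters that always raise; the real ones live in SparkDLTypeConverters):

```python
import unittest

def asColumnToTensorNameMap(value):
    # Hypothetical stand-in: the real converter validates the mapping.
    raise TypeError("invalid mapping: %r" % (value,))

def asTensorNameToColumnMap(value):
    # Hypothetical stand-in.
    raise TypeError("invalid mapping: %r" % (value,))

class MappingParamTest(unittest.TestCase):
    def test_invalid_mapping_rejected(self):
        invalid = ['a1', 'b2']
        # One assertRaises block per converter call: if only one of the two
        # fails to raise, the failure points at that exact call.
        with self.assertRaises(TypeError):
            asColumnToTensorNameMap(invalid)
        with self.assertRaises(TypeError):
            asTensorNameToColumnMap(invalid)
```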

Contributor Author

Did that. Please check the params_test.py module and see if you like the way these tests are added.


def __init__(self):
super(HasInputCol, self).__init__()
inputCol = Param(
Collaborator

what was the final verdict on the python style here? I think the original here was the one? i.e.

  blah = Function(abcdefg, hijklmnopqrstuv,
                  wxyzlmnopaodifjwoeirjawoijr)

cc @MrBago @jkbradley

Collaborator

the style is inconsistent among the files in this PR so we should pick the style and make them consistent.

Collaborator

Confirmed with team that's the style we'll go with - please change the style for all the instances in the PR. Thanks!

Contributor Author

Did this with the updated YAPF settings.

for k, v in value.items():
    try:
        if is_key_tf_tensor:
            _pair = (tfx.as_tensor_name(k), v)
Collaborator

i thought we were supporting only tensor names not op names per the design doc. do we need this conversion logic in that case (if we support only tensor names)? it'd be better to support just the tensor name and avoid as much conversion logic as possible.

Contributor Author

Updated this to only allow strict tensor names.

Using the following YAPF style
========================================================
based_on_style = pep8
ALIGN_CLOSING_BRACKET_WITH_VISUAL_INDENT=True
BLANK_LINE_BEFORE_NESTED_CLASS_OR_DEF=False
COLUMN_LIMIT=100
SPACE_BETWEEN_ENDING_COMMA_AND_CLOSING_BRACKET=False
SPLIT_ARGUMENTS_WHEN_COMMA_TERMINATED=True
SPLIT_BEFORE_FIRST_ARGUMENT=False
SPLIT_BEFORE_NAMED_ASSIGNS=False
SPLIT_PENALTY_AFTER_OPENING_BRACKET=30
USE_TABS=False
========================================================
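For reference, these knobs would normally live in a `.style.yapf` file at the repository root so every contributor picks them up automatically (a sketch of the equivalent file; YAPF reads the `[style]` section):

```ini
[style]
based_on_style = pep8
align_closing_bracket_with_visual_indent = true
blank_line_before_nested_class_or_def = false
column_limit = 100
space_between_ending_comma_and_closing_bracket = false
split_arguments_when_comma_terminated = true
split_before_first_argument = false
split_before_named_assigns = false
split_penalty_after_opening_bracket = 30
use_tabs = false
```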
@phi-dbq phi-dbq force-pushed the tf-transformer-part1 branch from 7287ab7 to 4f11374 Compare September 21, 2017 22:00
@sueann

sueann commented Sep 25, 2017

Is this ready for review @phi-dbq ?

@sueann sueann left a comment

Some comments for you to address. The test code is a bit hard to parse, so will take a look again later...


@staticmethod
def asColumnToTensorNameMap(value):
    return _try_convert_tf_tensor_mapping(value, is_key_tf_tensor=False)
Collaborator

_try_convert_tf_tensor_mapping gets used twice, once with is_key_tf_tensor=True and once with is_key_tf_tensor=False. This means there really should just be two different functions. Try writing out the logic directly in asColumnToTensorNameMap and asTensorNameToColumnMap. If there are small bits of code that are sharable you can have helper functions, but let's not even do that and see how it looks. I bet it is much shorter and simpler. It is okay to have some duplicated code if the duplicated code is not complicated.
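The split suggested above might look like this (hypothetical sketch, not the PR's final code): two small functions that each state their own direction, sharing only a trivial string check.

```python
def _check_str(name):
    """Shared helper: require a plain string (a column or tensor name)."""
    if not isinstance(name, str):
        raise TypeError("expected str, got %s" % type(name))
    return name

def asColumnToTensorNameMap(value):
    """Check a dict of column name -> tensor name; return pairs sorted by column."""
    if not isinstance(value, dict):
        raise TypeError("expected dict, got %s" % type(value))
    return sorted((_check_str(col), _check_str(tnsr)) for col, tnsr in value.items())

def asTensorNameToColumnMap(value):
    """Check a dict of tensor name -> column name; return pairs sorted by tensor name."""
    if not isinstance(value, dict):
        raise TypeError("expected dict, got %s" % type(value))
    return sorted((_check_str(tnsr), _check_str(col)) for tnsr, col in value.items())
```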

Contributor

+1. You interleave condition checks for is_key_tf_tensor four times in that function; it is best to split it out.

Contributor Author

As part of refactoring, this helper function has gone away.

I wanted to combine them in the first place so that I do not have to duplicate the type checking logic.
It is always a trade-off. I hope that in this refactored version, the balance is tipped to our preference.

key-value object, storing parameters to be used to define the final
TensorFlow graph for the Transformer.
Currently accepted values are:
Collaborator

Other values will just get ignored, not result in errors, right? We can say "Currently used values are: "...

Contributor Author

mark as change doc => will modify as the last step

Collaborator

last step of this PR or in another PR? seems like a trivial change we can just make here intead of having to keep track of it for later.

Contributor Author

The reason is that any change triggers an hour-long Travis CI rebuild. It's reasonable to make the code changes first and update the docs once the code changes are accepted.

Contributor

@phi-dbq as a general philosophy, code changes should always come with their documentation (unless there is a specific task to track the documentation), because wrong documentation is even worse than no documentation.

@thunterdb thunterdb left a comment

@phi-dbq thanks. I have a number of substantial comments that will probably require some reworking.

Here are also some high level comments:

  • do not do meta manipulations (__new__, class manipulation, etc.). They cannot be checked, they are hard to understand, and for what we do there is always an alternative
  • document your functions: what goes in, what comes out. This is critical in python, where we do not use type signatures. Also, if you cannot write succinct documentation, it is a sign that this function is not well defined.


__all__ = ['SparkDLTypeConverters']

def _get_strict_tensor_name(_maybe_tnsr_name):
Contributor

nit: put the public methods at the top. In general, the most important content should be at the top of the file, because this is where you start reading from.

raise TypeError("Could not convert %s to str to tf.Tensor name mapping" % type(value))


class SparkDLTypeConverters(object):
Contributor

I do not see the need for putting all these methods in a class object and declaring them static: only this class is defined in the file, all these methods are internal, and they do different things (they are not all constructor methods for the class, for instance). There is no reason in my mind to put them in a class. The file can be renamed type_converters.py though to help understanding.

Contributor Author

This class has been there since we created this package.

class SparkDLTypeConverters(object):

The reason we had it in this format is to match the TypeConverter in Spark MLlib.
https://github.com/apache/spark/blob/master/python/pyspark/ml/param/__init__.py#L77

Contributor Author

In general, I believe this thread https://stackoverflow.com/questions/2438473/what-is-the-advantage-of-using-static-methods-in-python provides good explanations as to why we might want to use static method sometimes.

# See the License for the specific language governing permissions and
# limitations under the License.
#

Contributor

put a 3-line comments that explains what this file does. In general, it should always be the case.

Contributor Author

mark as change doc => will modify as the last step

Contributor

ok sounds good.

"input {} must be a valid tensor name".format(_maybe_tnsr_name)
return _maybe_tnsr_name

def _try_convert_tf_tensor_mapping(value, is_key_tf_tensor=True):
Contributor

do not use default arguments in private functions unless there are some very compelling reasons. Default arguments are a user-facing API design feature.

Contributor Author

As a part of refactoring, this function has changed already.

In general, though, I would say having default arguments is quite useful at least for the following reasons.

  1. Serve as type annotation: in most IDEs, the signature is the only thing to show when browsing code.
  2. In certain cases, provide meaningful default settings.


class SparkDLTypeConverters(object):
@staticmethod
def toTFGraph(value):
Contributor

put docs for each of these methods. What are the inputs? What are the outputs? It is impossible in python to know that unless you follow through the code, and you will be hard pressed to know what these functions do in 6 months.

Contributor Author

mark as change doc => will modify as the last step

if isinstance(value, tf.Graph):
    return value
else:
    raise TypeError("Could not convert %s to TensorFlow Graph" % type(value))
Contributor

tf.Graph to be more precise.

Contributor Author

Standardized all the converter err msgs

Contributor Author

mark as change doc => will modify as the last step


@staticmethod
def asColumnToTensorNameMap(value):
    return _try_convert_tf_tensor_mapping(value, is_key_tf_tensor=False)
Contributor

+1. You interleave condition checks for is_key_tf_tensor four times in that function; it is best to split it out.

if kmutil.is_valid_optimizer(value):
    return value
raise TypeError(
    "Named optimizer not supported in Keras: {} type({})".format(value, type(value)))
Contributor

nit: it is better to start with the type because it is very short and descriptive. The value itself may be a monster string, and you want to put the most relevant info first.

from ..tests import PythonUnitTestCase


class TestGenMeta(type):
Contributor

@phi-dbq sorry, this file is beyond my cognitive load to comprehend.

If I am having a hard time understanding it now, I will definitely have an even harder time diagnosing why a test is failing. Please write down every test separately using a function. A test has multiple purposes, in order of priority:

  1. provide through some examples the specification of a function
  2. when a modification is done subsequently, provide some assistance in diagnosing the failure
  3. and least important, test a specific implementation

This current file is probably right on the last point, but I am not able to verify the first point, and I am certain it will not be of assistance for the second point. I am happy to discuss it more if you want.

Contributor Author

I updated the test cases by using parameterized to generate individual test functions.

Contributor Author

Adding methods to a test class programmatically is a must for our use case. The unit-test framework we use does not have good support for parameterized tests, leaving us no choice but to look externally.

Meta-classes are a supported feature in Python, going back to Guido's treatise on them.
In fact, we already use them in this codebase.
They might require some effort to parse, but they are natively supported, making the test infra easier to maintain.

On the other hand, using an external project might make code look cleaner, but it might also add considerable maintenance burden to us. Fortunately, the one we found, parameterized, is an actively maintained small project.
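For comparison, the same "one generated method per case" effect is possible without a metaclass, by attaching methods with `setattr` (a stdlib-only sketch; the validator is a hypothetical stand-in for the converter under test):

```python
import unittest

def _as_str_key_map(value):
    # Hypothetical stand-in for the converter under test.
    if not isinstance(value, dict):
        raise TypeError("expected dict, got %r" % (value,))
    for key in value:
        if not isinstance(key, str):
            raise TypeError("expected str key, got %r" % (key,))
    return value

INVALID_CASES = [['a1', 'b2'], ('c3', 'd4'), {1: 'a'}]

class InvalidMappingTest(unittest.TestCase):
    pass

def _make_test(data):
    def test(self):
        with self.assertRaises(TypeError):
            _as_str_key_map(data)
    return test

# One named test method per case, so a failure identifies the exact input.
for idx, data in enumerate(INVALID_CASES):
    setattr(InvalidMappingTest, 'test_invalid_mapping_%d' % idx, _make_test(data))
```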

def __new__(mcs, name, bases, attrs):
    _add_invalid_col2tnsr_mapping_tests()
    attrs.update(_TEST_FUNCTIONS_REGISTRY)
    return super(TestGenMeta, mcs).__new__(mcs, name, bases, attrs)
Contributor

Please do not go meta, see my comment above.

Contributor Author

Noted, thanks! Please see my response above.

"""
# pylint: disable=protected-access

@classmethod
Contributor Author

remove this

@phi-dbq phi-dbq force-pushed the tf-transformer-part1 branch from 00828b6 to fcabcb6 Compare September 25, 2017 23:34
@sueann sueann left a comment

Some comments on the new test code. Will wait to review converters.py - let me know when that is ready. Thanks!

res = SparkDLTypeConverters.asTensorNameToColumnMap(valid_tnsr_output)
self.assertEqual(valid_output_mapping_result, res)

@parameterized.expand(_col2tnsr_test_cases)
Collaborator

this is much more readable! one question for you -- if the test fails on one of the cases, what would the output look like? will i be able to quickly identify which test case failed?

Contributor Author

Thanks :D
The test cases will be appended to the docstring of the base test function.
Here is what it looks like if a test fails.

============= Running the tests in: tests/param/params_test.py =============
Using TensorFlow backend.
Test invalid column name to tensor name mapping [with data=['a1', 'b2'], description='required pair but got single element'] ... ok
Test invalid column name to tensor name mapping [with data=('c3', 'd4'), description='required pair but got single element'] ... ok
Test invalid column name to tensor name mapping [with data=[('a', 1), ('b', 2)], description='only accept dict, but got list'] ... ok
Test invalid column name to tensor name mapping [with data={1: 'a', 2.0: 'b'}, description='wrong mapping type'] ... ok
Test invalid column name to tensor name mapping [with data={'a': 1.0, 'b': 2}, description='wrong mapping type'] ... ok
Test invalid column name to tensor name mapping [with data={'colB': 'tnsrOpB', 'colA': 'tnsrOpA'}, description='tensor name required'] ... ok
Test invalid column name to tensor name mapping [with data={'colB': 'tnsrOpB:1', 'colA': 'tnsrOpA:0'}, description='tensor name required'] ... FAIL
Test invalid tensor name to column name mapping [with data=['a1', 'b2'], description='required pair but got single element'] ... ok
Test invalid tensor name to column name mapping [with data=('c3', 'd4'), description='required pair but got single element'] ... ok
Test invalid tensor name to column name mapping [with data=[('a', 1), ('b', 2)], description='only accept dict, but got list'] ... ok
Test invalid tensor name to column name mapping [with data={1: 'a', 2.0: 'b'}, description='wrong mapping type'] ... ok
Test invalid tensor name to column name mapping [with data={'a': 1.0, 'b': 2}, description='wrong mapping type'] ... ok
Test invalid tensor name to column name mapping [with data={'tnsrOpB': 'colB', 'tnsrOpA': 'colA'}, description='tensor name required'] ... ok
Test valid input mapping conversion ... ok
Test valid output mapping conversion ... ok

======================================================================
FAIL: Test invalid column name to tensor name mapping [with data={'colB': 'tnsrOpB:1', 'colA': 'tnsrOpA:0'}, description='tensor name required']
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/parameterized/parameterized.py", line 392, in standalone_func
    return func(*(a + p.args), **p.kwargs)
  File "/Users/philipyang/CodeBase/spark-deep-learning/python/tests/param/params_test.py", line 71, in test_invalid_input_mapping
    SparkDLTypeConverters.asColumnToTensorNameMap(data)
AssertionError: TypeError not raised

----------------------------------------------------------------------
Ran 15 tests in 0.002s

]
_col2tnsr_test_cases = _shared_invalid_test_cases + [
    TestCase(data={'colA': 'tnsrOpA', 'colB': 'tnsrOpB'},
             description='strict tensor name required'),
Collaborator

'tensor name required' should be good enough, since it's not clear what "strict tensor name" is to everyone.

]
_tnsr2col_test_cases = _shared_invalid_test_cases + [
    TestCase(data={'tnsrOpA': 'colA', 'tnsrOpB': 'colB'},
             description='strict tensor name required'),
Collaborator

same here

TestCase = namedtuple('TestCase', ['data', 'description'])

_shared_invalid_test_cases = [
    TestCase(data=['a1', 'b2'], description='required pair but get single element'),
Collaborator

nit: get -> got

@phi-dbq phi-dbq force-pushed the tf-transformer-part1 branch from dd168ed to 77b3906 Compare September 26, 2017 18:28
if isinstance(value, dict):
    strs_pair_seq = []
    for _maybe_col_name, _maybe_tnsr_name in value.items():
        # Check if the non-tensor value is of string type
Contributor Author

change doc ?

if isinstance(value, dict):
    strs_pair_seq = []
    for _maybe_tnsr_name, _maybe_col_name in value.items():
        # Check if the non-tensor value is of string type
Contributor Author

change doc?

@phi-dbq

phi-dbq commented Sep 26, 2017

I will make minor doc updates in the final step.

@phi-dbq phi-dbq requested a review from sueann September 26, 2017 19:17
@phi-dbq phi-dbq requested a review from thunterdb September 26, 2017 19:17
@staticmethod
def asColumnToTensorNameMap(value):
    """
    Convert a value to a column name to :py:obj:`tf.Tensor` name mapping
Collaborator

Convert the given value to a mapping from column name to tf.Tensor name, if possible. The result will be a list of string pairs, sorted by the key (column name).

Collaborator

^ not sure if it's sorted by key or value, so please make it correct :-)

Contributor

given the context, it is sorted by key, but it is subtle indeed (lexicographic order on pairs)

Convert a value to a column name to :py:obj:`tf.Tensor` name mapping
as a sorted list of string pairs, if possible.
"""
if isinstance(value, dict):
Collaborator

I prefer returning early in the function for clarity as well as less indentation, e.g.

  if not isinstance(...):
    raise TypeError(...)
  
  # more complex logic here

but it's more of a personal preference here.

Contributor

+1 with Sue Ann. These functions are trivial, but this is good practice in the future.

Contributor Author

Done. All the "converters" follow this convention.

strs_pair_seq = []
for _maybe_tnsr_name, _maybe_col_name in value.items():
    # Check if the non-tensor value is of string type
    _col_name = _get_strict_col_name(_maybe_col_name)
@sueann sueann Sep 27, 2017

this comment was for tensor_name below

@thunterdb thunterdb left a comment

@phi-dbq thank you for the changes, we are getting close.

As a general comment, the name of a function should reflect what it currently does, not what it may eventually do in the future. A function like this:

def toTurtle(obj):
  if isinstance(obj, Turtle):
    return obj
  raise ValueError("{} is not a turtle!".format(obj))

is not converting objects to Turtle, it is just checking that an object is a Turtle. I would call it checkTurtle.

private APIs.
"""

import textwrap
Contributor

yet another module I was not aware of.

Collaborator

Is that good or bad @thunterdb ? :-)

Contributor

It is amazing! Python comes with batteries, spare tires and driving maps included.
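For readers who also had not met it: `textwrap.dedent` strips the whitespace margin common to all lines, which is handy for indented triple-quoted strings:

```python
import textwrap

raw = """
    first line
    second line
"""
# The four-space margin shared by all non-blank lines is removed.
cleaned = textwrap.dedent(raw)
assert cleaned == "\nfirst line\nsecond line\n"
```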

# See the License for the specific language governing permissions and
# limitations under the License.
#

Contributor

ok sounds good.

keras==2.0.4 # NOTE: this package has only been tested with keras 2.0.4 and may not work with other releases
nose>=1.3.7 # for testing
numpy>=1.11.2
parameterized>=0.6.1 # for testing
Contributor

yet another module I did not know about, great find and usage!

Contributor Author

Interestingly, Google uses something like this in their testing framework https://github.com/abseil/abseil-py/blob/master/absl/testing/parameterized.py


@staticmethod
def toTFHParams(value):
    """ Convert a value to a :py:obj:`tf.contrib.training.HParams` object, if possible. """
Contributor

this is not doing any conversion, this is just checking the cast.

Collaborator

I think this case is okay since it really is the convention in Spark ML... :-(


@staticmethod
def toStringOrTFTensor(value):
    """ Convert a value to a str or a :py:obj:`tf.Tensor` object, if possible. """
Contributor

Do not return different things, especially in python in which it is hard to track. From the way you are using it, it should always be a string.

@staticmethod
def supportedNameConverter(supportedList):
    """
    Create a converter that tries to check whether a value is part of the supported list.
Contributor

it is not a converter, it does not convert anything. This function should be called something like buildCheckList
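What the function does, in a generic sketch (the name is from the PR; the body is illustrative): it builds and returns a closure that checks membership, which is why a name like `buildCheckList` was proposed.

```python
def supportedNameConverter(supportedList):
    """Return a function that passes `value` through when it appears in
    `supportedList` and raises TypeError otherwise (a check, not a conversion)."""
    def converter(value):
        if value in supportedList:
            return value
        raise TypeError("%s not in the supported list %s" % (value, supportedList))
    return converter
```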

return converter

@staticmethod
def toKerasLoss(value):
Contributor

same thing, this is not a conversion.

Collaborator

I agree with @thunterdb on that function naming should be precise. However, the entire Spark MLlib Param class promotes this idea of a "TypeConverter", so I think it's actually more confusing if we start naming things not similarly to the ones in Spark MLlib (i.e., XXConverter, toYY)... Unless @jkbradley says otherwise :-P

Collaborator:

Talked with @jkbradley and he didn't have strong opinions, but since all of these functions return the input value if it is of type Y in toY(), we should keep the function names toY().

Contributor:

ok
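The convention the thread settles on can be sketched as follows. This is an illustrative example only, not the actual sparkdl implementation, and the set of loss names is an assumption for demonstration: a toY() converter returns its input unchanged when the input is already a valid Y, and raises TypeError otherwise.

```python
# Hedged sketch of the MLlib-style "toY" converter convention agreed on above.
# The loss names below are assumptions for illustration, not the real
# sparkdl/Keras list.

def toKerasLoss(value):
    """Return `value` if it names a supported Keras loss, else raise TypeError."""
    _SUPPORTED_LOSSES = {"categorical_crossentropy", "mse", "mae"}
    if isinstance(value, str) and value in _SUPPORTED_LOSSES:
        return value
    err_msg = "Could not convert [type {}] {} to a Keras loss name"
    raise TypeError(err_msg.format(type(value), value))
```

Note the defining property discussed above: for valid input, `toKerasLoss(v)` simply returns `v`.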

raise TypeError(err_msg.format(type(value), value))


def _get_strict_tensor_name(_maybe_tnsr_name):
Contributor:

I feel we already have a million of these methods.

__all__ = ['SparkDLTypeConverters']


class SparkDLTypeConverters(object):
Contributor:

I know that some modules in Spark do implement some static functions in classes, but I question the use of this pattern in this module, because these are all internal, disparate functions that do not need to be associated together, and we do not have a strong focus on exposing a developer API like MLlib.

@MrBago, do you see some compelling reasons for grouping functions together in this context? I know some other modules have this pattern, but my sense is that this is not the pythonic way of doing things.

Contributor Author:

The reasons I started it this way are:

  • There are other private methods that we don't want to export. Otherwise, we have to manually specify every single exported method.
  • MLlib developers have used this pattern for a while. They may feel more comfortable using it this way.

Collaborator @sueann left a comment:

Looks like everyone is happy with the new way of testing. @phi-dbq it's ready for you to make revisions. There is a pending question of whether the SparkDLConverter class should exist, but we can handle that subsequently (not a big deal).

strs_pair_seq = []
for _maybe_tnsr_name, _maybe_col_name in value.items():
# Check if the non-tensor value is of string type
_col_name = _get_strict_col_name(_maybe_col_name)
Collaborator:

These _get_strict... functions don't actually convert anything, so we don't need to call functions that look like they do. Instead of the get_strict... functions, I'd create two functions: _check_is_str and _check_is_tensor_name that throw errors if the types are not right (the latter can use the former inside). Then this section becomes (from line 71-75):

for tensor_name, col_name in value.items():
  _check_is_str(col_name)
  _check_is_tensor_name(tensor_name)

raise TypeError(err_msg.format(type(value), value))


def _get_strict_tensor_name(_maybe_tnsr_name):
Collaborator:

As per my comment above, this can be changed to _check_is_tensor_name(name) that just checks that name is a string (using _check_is_str) and that doesn't have a ":".

The current logic in _get_strict_tensor_name duplicates checks in as_tensor_name, and as_tensor_name does more than what we want here, so there is no need to call that. I understand we want to potentially give more descriptive error messages, but I think just checking that it roughly looks like a tensor name is good enough (and that's all we're doing here currently anyway).

@phi-dbq phi-dbq force-pushed the tf-transformer-part1 branch from 981e8dd to 7ae89e4 Compare September 28, 2017 21:48
converters simplification
@phi-dbq phi-dbq force-pushed the tf-transformer-part1 branch from 7ae89e4 to 76e9fb9 Compare September 28, 2017 21:51
@phi-dbq phi-dbq force-pushed the tf-transformer-part1 branch from 852713a to 66507f4 Compare September 29, 2017 01:30
Contributor Author @phi-dbq left a comment:

I believe all the comments are addressed.
I will do a final round updating the docs if there are no further comments.


__all__ = ['TFImageTransformer']

IMAGE_INPUT_TENSOR_NAME = tfx.as_tensor_name(utils.IMAGE_INPUT_PLACEHOLDER_NAME)
Contributor Author:

These are only used in the current file, no need to put them inside the class.

outputMode="vector")
"""
super(TFImageTransformer, self).__init__()
self._setDefault(inputTensor=utils.IMAGE_INPUT_PLACEHOLDER_NAME)
Contributor Author:

Best to put them into setParams so that it is clear they are set there.

Contributor Author:

setting inputTensor=utils.IMAGE_INPUT_PLACEHOLDER_NAME will cause TensorFlow to throw an error since utils.IMAGE_INPUT_PLACEHOLDER_NAME cannot be interpreted as a Tensor name.

Collaborator:

Did you run into a problem with tf_image.py in this PR or tests? Not sure why this change was made here.

Collaborator:

What happens with inputTensor if someone never uses setParams? Seems breakable.

Contributor Author:

Previously, if anyone chose not to set inputTensor (either in the constructor or via the setter method), the default utils.IMAGE_INPUT_PLACEHOLDER_NAME would be used. (In fact, the default is always set to this value.)

Later on, we call self.getGraph().get_tensor_by_name(tensor_name), which will cause TensorFlow to throw an exception if the default value is used.

It happens that all of our code sets inputTensor. But if our users choose to interact directly with TFImageTransformer and use the default, they will encounter that exception.

Collaborator:

Sorry I missed that __init__ calls setParams so defaults should always be set and it's okay.
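The default-resolution behavior this exchange converges on can be illustrated with a tiny standalone mock. This is not pyspark; the method names merely mirror pyspark.ml.param.Params for readability. The point: a default registered in __init__ via _setDefault is returned by getOrDefault whenever no explicit value was ever set.

```python
# Standalone mock of the Params default-resolution pattern; not pyspark.

class ParamsSketch:
    def __init__(self):
        self._defaults = {}
        self._paramMap = {}

    def _setDefault(self, **kwargs):
        self._defaults.update(kwargs)

    def _set(self, **kwargs):
        self._paramMap.update(kwargs)

    def getOrDefault(self, name):
        # Explicitly set values win; otherwise fall back to the default.
        return self._paramMap.get(name, self._defaults[name])


class TransformerSketch(ParamsSketch):
    def __init__(self):
        super(TransformerSketch, self).__init__()
        self._setDefault(inputTensor="input_placeholder")
```

So even if setParams is never called, getOrDefault("inputTensor") still yields the default registered in __init__.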

# Add on the original graph
tf.import_graph_def(gdef, input_map={input_tensor_name: image_reshaped_expanded},
return_elements=[self.getOutputTensor().name],
name=self.USER_GRAPH_NAMESPACE)
Contributor Author:

removing private constants out of class def

return g

def _getOriginalOutputTensorName(self):
return self.USER_GRAPH_NAMESPACE + '/' + self.getOutputTensor().name
Contributor Author:

removing private constants out of class def

return USER_GRAPH_NAMESPACE + '/' + self.getOutputTensor().name

def _getFinalOutputTensorName(self):
return self.NEW_OUTPUT_PREFIX + '_' + self.getOutputTensor().name
Contributor Author:

removing private constants out of class def


@phi-dbq phi-dbq requested review from sueann and thunterdb September 29, 2017 05:38
Collaborator @sueann left a comment:

Mostly small comments. I'm not sure why changes in tf_image.py got added here. In the future, could you explain in the PR description or comments why the changes are related to the PR, or, if they are not strictly related, create a different PR? We can leave it in here for this PR since I've already reviewed it.

# pylint: disable=wrong-spelling-in-docstring,invalid-name,import-error

""" SparkDLTypeConverters
Type conversion utilities for definition Spark Deep Learning related MLlib `Params`.
Collaborator:

definition -> defining

"""
.. note:: DeveloperApi
Factory methods for type conversion functions for :py:func:`Param.typeConverter`.
Collaborator:

These aren't really "Factory methods". "Methods" should be fine.

# Conversion logic after quick type check
strs_pair_seq = []
for _maybe_col_name, _maybe_tnsr_name in value.items():
# Check if the non-tensor value is of string type
Collaborator:

the function name is self-explanatory so you don't need this comment. Comments aren't always good because the code logic can change while the comment is forgotten and not updated. There are lots of wrong comments in many codebases.

err_msg = "Could not convert [type {}] {} to column name to tf.Tensor name mapping"
raise TypeError(err_msg.format(type(value), value))

# Conversion logic after quick type check
Collaborator:

remove comment - this should be obvious

for _maybe_col_name, _maybe_tnsr_name in value.items():
# Check if the non-tensor value is of string type
_check_is_str(_maybe_col_name)
# Check if the tensor name looks like a tensor name
Collaborator:

remove comment

raise TypeError(err_msg.format(type(value), value, exc))

@staticmethod
def buildCheckList(supportedList):
Collaborator:

not sure what the new name is conveying. why the change?

Collaborator:

let's try not to change existing code that is not relevant to the current PR unless it's really trivial (not just implementation but also the reason). it adds to the review & discussion time unnecessarily.

Contributor Author:

This change was requested in a previous review comment.
The original name was a bit confusing, as this function itself is not a "converter"; it builds a "converter" that checks whether value is in the supportedList.
The name was also suggested in the review comment. I usually take these types of suggestions.

Collaborator:

Sorry, it's really hard to see previous comments in GitHub. But regardless, buildCheckList doesn't convey to me that it's building one of these Converters that checks against a list. Maybe buildSupportedItemConverter or buildSupportedItemChecker. I'd still go with Converter because it's the common terminology in MLlib for where these are used (as the last parameter "TypeConverter" in Param definitions).
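Whatever the final name, the factory under debate is small. A sketch follows, using buildSupportedItemConverter, one of the names floated in the review; this is illustrative only and not necessarily what was merged.

```python
# Sketch of the converter factory debated above: it returns a "converter"
# that passes a value through if it is in the supported list and raises
# TypeError otherwise. The function name is one of the review's suggestions.

def buildSupportedItemConverter(supportedList):
    def converter(value):
        if value in supportedList:
            return value
        err_msg = "[type {}] {} is not in the supported list {}"
        raise TypeError(err_msg.format(type(value), value, supportedList))
    return converter
```

A Param could then use, e.g., typeConverter=buildSupportedItemConverter(["vector", "image"]) (hypothetical values).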

key-value object, storing parameters to be used to define the final
TensorFlow graph for the Transformer.
Currently accepted values are:
Collaborator:

last step of this PR or in another PR? seems like a trivial change we can just make here instead of having to keep track of it for later.

return tensor_or_name
else:
return self.getGraph().get_tensor_by_name(tensor_or_name)
tensor_name = self.getOrDefault(self.inputTensor)
Collaborator:

I'm confused why this logic was changed.
1/ does tf.Graph.get_tensor_by_name(tf.Tensor) return the Tensor?
2/ let's not change files that aren't being touched at all by the main objective of the PR as it increases the review time unnecessarily. you can make mini PRs for these if you really want.

Contributor Author:

  1. tf.Graph.get_tensor_by_name(tensor_name) returns the tensor. But it requires the tensor name, not the operation name. It eventually follows this code path.
  2. This kind of "converter-return-single-type" motif has been requested several times in the past PR comments. The changes needed to enforce it is rather small and not touching any core logic.
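The distinction at issue: tf.Graph.get_tensor_by_name requires a tensor name ("op_name:output_index") and rejects a bare operation name. The helper below only mirrors the naming convention behind a tfx.as_tensor_name-style utility, sketched without TensorFlow so it stands alone; it is not the sparkdl implementation.

```python
# Sketched without TensorFlow: a tensor name carries an output index
# ("op:0"), while an operation name does not ("op"). Passing the latter to
# tf.Graph.get_tensor_by_name fails, hence normalizing the name first.

def as_tensor_name(name):
    """Append the default output index to a bare operation name."""
    return name if ":" in name else name + ":0"
```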

Collaborator:

I see what bug it was fixing, but in the future PLEASE put such unrelated changes in another PR. This kind of distraction causes delays in getting the PR in, as well as cause additional difficulties in debugging later on when trying to identify which PR a bug may come from. For example, most of my comments in this review round are about this file. In the future, I will ask you to break up the PR in such cases.



Contributor Author @phi-dbq left a comment:

Updated the PR for requested changes and explained other things. READY


key-value object, storing parameters to be used to define the final
TensorFlow graph for the Transformer.
Currently accepted values are:
Contributor Author:

The reason is that any change takes an hour-long rebuild on Travis CI. It's reasonable to do the code change first and then the doc update when the code changes are accepted.



Collaborator @sueann left a comment:

Other than the comment about the buildCheckList function name, all looks good to me.



Collaborator @sueann left a comment:

lgtm! will merge !!

TensorFlow graph for the Transformer.
Currently accepted values are:
Currently used values are:
Collaborator:

you can make the change in a later PR, but let's add that these are optional (they are, right?)

@sueann (Collaborator) commented Oct 4, 2017:

Discussed offline - will merge once all parts are approved.

@thunterdb (Contributor):

@phi-dbq let's merge all the parts into this PR, and then merge it.

* update utils

* tests

* fix style

Using the following YAPF style
========================================================
based_on_style = pep8
ALIGN_CLOSING_BRACKET_WITH_VISUAL_INDENT=True
BLANK_LINE_BEFORE_NESTED_CLASS_OR_DEF=False
COLUMN_LIMIT=100
SPACE_BETWEEN_ENDING_COMMA_AND_CLOSING_BRACKET=False
SPLIT_ARGUMENTS_WHEN_COMMA_TERMINATED=True
SPLIT_BEFORE_FIRST_ARGUMENT=False
SPLIT_BEFORE_NAMED_ASSIGNS=False
SPLIT_PENALTY_AFTER_OPENING_BRACKET=30
USE_TABS=False
========================================================

* refactoring tfx API

* test refactoring

* PR comments

1. docs in graph/utils.py

* (wip) utils test

* a few more tests for utils

* test update cont'd

* PR comments

* PR comments

* PR comments

* TensorFlow Transformer Part-3 (#10)

* intro: TFInputGraph

* tests

* Merge branch 'tf-transformer-part1' into tf-transformer-part3

* and so there is no helper classes

* and into more pieces

* class & docs

* update docs

* refactoring tfx API

* update tfx utils usage

* one way to build these tests

* tests refactored

* test cases in a single class

This will make things easier when we want to extend other base class functions.

* shuffle things around

Signed-off-by: Philip Yang <philip.yang@databricks.com>

* docs mostly

* yapf'd

* consolidate tempdir creation

* (wip) PR comments

* more tests

* change test generator module name

* TFTransformer Part-3 Test Refactor (#14)

* profiling

* tests

* renamed test

* removed original tests

* removed the profiler utils

* fixes indents

* imports

* added some tests

* added test

* fix test

* one more test

* PR comments

* TensorFlow Transformer Part-4 (#11)

* flat param API impl

* support input graph scenarios

* (WIP) new interface implementation

* docs and cleanup

* using tensorflow API instead of our utilities

* automatic type conversion

* cleanup

* PR comments

1. Move `InputGraph` to its module.

* (WIP) address comments

* (WIP) respond to PR comments

* test refactor

* (wip) consolidating params

* rebase upstream

* import params fix

* (wip) TFInputGraph impl

* (wip) moving to new API

* (wip) enable saved_model tests

* (wip) enable checkpoint test

* (wip) enable multiple tensor tests

* enable all tests

* optimize graph for inference

* allows setting TFInputGraph

* utilize test_input_graph for transformer tests

* enable all tests

Signed-off-by: Philip Yang <philip.yang@databricks.com>

* input graph

* docs

* tensor tests

* tensor test update

* TFTransformer Part-4 Test Refactor (#15)

* adding new tests

* remove original test design

* cleanup

* deleting original testing ideas

* PR comments
@phi-dbq (Contributor Author) commented Nov 22, 2017:

@thunterdb thanks! We can squash and merge with a cleaned up commit message whenever we are ready.

Contributor @thunterdb left a comment:

I have a few comments that do not warrant another round of review, but that should be thought about in the future.

I will merge this PR. Thanks for the hard work @phi-dbq .

"""
if graph is not None:
return get_op(tfobj_or_name, graph).name
if isinstance(tfobj_or_name, six.string_types):
Contributor:

this could be combined with the code above

graph = tnsr.graph

# Test for op_name
yield TestCase(data=(op_name, tfx.op_name(tnsr)),
Contributor:

Does it need to be generator-style instead of lists? Lists are easier to understand.

raise TypeError('invalid tf.Operation name query type {}'.format(type(tfobj_or_name)))

def validated_output(graph, tfobj_or_name):
def validated_output(tfobj_or_name, graph):
Contributor:

it is very dangerous to change the order of parameters in python, because you have no guarantee that you will replace all the call sites correctly (and the users of your library will have no warning, unless they use call by name).
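One defensive option for the hazard described here is to force call-by-name, so a later reordering of the signature cannot silently swap arguments. A sketch follows; the function body is a placeholder, not the real validated_output.

```python
# Keyword-only parameters make a later reorder of the signature harmless:
# positional calls are rejected outright, so call sites cannot silently
# pass arguments in the wrong slots. The body below is illustrative only.

def validated_output(*, tfobj_or_name, graph):
    return "{}::{}".format(graph, tfobj_or_name)

# Callers must name both arguments; order at the call site no longer matters.
result = validated_output(tfobj_or_name="out:0", graph="g")
```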

@thunterdb thunterdb merged commit 0dc0d70 into databricks:master Nov 22, 2017