Expand Sequence list #692

us · 2019-06-19T10:57:54Z

Sequence gets expanded, for better readability.

Related issue #689

Conchylicultor · 2019-06-19T16:35:15Z

Thank you for fixing this. This fix seems a little hacky though.
I would prefer a more explicit check:

if isinstance(feature, FeaturesDict) or (
    isinstance(feature, Sequence) and isinstance(feature.feature, FeaturesDict)
):

us · 2019-06-19T21:11:42Z

I think every complicated structure needs to start on a new line. I thought of using {} to decide where to break lines, so that every feature with keys is shown with {}.

us · 2019-06-19T21:14:51Z

At first I did it the way you suggest, but then I chose to write more general code. I can go back if you think the code should be more understandable and readable.

Conchylicultor · 2019-06-20T20:45:07Z

The problem with if hasattr(v, 'keys'): is that if another FeatureConnector implements a .keys attribute, the code will crash or behave in some unexpected way. I agree that your solution is shorter, but it seems more hacky.
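To make the concern concrete, here is a minimal sketch using toy stand-ins for the feature classes. None of these classes are the real tensorflow_datasets ones, and Vocabulary is a purely hypothetical connector invented for illustration. It compares the duck-typing test with the explicit isinstance check suggested earlier in the thread.

```python
class FeatureConnector(object):
    """Toy base class; not the real TFDS FeatureConnector."""


class FeaturesDict(FeatureConnector):
    def __init__(self, feature_dict):
        self._feature_dict = dict(feature_dict)

    def keys(self):
        return self._feature_dict.keys()


class Sequence(FeatureConnector):
    def __init__(self, feature):
        self.feature = feature


class Vocabulary(FeatureConnector):
    """Hypothetical connector that happens to expose a `keys` attribute."""

    def __init__(self, words):
        self.keys = list(words)  # A plain list, not a dict-like method.


def is_nested_by_hasattr(feature):
    # Duck typing: anything with a `keys` attribute is treated as a dict.
    return hasattr(feature, 'keys')


def is_nested_explicit(feature):
    # Explicit check, following the condition proposed in the review.
    return isinstance(feature, FeaturesDict) or (
        isinstance(feature, Sequence)
        and isinstance(feature.feature, FeaturesDict)
    )


vocab = Vocabulary(['a', 'b'])
nested = Sequence(FeaturesDict({'x': vocab}))

print(is_nested_by_hasattr(vocab))  # Misclassified as dict-like.
print(is_nested_explicit(vocab))    # Correctly rejected.
print(is_nested_explicit(nested))   # Sequence of FeaturesDict is accepted.
```

In this toy setup the hasattr test accepts Vocabulary even though it has no nested features, and any later call such as feature.keys() would fail because the attribute is a list.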

Alternatively, would it be possible to modify FeaturesDict.__repr__ to display on multiple lines, so that print(builder.info.features) is also displayed properly?

print(builder.info.features)
FeaturesDict({
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
    'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
})

instead of currently:

FeaturesDict({'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10), 'image': Image(shape=(28, 28, 1), dtype=tf.uint8)})

Then you could update the usage in DatasetInfo.__repr__, and _pprint_features_dict in the doc script wouldn't be required anymore.

`tensorflow_datasets.scripts.document_datasets.pprint_features_dict` converted to public function and add types parameter.

us · 2019-06-29T15:15:18Z

@Conchylicultor I found an alternative solution; please check it.

Conchylicultor

Thanks, this looks much better.

tensorflow_datasets/core/features/features_dict.py

tensorflow_datasets/scripts/document_datasets.py

Conchylicultor · 2019-07-09T02:07:29Z

tensorflow_datasets/core/features/features_dict.py

@@ -138,7 +139,7 @@ def __iter__(self):

  def __repr__(self):
    """Display the feature dictionary."""
-    return '{}({})'.format(type(self).__name__, self._feature_dict)
+    return pprint_features_dict(self._feature_dict, types='FeaturesDict')


Instead of calling pprint_features_dict with the kwargs hack, which seems hacky, I feel it would be nicer to have FeaturesDict simply call repr() recursively on its children.

Something like:

def __repr__(self):
  # Literal braces are doubled so str.format does not parse them as fields.
  lines = ['{}({{'.format(type(self).__name__)]
  for key, feature in sorted(self._feature_dict.items()):
    all_sub_lines = '\'{}\': {},'.format(key, feature)
    # Add indentation to all child lines.
    lines.extend('    ' + l for l in all_sub_lines.split('\n'))
  lines.append('})')
  return '\n'.join(lines)

This is more generic, as it would work with any feature whose repr spans multiple lines. It also avoids having to test with isinstance against FeaturesDict or with hasattr.
What do you think?
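The recursive approach described in this comment can be tried out on toy classes. The sketch below is only an illustration under assumptions: Tensor, Sequence and FeaturesDict here are simplified stand-ins rather than the real tensorflow_datasets classes, and the four-space indent is chosen to match the desired output shown earlier in the thread.

```python
class Tensor(object):
    """Toy leaf feature with a single-line repr."""

    def __init__(self, shape, dtype):
        self.shape = shape
        self.dtype = dtype

    def __repr__(self):
        return 'Tensor(shape={}, dtype={})'.format(self.shape, self.dtype)


class Sequence(object):
    """Toy wrapper that delegates to the repr of the wrapped feature."""

    def __init__(self, feature):
        self.feature = feature

    def __repr__(self):
        return 'Sequence({!r})'.format(self.feature)


class FeaturesDict(object):
    """Toy dict feature whose repr puts one entry per line."""

    def __init__(self, feature_dict):
        self._feature_dict = dict(feature_dict)

    def __repr__(self):
        # '{{' is an escaped literal brace for str.format.
        lines = ['{}({{'.format(type(self).__name__)]
        for key, feature in sorted(self._feature_dict.items()):
            entry = "'{}': {!r},".format(key, feature)
            # Indent every line of the child repr, so nesting accumulates.
            lines.extend('    ' + line for line in entry.split('\n'))
        lines.append('})')
        return '\n'.join(lines)


features = FeaturesDict({
    'label': Tensor((), 'int64'),
    'frames': Sequence(FeaturesDict({'image': Tensor((28, 28, 1), 'uint8')})),
})
print(features)
```

Because each level only indents whatever text its children return, a FeaturesDict nested inside a Sequence gets deeper indentation without any type checks in the parent.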

I agree with you, but now we no longer use the pprint_features_dict function.

Move `pprint_features_dict` func to `features_dict.py`

Conchylicultor · 2019-07-11T18:00:39Z

Great, thank you. Looks good. I'm merging it now.
Yes, pprint_features_dict isn't used anymore, so I'll remove it internally before merging.

Also, in features_dict.py, be careful not to import the full API (import tensorflow_datasets as tfds), as it creates a circular dependency. Individual modules should be imported instead. I'm fixing this internally before merging too.
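The circular-dependency warning can be reproduced in isolation. The sketch below writes two throwaway packages to a temporary directory. All package, module and attribute names are hypothetical and have nothing to do with the actual tensorflow_datasets layout. In the first package a submodule imports the top-level package and uses it at import time; in the second it imports only the individual module it needs.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, name, child_source):
    """Writes a tiny package whose __init__ imports `child` before `helper`."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    files = {
        '__init__.py': 'from {0} import child\nfrom {0}.helper import HELPER\n',
        'helper.py': "HELPER = 'ok'\n",
        'child.py': child_source,
    }
    for filename, source in files.items():
        with open(os.path.join(pkg_dir, filename), 'w') as f:
            f.write(source.format(name))


def try_import(name):
    """Returns child.VALUE, or the name of the exception raised on import."""
    try:
        return importlib.import_module(name).child.VALUE
    except (ImportError, AttributeError) as e:
        return type(e).__name__


root = tempfile.mkdtemp()
# child imports the whole package, which is still half-initialized.
make_package(root, 'toy_full_api',
             'import {0} as api\nVALUE = api.HELPER\n')
# child imports only the individual module it depends on.
make_package(root, 'toy_single_module',
             'from {0}.helper import HELPER\nVALUE = HELPER\n')
sys.path.insert(0, root)
importlib.invalidate_caches()

full_api_result = try_import('toy_full_api')
single_module_result = try_import('toy_single_module')
print(full_api_result)
print(single_module_result)
```

The first import fails because the package has not yet defined HELPER when its own submodule asks for it, while the second succeeds because helper can be loaded on its own.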

PiperOrigin-RevId: 257705157

Expand Sequence list

b2b3c6e

Sequence gets expanded, for better readability. Related issue tensorflow#689 Please enter the commit message for your changes. Lines starting

googlebot added the "cla: yes" label Jun 19, 2019

Add __repr__ method to Sequence.

c0ae7b5

pierrot0 assigned Conchylicultor Jun 27, 2019

us added 2 commits June 28, 2019 17:51

Merge branch 'master' of https://github.com/tensorflow/datasets into …

2d24ef6

…document_dataset_bug

Add special format for FeaturesDict.__repr__

a6f2668

`tensorflow_datasets.scripts.document_datasets.pprint_features_dict` converted to public function and add types parameter.

Conchylicultor requested changes Jul 9, 2019

Change FeaturesDict.__repr__ method.

f9f8b37

Move `pprint_features_dict` func to `features_dict.py`

tfds-copybara merged commit f9f8b37 into tensorflow:master Jul 11, 2019

tfds-copybara pushed a commit that referenced this pull request Jul 11, 2019

Merge pull request #692 from us:document_dataset_bug

5d29c6a

PiperOrigin-RevId: 257705157
