Conversation

@stefan-falk

@fsx950223 want to discuss here?

@fsx950223 (Owner)

Which distribute strategy are you using?

@stefan-falk (Author)

For training my model I use MirroredStrategy().
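
For reference, a minimal sketch of such a setup (the model below is a placeholder, not the actual code from this PR):

```python
import tensorflow as tf

# Replicates the model across all visible GPUs; gradients are reduced across replicas.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=1000, output_dim=64),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
```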

@fsx950223 (Owner)

Does the bug happen on a single GPU?

@stefan-falk (Author)

> Does the bug happen on a single GPU?

Your code works for me on CPU and on a single GPU, but not on multiple GPUs.

@fsx950223 (Owner) commented Jul 20, 2021

Could you debug my code with tf.config.run_functions_eagerly(True), and debug the embedding with it too?
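
For anyone following along, this is roughly what that debugging setup looks like (a minimal sketch, not the PR's actual test code):

```python
import tensorflow as tf

# Disable tf.function tracing so decorated functions run eagerly and can be
# stepped through with a debugger or inspected with plain print().
tf.config.run_functions_eagerly(True)

@tf.function
def train_step(x):
    tf.print(x)  # executes immediately in eager mode; breakpoints also work here
    return x * 2

train_step(tf.constant(3))
tf.config.run_functions_eagerly(False)  # restore compiled behavior afterwards
```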

@fsx950223 (Owner)

Add a test case for the embedding.

@stefan-falk (Author) commented Jul 20, 2021

> Could you debug my code with tf.config.run_functions_eagerly(True), and debug the embedding with it too?

Your code seems to work with Embedding.

I added a test for the Embedding, and I am now always adding a dimension to the indices of the IndexedSlices object. This is a bit odd.


```python
if isinstance(grad, tf.IndexedSlices):
    # Not sure why e.g. the Embedding layer requires an additional dimension here
    grad_indices = grad.indices[..., None] if len(grad.indices.shape) < 2 else grad.indices
```
@stefan-falk (Author)

This is just weird 🤷‍♂️
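
For context, a minimal standalone sketch (not part of the PR) of why that branch fires for Embedding: the gradient of an embedding lookup arrives as tf.IndexedSlices whose .indices tensor is rank 1, hence the [..., None] above:

```python
import tensorflow as tf

emb = tf.keras.layers.Embedding(input_dim=10, output_dim=4)
with tf.GradientTape() as tape:
    out = emb(tf.constant([[1, 2, 3]]))  # lookup is a gather under the hood
    loss = tf.reduce_sum(out)

# Gradients of a gather come back as sparse IndexedSlices, not a dense tensor.
grad = tape.gradient(loss, emb.embeddings)
print(type(grad).__name__)  # IndexedSlices
print(grad.indices.shape)   # (3,) -- rank 1
```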

```python
    return dataset

strategy = tf.distribute.MirroredStrategy(test_utils.gpus_for_testing())
strategy = tf.distribute.get_strategy()
```
@stefan-falk (Author)

@fsx950223 I noticed that I used the wrong strategy here.

@fsx950223 (Owner) commented Jul 20, 2021

It's a good idea to use two pointers to accumulate gradients. It would be better to beautify the code.

@stefan-falk (Author)

> It's a good idea to use two pointers to accumulate gradients

What do you mean?

By the way: do we need this PR, or should I close it?

@fsx950223 (Owner) commented Jul 21, 2021

Because of the Google CLA issue, you should open another PR without my commits. Before that, you should wrap the code for general usage.
