Implement Gradient Accumulation #2
base: ga
Conversation
What's your distribution strategy?

For training my model I use …

Does the bug happen on a single GPU?

Your code works for me on CPU and on a single GPU, but not on multiple GPUs.

Could you debug my code with …?

Add a test case for embedding.

Your code seems to work with …; I added a test for the embedding layer.
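As a rough illustration of what such a test has to cover (the model below is illustrative, not the one from this PR): gradients of an `Embedding` layer come back as `tf.IndexedSlices` rather than dense tensors, which is the special case the accumulator must handle.

```python
import tensorflow as tf

# Illustrative model, not the one from this PR: any model with an
# Embedding layer will do, since embedding lookups produce sparse gradients.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10, output_dim=4),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1),
])

x = tf.constant([[1, 2, 3]])
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(model(x))
grads = tape.gradient(loss, model.trainable_variables)

# The Embedding layer's gradient arrives as tf.IndexedSlices, not a dense tensor.
assert isinstance(grads[0], tf.IndexedSlices)
```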
```python
if isinstance(grad, tf.IndexedSlices):
    # Not sure why e.g. the Embedding layer requires an additional dimension here
    grad_indices = grad.indices[..., None] if len(grad.indices.shape) < 2 else grad.indices
```
This is just weird 🤷♂️
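One possible explanation for the extra dimension (an assumption, since the full accumulator code isn't shown in this thread): if the accumulated gradient is applied with `tf.tensor_scatter_nd_add`, that op expects row indices of shape `[N, 1]`, while `IndexedSlices.indices` is a flat `[N]` vector. A minimal sketch:

```python
import tensorflow as tf

# Hypothetical accumulator with the same shape as a 10x4 embedding matrix.
accum = tf.Variable(tf.zeros([10, 4]))

# A sparse gradient as produced by an Embedding layer: `values` holds the
# gradient rows, `indices` (shape [N]) holds the looked-up row ids.
grad = tf.IndexedSlices(
    values=tf.ones([3, 4]),
    indices=tf.constant([1, 5, 5]),
    dense_shape=tf.constant([10, 4]),
)

# tf.tensor_scatter_nd_add indexes rows via indices of shape [N, 1],
# hence the [..., None] expansion in the diff above.
grad_indices = grad.indices[..., None] if len(grad.indices.shape) < 2 else grad.indices
accum.assign(tf.tensor_scatter_nd_add(accum, grad_indices, grad.values))
```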
```python
return dataset

strategy = tf.distribute.MirroredStrategy(test_utils.gpus_for_testing())
strategy = tf.distribute.get_strategy()
```
@fsx950223 I noticed that I used the wrong strategy here.
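For context on why the strategy choice matters (the device list below is an assumption for illustration): `tf.distribute.get_strategy()` returns the default single-replica strategy unless it is called inside an existing strategy scope, so a test built on it never actually exercises the multi-GPU path.

```python
import tensorflow as tf

# Outside any scope, get_strategy() is the default single-replica strategy,
# so nothing is mirrored across devices.
default = tf.distribute.get_strategy()
print(default.num_replicas_in_sync)  # 1

# To really run the multi-GPU code path, the test needs an explicit
# MirroredStrategy over the test devices (assuming two GPUs are available;
# the diff uses test_utils.gpus_for_testing() for this).
strategy = tf.distribute.MirroredStrategy(["/gpu:0", "/gpu:1"])
print(strategy.num_replicas_in_sync)  # 2
```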
It's a good idea to use two pointers to accumulate gradients. It would be better to clean up the code.

What do you mean? By the way, do we need this PR or should I close it?

Because of the Google CLA issue, you should open another PR without my commits.
@fsx950223 want to discuss here?