
Opensource code for Deep Transformer with Latent Depth #2703

Closed
wants to merge 14 commits

Conversation

@xianxl (Contributor) commented Oct 7, 2020

Before submitting

  • Was this discussed/approved via a GitHub issue? (not required for typos or doc improvements)
  • Did you read the contributor guideline?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Open-source code for Deep Transformer with Latent Depth (https://arxiv.org/pdf/2009.13102.pdf).

New features and design choices made:

  • New feature: allow the non-residual block to be weighted by a sample z (generated per batch) instead of the plain x = residual + x.

  • Design choice: move x = residual + x in transformer_layer.py into a function that the latent-depth subclass can override to x = residual + z*x (see the first sketch after this list).

  • New feature: allow TransformerEncoder or TransformerDecoder to carry additional logits parameters from which the samples z are generated.

  • Design choice: added the subclasses LatentTransformerEncoder and LatentTransformerDecoder, which hold the additional logits parameters and instantiate the corresponding LatentTransformerEncoderLayer and LatentTransformerDecoderLayer.

  • New feature: allow the multilingual_translation task to train with latent depth (the setting used for the results in the paper).

  • Design choice:

    • added additional arguments to the multilingual_translation task.
    • added an option for multilingual_transformer to use LatentTransformerEncoder and LatentTransformerDecoder in place of the standard TransformerEncoder and TransformerDecoder.
    • added an option in the multilingual_translation task's train_step to generate the samples z and compute the KL (and sparsity) loss per batch (see the second sketch below).
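
To make the layer-level changes concrete, here is a minimal sketch in plain PyTorch, not the PR's actual code: per-layer logits produce a sample z, and the residual connection is factored into a method that the latent-depth layer overrides. Names such as `LayerSelect`, `BaseLayer`, and `residual_connection` are illustrative assumptions.

```python
# Minimal sketch (not the PR's exact code) of the layer-level changes:
# per-layer logits that yield a sample z, and a residual connection that
# weights the non-residual branch by z. LayerSelect and residual_connection
# are illustrative names, not necessarily those used in fairseq.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LayerSelect(nn.Module):
    """One 2-way logit per layer; sample() draws a soft z in [0, 1] per batch
    using the Gumbel-softmax relaxation (one common choice for this)."""

    def __init__(self, num_layers, temperature=5.0):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_layers, 2))  # [keep, skip]
        self.temperature = temperature
        self._z = None

    def sample(self):
        # Called once per batch so every layer sees the same draw.
        self._z = F.gumbel_softmax(self.logits, tau=self.temperature)[:, 0]

    def __getitem__(self, layer_idx):
        return self._z[layer_idx]


class BaseLayer(nn.Module):
    """Stand-in for a Transformer layer in which the residual add has been
    factored into a method so a subclass can reweight it."""

    def residual_connection(self, x, residual):
        return residual + x


class LatentLayer(BaseLayer):
    """Latent-depth variant: scale the non-residual branch by z."""

    def __init__(self, layer_idx, layer_select):
        super().__init__()
        self.layer_idx = layer_idx
        self.layer_select = layer_select

    def residual_connection(self, x, residual):
        return residual + self.layer_select[self.layer_idx] * x
```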
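
And a hedged sketch of what the per-batch sampling plus KL/sparsity regularization in the task's train_step could look like; the function name, loss weights, prior, and the `layer_select` attribute are assumptions for illustration rather than the PR's exact implementation.

```python
# Hypothetical sketch of a latent-depth train_step: draw z once per batch,
# then add a KL term (against a Bernoulli prior over keeping each layer) and
# a sparsity term to the translation loss. kl_weight, sparsity_weight,
# prior_keep, and the layer_select attribute are assumptions.
import torch


def latent_depth_train_step(model, criterion, sample, optimizer,
                            kl_weight=0.1, sparsity_weight=0.1, prior_keep=0.5):
    layer_select = model.decoder.layer_select  # assumed attribute name
    layer_select.sample()                      # one draw of z per batch

    # Standard fairseq-style criterion call: loss, sample size, logging stats.
    loss, sample_size, logging_output = criterion(model, sample)

    # Per-layer probability of keeping the layer (softmax over 2 logits).
    probs = torch.sigmoid(layer_select.logits[:, 0] - layer_select.logits[:, 1])
    prior = torch.full_like(probs, prior_keep)

    # KL(Bernoulli(probs) || Bernoulli(prior)), summed over layers.
    kl = (probs * (probs / prior).log()
          + (1 - probs) * ((1 - probs) / (1 - prior)).log()).sum()

    # Sparsity term nudging the expected number of kept layers down.
    sparsity = probs.sum()

    total = loss + kl_weight * kl + sparsity_weight * sparsity
    optimizer.backward(total)  # fairseq optimizers expose backward(loss)
    return total, sample_size, logging_output
```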

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@facebook-github-bot (Contributor) commented: @xianxl has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@@ -0,0 +1,52 @@
from fairseq.models import (
Contributor comment: add copyright header

@@ -0,0 +1,148 @@
from fairseq.tasks import register_task
Contributor comment: add copyright header


@facebook-github-bot (Contributor) commented: @xianxl merged this pull request in 573c2f4.

@myleott deleted the latent-depth-oss branch October 17, 2020 15:38
jinyiyang-jhu pushed a commit to jinyiyang-jhu/fairseq-jyang that referenced this pull request Feb 26, 2021
Summary: Open-source code for Deep Transformer with Latent Depth (https://arxiv.org/pdf/2009.13102.pdf).

Pull Request resolved: facebookresearch/fairseq#2703

Reviewed By: myleott

Differential Revision: D24155059

Pulled By: xianxl

fbshipit-source-id: f3e41639429f9664ec5565839709aa857a643668