Description
- implement the layer normalization operator and its Python wrapper.
  - CPU implementation.
  - GPU implementation.
  - Python wrapper.
- enhance the matmul operator to support 4-D tensors as its inputs (issue: Does it need to enhance matmul_op to support 4-D inputs #7319; fixed by PR: Enhance matmul_op to support 4-D inputs #7656).
- prepare the dataset (fixed by PR: Add WMT16 into dataset. #7661).
- wrap the masked positional embedding.
- enhance the lookup_table operator to support the special token: padding index (issue: Support padding_idx in the lookup_table_op. #7309).
- wrap the multi-head dot product attention. This differs from ConvS2S.
- wrap the position-wise feed-forward network.
- wrap the basic computation block.
- build the entire model.
- enhance the documentation of operators used in Transformer.
- add beam search for Transformer.
- clean up the code and merge the entire project into the models repo (merge the work part by part).
- Learning Rate Scheduler
- Residual Dropout
- Label Smoothing
  - label smooth operator.
  - Python wrapper.
- Scaled Dot Product Attention
- Weight sharing between embedding and pre-softmax linear transformation layers
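
The layer normalization item above normalizes activations over the feature dimension, then applies a learnable scale and shift. A minimal NumPy sketch (function and parameter names are illustrative, not the operator's actual API):

```python
import numpy as np

def layer_norm(x, gain, bias, eps=1e-5):
    # Normalize each row over the last dimension to zero mean and
    # unit variance, then apply the learnable scale (gain) and shift (bias).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gain * (x - mean) / np.sqrt(var + eps) + bias

x = np.array([[1.0, 2.0, 3.0, 4.0],
              [0.5, 0.5, 0.5, 1.5]])
y = layer_norm(x, gain=np.ones(4), bias=np.zeros(4))
```

With unit gain and zero bias, each output row has mean ~0 and standard deviation ~1.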
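
The 4-D matmul enhancement follows standard batched-matmul semantics: the leading dimensions are treated as batch dimensions and matrix multiplication is applied to the trailing two. NumPy's `matmul` already behaves this way, which illustrates the intended shape rule:

```python
import numpy as np

# In multi-head attention the inputs are typically
# [batch, num_heads, seq_len, feature], so matmul must batch over
# the first two dimensions and multiply the trailing 2-D matrices.
a = np.random.randn(2, 8, 5, 3)
b = np.random.randn(2, 8, 3, 7)
c = np.matmul(a, b)  # shape (2, 8, 5, 7)
```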
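
For the positional embedding item, the Transformer's fixed sinusoidal encoding can be sketched as follows (a NumPy illustration assuming an even model width; the function name is hypothetical):

```python
import numpy as np

def position_encoding(max_len, d_model):
    # pe[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    # pe[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    pos = np.arange(max_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angle = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

pe = position_encoding(50, 8)
```

The "masked" part of the item refers to zeroing the entries that correspond to padding positions, which is what the padding-index support in lookup_table enables.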
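
The padding-index behavior requested for lookup_table means that rows gathered for the padding token come back as all zeros, so padding contributes nothing downstream. A NumPy sketch of the intended semantics (not the operator's actual implementation):

```python
import numpy as np

def lookup_table(table, ids, padding_idx=0):
    # Gather embedding rows by id, then zero out every position whose
    # id equals padding_idx so padding tokens have no effect.
    out = table[ids]
    out[ids == padding_idx] = 0.0
    return out

table = np.random.randn(6, 4)            # vocab_size x embedding_dim
ids = np.array([[0, 2, 5],
                [3, 0, 1]])              # 0 is the padding token here
emb = lookup_table(table, ids, padding_idx=0)
```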
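
The scaled dot product attention item is the core of the multi-head attention wrapper: Attention(Q, K, V) = softmax(QKᵀ / √d_k) V, where the √d_k scaling keeps the logits in a range where softmax gradients stay usable. A NumPy sketch (unmasked, single head per batch entry):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    # scores: similarity of each query to each key, scaled by 1/sqrt(d_k).
    d_k = q.shape[-1]
    scores = np.matmul(q, k.swapaxes(-1, -2)) / np.sqrt(d_k)
    return np.matmul(softmax(scores), v)

q = np.random.randn(2, 4, 8)   # [batch, query_len, d_k]
k = np.random.randn(2, 6, 8)   # [batch, key_len, d_k]
v = np.random.randn(2, 6, 8)
out = scaled_dot_product_attention(q, k, v)
```

The multi-head wrapper runs this in parallel over projected sub-spaces, which is where the 4-D matmul support above comes in.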
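
The position-wise feed-forward network applies the same two-layer MLP with a ReLU in between, FFN(x) = max(0, xW₁ + b₁)W₂ + b₂, independently at every position. A NumPy sketch with illustrative parameter names:

```python
import numpy as np

def position_wise_ffn(x, w1, b1, w2, b2):
    # Same weights applied at every sequence position.
    hidden = np.maximum(0.0, x @ w1 + b1)   # ReLU(x W1 + b1)
    return hidden @ w2 + b2

d_model, d_ff = 8, 32
x = np.random.randn(2, 5, d_model)          # [batch, seq_len, d_model]
w1 = np.random.randn(d_model, d_ff); b1 = np.zeros(d_ff)
w2 = np.random.randn(d_ff, d_model); b2 = np.zeros(d_model)
y = position_wise_ffn(x, w1, b1, w2, b2)
```

Because the weights are shared across positions, applying the FFN to one position alone gives the same result as slicing the full output.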
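
Label smoothing replaces the hard one-hot target with a mixture of the one-hot distribution and a uniform distribution over the vocabulary, which discourages over-confident predictions. A NumPy sketch of the transformation the label smooth operator performs:

```python
import numpy as np

def smooth_labels(one_hot, epsilon=0.1):
    # Keep (1 - epsilon) of the mass on the true class and spread
    # epsilon uniformly over all k classes.
    k = one_hot.shape[-1]
    return one_hot * (1.0 - epsilon) + epsilon / k

one_hot = np.eye(5)[np.array([1, 3])]   # two targets over 5 classes
targets = smooth_labels(one_hot, epsilon=0.1)
```

Each smoothed row still sums to 1; the true class gets 1 - ε + ε/k and every other class gets ε/k.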
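
The learning rate scheduler item refers to the Transformer's warmup-then-decay schedule: lrate = d_model⁻⁰·⁵ · min(step⁻⁰·⁵, step · warmup_steps⁻¹·⁵), which increases linearly for the first warmup_steps steps and then decays as the inverse square root of the step number. A plain-Python sketch:

```python
def transformer_lr(step, d_model=512, warmup_steps=4000):
    # Linear warmup for the first warmup_steps steps,
    # then inverse-square-root decay.
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
```

The peak is reached exactly at step == warmup_steps, where the two branches of the min coincide.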
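
The weight sharing item ties the input embedding matrix to the pre-softmax output projection: the same [vocab, d_model] table embeds token ids on the way in and, transposed, produces vocabulary logits on the way out, cutting the parameter count. A NumPy sketch of the idea (names are illustrative):

```python
import numpy as np

vocab, d_model = 1000, 32
embedding = np.random.randn(vocab, d_model)   # single shared table

def embed(ids):
    # Input side: gather embedding rows by token id.
    return embedding[ids]

def output_logits(h):
    # Output side: reuse the same table, transposed, as the
    # pre-softmax linear transformation.
    return h @ embedding.T

h = np.random.randn(2, d_model)               # decoder hidden states
logits = output_logits(h)
```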