Closed
Description
- [ ] prepare the dataset (https://arxiv.org/abs/1605.00459). This task is shared with Transformer.
- [ ] enhance the lookup_table operator to support a special token, the padding index (#7309). This task is shared with Transformer.
- implement a general-purpose normalization operator (#7350, "A general purpose normalization operator is needed"). @lcy-seso
- wrap the weight normalization.
  - L2 normalize layer.
  - wrap weight normalization.
  - serialize weight normalization for inference.
- wrap the positional embedding.
- wrap the dot-product attention (#7602, "Add Python wrapper for dot-product-attention").
- wrap the GLU (#7525, "Add python wrapper for GLU").
- wrap the deep convolutional encoder and decoder blocks with attention.
- build the entire model.
- enhance the documentation of operators used in ConvS2S.
- add beam search for ConvS2S.
- merge the entire model into the models repo (the work can be merged part by part).
- clip by norm
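
For reference, several of the building blocks listed above (GLU, L2 normalization, weight normalization, clip by norm) can be sketched in a few lines. This is only an illustrative, pure-Python sketch on flat lists of floats, not the operator code from this repository. All function names here (`glu`, `l2_normalize`, `weight_norm`, `clip_by_norm`) are hypothetical, and the sketch assumes GLU splits its input in half along a single dimension and that weight normalization uses the usual `w = g * v / ||v||` reparameterization.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def glu(values):
    """Gated linear unit: first half of the input, gated by sigmoid(second half)."""
    if len(values) % 2 != 0:
        raise ValueError("GLU input length must be even")
    half = len(values) // 2
    return [a * sigmoid(b) for a, b in zip(values[:half], values[half:])]


def l2_norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def l2_normalize(v, eps=1e-12):
    """Scale v to unit L2 norm; eps guards against division by zero."""
    norm = max(l2_norm(v), eps)
    return [x / norm for x in v]


def weight_norm(v, g):
    """Weight normalization (assumed form): w = g * v / ||v||."""
    return [g * x for x in l2_normalize(v)]


def clip_by_norm(v, max_norm):
    """Rescale v so its L2 norm does not exceed max_norm; otherwise return a copy."""
    norm = l2_norm(v)
    if norm <= max_norm:
        return list(v)
    scale = max_norm / norm
    return [x * scale for x in v]
```

In this sketch `weight_norm` is built on `l2_normalize`, which mirrors the ordering of the sub-tasks: the L2 normalization piece comes first, and the weight-normalization wrapper composes it with a learned scale `g`.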