We follow the official code base of [fairseq] and implement Flowformer on top of that repo.
Since fairseq is quite a large code base, we only provide the changed module and our experimental configuration. You can incorporate flow_attention.py into fairseq for reproduction.
Figure 1. Results on Wikitext-103.
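For orientation, the sketch below shows the core Flow-Attention computation in a simplified, non-causal, single-head form. It is illustrative only: the provided flow_attention.py follows fairseq's MultiheadAttention interface and implements the causal variant required for language modeling, so the function name, shapes, and details here are assumptions rather than the exact module we ship.

```python
import torch

def flow_attention_sketch(q, k, v, eps=1e-6):
    """Simplified, non-causal Flow-Attention sketch (illustrative only).

    q, k, v: (batch, length, dim) tensors for a single head.
    """
    # Non-negative feature map so that attention weights can be read as flows.
    q, k = torch.sigmoid(q), torch.sigmoid(k)
    # Incoming flow of each sink (query position) and outgoing flow of each source (key position).
    incoming = 1.0 / (q @ k.sum(dim=1).unsqueeze(-1) + eps)            # (B, Lq, 1)
    outgoing = 1.0 / (k @ q.sum(dim=1).unsqueeze(-1) + eps)            # (B, Lk, 1)
    # Flow conservation: recompute each side's flow after normalizing the other side.
    conserved_sink = q @ (k * outgoing).sum(dim=1).unsqueeze(-1)       # (B, Lq, 1)
    conserved_source = k @ (q * incoming).sum(dim=1).unsqueeze(-1)     # (B, Lk, 1)
    # Competition among sources (softmax) and allocation to sinks (sigmoid).
    competition = torch.softmax(conserved_source, dim=1) * k.shape[1]  # (B, Lk, 1)
    allocation = torch.sigmoid(conserved_sink)                         # (B, Lq, 1)
    # Linear-complexity aggregation through the (dim x dim) key-value summary.
    kv = k.transpose(1, 2) @ (v * competition)                         # (B, D, D)
    return (q @ kv) * incoming * allocation                            # (B, Lq, D)
```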
- Prepare the environment and download the dataset following the tutorial of [Language Modeling].
- Replace ./fairseq/modules/multihead_attention.py with our provided flow_attention.py.
- Train and evaluate the model with the following scripts. You can get the pretrained model from [here].
```bash
fairseq-train --task language_modeling \
    data-bin/wikitext-103 \
    --save-dir checkpoints/flowformer \
    --arch transformer_lm --share-decoder-input-output-embed \
    --dropout 0.1 \
    --optimizer adam --adam-betas '(0.9, 0.98)' --weight-decay 0.01 --clip-norm 0.0 \
    --lr 0.001 --lr-scheduler inverse_sqrt --warmup-updates 6000 --warmup-init-lr 1e-07 \
    --tokens-per-sample 512 --sample-break-mode none \
    --max-tokens 2048 --update-freq 16 \
    --max-update 150000
```
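Note that `--update-freq 16` accumulates gradients over 16 forward/backward passes, so each update sees roughly 2048 × 16 = 32,768 tokens per GPU; if you train on multiple GPUs, you may want to scale `--update-freq` down to keep a comparable effective batch size.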
```bash
fairseq-eval-lm data-bin/wikitext-103 \
    --path checkpoints/flowformer/checkpoint_best.pt \
    --batch-size 2 \
    --tokens-per-sample 512 \
    --context-window 400
```
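Here `--context-window 400` gives each evaluated block additional preceding context that is not itself scored, following the usual fairseq language-model evaluation setup.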
Our code base is built upon the official code of fairseq: