Recurrent Additive Networks

Note: This code is not up to date; please refer to the implementation by the original authors: https://github.com/kentonl/ran


This is a PyTorch implementation of Recurrent Additive Networks (RAN) by Kenton Lee, Omer Levy, and Luke Zettlemoyer:

http://www.kentonl.com/pub/llz.2017.pdf

The RAN model is implemented in ran.py.
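For reference, the model's core recurrence is purely additive: each new cell state is a gated weighted sum of the current input's content and the previous cell state, with no nonlinearity in between. Below is a minimal sketch of the tanh variant in PyTorch, written directly from the equations in the paper rather than taken from ran.py:

```python
import torch
import torch.nn as nn

class RANCell(nn.Module):
    """One step of a Recurrent Additive Network (tanh variant), following
    the equations in Lee et al. (2017). Illustrative sketch only; see
    ran.py for the implementation actually used in this repository."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        # Content layer: c~_t = W_cx x_t (linear, no nonlinearity)
        self.content = nn.Linear(input_size, hidden_size, bias=False)
        # Input and forget gates, computed from x_t and h_{t-1}
        self.gates_x = nn.Linear(input_size, 2 * hidden_size)
        self.gates_h = nn.Linear(hidden_size, 2 * hidden_size, bias=False)

    def forward(self, x, state):
        h_prev, c_prev = state
        c_tilde = self.content(x)
        i, f = torch.sigmoid(self.gates_x(x) + self.gates_h(h_prev)).chunk(2, dim=-1)
        # The update is purely additive: no nonlinearity between c_{t-1} and c_t
        c = i * c_tilde + f * c_prev
        h = torch.tanh(c)  # the identity variant uses h = c instead
        return h, (h, c)

# Example: one step for a batch of 32 sequences
cell = RANCell(input_size=256, hidden_size=1024)
x = torch.randn(32, 256)
state = (torch.zeros(32, 1024), torch.zeros(32, 1024))
h, state = cell(x, state)
```

Because the only nonlinearities sit in the gates (and, optionally, the output), the cell state unrolls into an explicit weighted sum of all past content vectors, which is the property the paper exploits for analysis.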

Code for running Penn Treebank (PTB) experiments is taken from:

https://github.com/pytorch/examples/tree/master/word_language_model

To run PTB experiments, clone this repository:

git clone https://github.com/bheinzerling/ran

and then do:

cd ran
python main.py --cuda --emsize 256 --nhid 1024 --dropout 0.5 --epochs 100 --nlayers 1 --batch-size 512 --model RAN
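Here --emsize is the word embedding size, --nhid the number of hidden units per layer, and --nlayers the number of stacked RAN layers; the remaining flags follow the upstream word_language_model argument parser.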

This should give a test set perplexity roughly matching the RAN (tanh) result reported in the paper:

End of training | test loss  4.78 | test ppl   119.40
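(Perplexity is simply the exponential of the average per-word cross-entropy loss: exp(4.78) ≈ 119.)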

Better results can be achieved with smaller batch sizes; in these runs the best perplexity came at batch size 10:

| batch size | test loss | test ppl |
|-----------:|----------:|---------:|
|         40 |      4.45 |    85.24 |
|         20 |      4.42 |    83.42 |
|         10 |      4.41 |    82.62 |
|          5 |      4.49 |    89.21 |
