PyTorch Implementation of Character-Aware Neural Language Models by Kim et al., AAAI 2016
- Install CUDA 8.0
- Install cuDNN v5.1
- Install PyTorch 0.4.0
- Python version >= 3.5 is required
- Download the English Penn Treebank dataset.
- Place the train.txt, valid.txt, and test.txt files under the (home)/datasets/ptb/ directory.
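Before preprocessing, it can help to confirm that the three files are where the scripts expect them. The following is a minimal sketch, not part of this repository. It assumes that "(home)" means the current user's home directory, and `missing_ptb_files` is a hypothetical helper name.

```python
from pathlib import Path

# Assumption: "(home)" in the path above is the current user's home directory.
PTB_DIR = Path.home() / "datasets" / "ptb"
REQUIRED_FILES = ("train.txt", "valid.txt", "test.txt")


def missing_ptb_files(ptb_dir=PTB_DIR):
    """Return the names of required PTB files that are absent from ptb_dir."""
    ptb_dir = Path(ptb_dir)
    return [name for name in REQUIRED_FILES if not (ptb_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_ptb_files()
    if missing:
        print("Missing from {}: {}".format(PTB_DIR, ", ".join(missing)))
    else:
        print("All PTB files found in {}".format(PTB_DIR))
```

The sketch avoids f-strings so that it runs on Python 3.5, the minimum version listed above.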
# Preprocess the dataset. This creates ./data/preprocess(tmp).pkl
$ python dataset.py
# Train and test with default settings (LSTM-Char-Small)
$ python main.py
# Train with a different number of hidden units and epochs
$ python main.py --hidden_dim 200 --epoch 20
- Refer to the paper for more detailed explanations of the model.
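As a quick orientation, the model reads each word as a sequence of characters rather than as a single word id. The sketch below shows one common way to turn words into fixed-length character id lists. It is an illustration only and may differ from what dataset.py does: the marker symbols `{` and `}`, the padding id 0, and the function names are all assumptions.

```python
def build_char_vocab(words):
    """Map every character seen in words to an integer id; 0 is reserved for padding."""
    # Assumed marker symbols: "{" for start-of-word, "}" for end-of-word.
    char2idx = {"<pad>": 0, "{": 1, "}": 2}
    for word in words:
        for ch in word:
            if ch not in char2idx:
                char2idx[ch] = len(char2idx)
    return char2idx


def encode_word(word, char2idx, max_len):
    """Encode one word as a fixed-length list of character ids."""
    ids = [char2idx["{"]] + [char2idx[ch] for ch in word] + [char2idx["}"]]
    # Right-pad with zeros so every word has length max_len + 2 (the two markers).
    return ids + [0] * (max_len + 2 - len(ids))


words = "the cat sat on a mat".split()
char2idx = build_char_vocab(words)
max_len = max(len(w) for w in words)
encoded = [encode_word(w, char2idx, max_len) for w in words]
for word, ids in zip(words, encoded):
    print(word, ids)
```

Every encoded word has the same length, so a batch of words can be stacked into a single integer tensor and fed to a character embedding layer.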
Reported PPL (paper) | Our Implementation PPL (valid) | Our Implementation PPL (test) |
---|---|---|
92.3 | 71.1 | 108.9 |
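PPL in the table is perplexity, which is conventionally the exponential of the average per-token negative log-likelihood. The sketch below illustrates that standard definition with natural logarithms. How main.py computes the reported numbers is not shown here, and `perplexity` is a hypothetical helper name.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood per token), using natural logs."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Toy example: a model that assigns probability 1/4 to each of four tokens
# is as uncertain as a uniform choice among four options.
nlls = [-math.log(0.25)] * 4
print(perplexity(nlls))
```

Lower perplexity means the model assigns higher probability to the held-out text.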
License: MIT