InOrderParser

This implementation is built on the cnn library, which is required to build and run the software. The reference paper is "In-Order Transition-based Constituent Parsing System".

Building

mkdir build
cd build
cmake .. -DEIGEN3_INCLUDE_DIR=/path/to/eigen
make

Data

We reuse the script get_oracle.py to generate the top-down oracles:

./get_oracle.py [training data in bracketed format] [training data in bracketed format] > [training top-down oracle]
./get_oracle.py [training data in bracketed format] [development data in bracketed format] > [development top-down oracle]   
./get_oracle.py [training data in bracketed format] [test data in bracketed format] > [test top-down oracle]

Then compile pre2mid.cc and run the resulting pre2mid binary to convert the top-down oracles into in-order oracles:

g++ pre2mid.cc -o pre2mid
./pre2mid [training top-down oracle] > [training oracle]
./pre2mid [development top-down oracle] > [development oracle]
./pre2mid [test top-down oracle] > [test oracle]

If you require the related data, contact us.
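
To illustrate what the top-down to in-order conversion does conceptually, here is a minimal Python sketch. It is not the code in pre2mid.cc and does not read the oracle file format. It assumes a bare action list using the hypothetical action names `NT(X)`, `SHIFT`, `REDUCE` for the top-down system and `PJ(X)` for the in-order projection action. The idea is that each constituent's first child is emitted before the constituent's own label, and the remaining children follow it.

```python
def topdown_to_inorder(actions):
    """Convert a top-down action list into an in-order action list.

    Illustrative sketch only: the action names NT(X), SHIFT, REDUCE and
    PJ(X) are assumptions, not the repository's oracle file format.
    """
    root_children = []
    # Each stack entry is (label, children); the bottom entry is a virtual root.
    stack = [(None, root_children)]
    for act in actions:
        if act.startswith("NT(") and act.endswith(")"):
            node = (act[3:-1], [])
            stack[-1][1].append(node)
            stack.append(node)
        elif act == "SHIFT":
            stack[-1][1].append("SHIFT")
        elif act == "REDUCE":
            if len(stack) == 1:
                raise ValueError("REDUCE without an open constituent")
            stack.pop()
        else:
            raise ValueError("unknown action: " + act)
    if len(stack) != 1:
        raise ValueError("unclosed constituent")

    out = []

    def emit(node):
        if node == "SHIFT":
            out.append("SHIFT")
            return
        label, children = node
        if not children:
            raise ValueError("empty constituent: " + label)
        emit(children[0])               # first child comes before the label
        out.append("PJ(" + label + ")") # then project the constituent
        for child in children[1:]:      # then the remaining children
            emit(child)
        out.append("REDUCE")

    for node in root_children:
        emit(node)
    return out
```

For a tree shaped like (S (NP w1 w2) (VP w3 (NP w4))), the top-down sequence starts with `NT(S) NT(NP) SHIFT`, while the in-order sequence starts with `SHIFT PJ(NP) SHIFT`. Both sequences have the same length.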

Training

Ensure the related files are linked into the current directory.

mkdir model/
./build/impl/Kparser --cnn-mem 1700 -x -T [training oracle] -d [development oracle] -C [development data in bracketed format] -P -t --pretrained_dim 100 -w [pretrained word embeddings] --lstm_input_dim 128 --hidden_dim 128 -D 0.2

Test

./build/impl/Kparser --cnn-mem 1700 -x -T [training oracle] -p [test oracle] -C [test data in bracketed format] -P --pretrained_dim 100 -w [pretrained word embeddings] --lstm_input_dim 128 --hidden_dim 128 -m [model file]

A trained model file is provided in the model directory.

Sampling

./build/impl/Kparser --cnn-mem 1700 -x -T [training oracle] -p [test oracle] -C [test data in bracketed format] -P --pretrained_dim 100 -w [pretrained word embeddings] --lstm_input_dim 128 --hidden_dim 128 -m [model file] --alpha 0.8 -s 100 > samples.act
./mid2tree.py samples.act > samples.trees

The samples.props file can then be fed into downstream reranking components.
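
The step from an in-order action sequence back to a bracketed tree can be sketched as follows. This is a minimal illustration, not the code in mid2tree.py, and it does not parse the samples.act format. It assumes a bare action list with the hypothetical action names `SHIFT`, `PJ(X)` and `REDUCE`, plus a separate list of words, and it omits POS tags for brevity.

```python
def inorder_to_tree(actions, words):
    """Rebuild a bracketed tree string from in-order actions.

    Illustrative sketch only: action names and the plain word list are
    assumptions, and POS tags are left out of the output.
    """
    remaining = iter(words)
    stack = []  # holds subtree strings and ("PJ", label) markers
    for act in actions:
        if act == "SHIFT":
            word = next(remaining, None)
            if word is None:
                raise ValueError("SHIFT with no words left")
            stack.append(word)
        elif act.startswith("PJ(") and act.endswith(")"):
            if not stack or isinstance(stack[-1], tuple):
                raise ValueError("PJ needs a completed first child")
            stack.append(("PJ", act[3:-1]))
        elif act == "REDUCE":
            rest = []
            # Pop the children that were built after the projection marker.
            while stack and not isinstance(stack[-1], tuple):
                rest.append(stack.pop())
            if len(stack) < 2:
                raise ValueError("REDUCE without a projected constituent")
            _, label = stack.pop()   # the projection marker
            first = stack.pop()      # the child built before the marker
            if isinstance(first, tuple):
                raise ValueError("missing first child")
            children = [first] + rest[::-1]
            stack.append("(" + label + " " + " ".join(children) + ")")
        else:
            raise ValueError("unknown action: " + act)
    if len(stack) != 1 or isinstance(stack[0], tuple):
        raise ValueError("action sequence does not form a single tree")
    return stack[0]
```

A marker-based stack is used here because a `REDUCE` in the in-order system has to collect one child from below the projected label and any number of children from above it.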

Contact

Jiangming Liu, jmliunlp@gmail.com