A Torch/Scala reimplementation of the neural network dependency parser described in Chen and Manning '14.
- Set up environment variables. For example, from the root directory of this project on blake:
```bash
export NNDEPPARSE_ROOT=`pwd`
export DATA_DIR=/iesl/canvas/strubell/data/
```
- Put a word embeddings file in `$NNDEPPARSE_ROOT/data/embeddings`. The file is expected to contain one embedding per line, where the first field is the token and the remaining fields are the values of the embedding, each field separated by a single space. You can get the Collobert et al. embeddings here.
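  For illustration, a hypothetical 4-dimensional embeddings file (tokens and values invented) would look like:
```
the 0.418 0.24968 -0.41242 0.1217
cat -0.2341 0.9087 0.1034 -0.5512
```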
- Compile:
```bash
sbt compile
```
- Perform all data preprocessing for a given configuration [also compiles]. For example:
```bash
./bin/all-data-processing.sh config/chen-ptb.conf
```
- Train the parser:
```bash
./bin/train-parser.sh config/chen-ptb.conf
```
- Evaluate the parser (accuracy and speed):
```bash
./bin/parse-fast.sh config/chen-ptb.conf
```
- Tune hyperparameters (assumes a GPU machine and uses all of its GPUs):
```bash
./bin/tune-hyperparams.sh config/chen-ptb.conf
```
- Generate parse decisions + features for training from the PTB:
```bash
./bin/get-parse-decisions-ptb.sh
```
- Generate intmaps from the parse decisions + features:
```bash
./bin/convert-ptb-feats-to-ints.sh
```
- Generate Torch tensors from the intmaps:
```bash
./bin/convert-ptb-feats-to-torch.sh
```
- If the word intmaps changed, regenerate the Torch embedding tensors:
```bash
./bin/convert-collobert-embeddings-to-torch.sh
```
- Generate dev/test intmaps for each sentence in the PTB:
```bash
./bin/convert-ptb-sents-to-ints.sh
```
- Generate Torch tensors from the sentence intmaps:
```bash
./bin/convert-ptb-sents-to-torch.sh
```
The Lua side uses a custom `sized_table` global to create tables with preallocated array and hash parts. To provide it, in `torch-distro/exe/luajit-rocks/luajit-2.1/src/luajit.c`, add the function:
```c
/* sized_table(asize, hsize): create and return a table with asize
 * preallocated array slots and hsize preallocated hash slots. */
static int new_sized_table( lua_State *L )
{
  int asize = lua_tointeger( L, 1 );
  int hsize = lua_tointeger( L, 2 );
  lua_createtable( L, asize, hsize );
  return 1;  /* the new table is on top of the stack */
}
```
Then in `main`, after the Lua state `L` is initialized, register the function as a global:
```c
lua_pushcfunction( L, new_sized_table );
lua_setglobal( L, "sized_table" );
```
Finally, reinstall Torch so that the patched LuaJIT is rebuilt.
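After reinstalling, `sized_table` is callable from any Lua/Torch script. A minimal sketch (the sizes and values here are illustrative):
```lua
-- sized_table wraps lua_createtable, so the table below starts with
-- 100000 preallocated array slots and 0 hash slots; filling it never
-- triggers an incremental resize.
local t = sized_table(100000, 0)
for i = 1, 100000 do
  t[i] = i
end
print(#t)  -- 100000
```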