denialbb/dependency-parser-it

Table of Contents

  1. Reference
  2. Running
  3. Test
  4. Getting the data

Reference

Running

usage: parser.py [-h] [--test heldout golden] [--train training n_iterations]
                 [-q sentence]

A compact dependency parser. Uses .conll for data.

options:
  -h, --help            show this help message and exit
  --test heldout golden
                        Test the parser, args are heldout pos and test data
                        .conll
  --train training n_iterations
                        Train the parser, args are training data .conll and
                        number of iterations
  -q sentence, --query sentence
                        Parse dependency for query

In general, use it like this:

python parser.py --train wsj_train.dep 15
python parser.py --test wsj_train.pos wsj_test.dep
python parser.py --query 'test sentence'
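The flags above suggest an argparse interface roughly like the sketch below. This is only an illustration of how the documented options fit together; the entry-point names (train_model, evaluate, parse_sentence) are hypothetical and may not match what parser.py actually calls.

import argparse

# Sketch of the CLI described above; parser.py itself may wire this differently.
cli = argparse.ArgumentParser(
    description="A compact dependency parser. Uses .conll for data.")
cli.add_argument("--test", nargs=2, metavar=("heldout", "golden"),
                 help="Test the parser, args are heldout pos and test data .conll")
cli.add_argument("--train", nargs=2, metavar=("training", "n_iterations"),
                 help="Train the parser, args are training data .conll and number of iterations")
cli.add_argument("-q", "--query", metavar="sentence",
                 help="Parse dependency for query")
args = cli.parse_args()

if args.train:
    training_file, n_iterations = args.train[0], int(args.train[1])
    # train_model(training_file, n_iterations)   # hypothetical entry point
elif args.test:
    heldout_pos, golden_conll = args.test
    # evaluate(heldout_pos, golden_conll)        # hypothetical entry point
elif args.query:
    pass  # parse_sentence(args.query)           # hypothetical entry point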

Test

To test you can run:

python parser.py -h
python parser.py --train data/it_isdt-ud-train.conll 15
python parser.py --test data/heldout.pos data/it_isdt-ud-test.conll
python parser.py -q 'essere guerra con il nemico " .'

Getting the data

Obtain wsj_train.dep [1]:

# $1: directory of WSJ .mrg treebank files; $scriptdir: directory containing the Stanford parser jars
for f in "$1"/*.mrg; do
    echo "$f"
    grep -v CODE "$f" > "$f.2"
    out="$f.dep"
    java -mx800m -cp "$scriptdir/*:" edu.stanford.nlp.trees.EnglishGrammaticalStructure \
        -treeFile "$f.2" -basic -makeCopulaHead -conllx > "$out"
done

Convert to CoNLL-X format [2]:

perl conllu_to_conllx.pl < file.conllu > file.conll
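Conceptually, CoNLL-U and CoNLL-X share ten columns; a conversion has to drop CoNLL-U-specific rows (comment lines, multiword-token and empty-node IDs) and map UPOS/XPOS into the CPOSTAG/POSTAG slots. The Python sketch below only illustrates that idea; the perl tool from the UD tools repository is what the command above actually uses and covers more edge cases.

import sys

# Rough CoNLL-U -> CoNLL-X illustration (not a replacement for conllu_to_conllx.pl).
for line in sys.stdin:
    line = line.rstrip("\n")
    if not line:
        print()                              # keep blank lines separating sentences
        continue
    if line.startswith("#"):
        continue                             # drop CoNLL-U comment lines
    cols = line.split("\t")
    if "-" in cols[0] or "." in cols[0]:
        continue                             # skip multiword-token and empty-node rows
    # CoNLL-U: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
    # CoNLL-X: ID FORM LEMMA CPOSTAG POSTAG FEATS HEAD DEPREL PHEAD PDEPREL
    upos, xpos = cols[3], cols[4]
    postag = upos if xpos == "_" else xpos
    print("\t".join(cols[:3] + [upos, postag] + cols[5:8] + ["_", "_"]))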

Extract word/tag tuples for the test sentences:

../tools/extract-pos it_isdt-ud-test.conllu
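For orientation, the hypothetical snippet below does the same kind of extraction in Python, assuming the .pos file holds one word<TAB>tag line per token with blank lines between sentences; the actual extract-pos tool may use a different layout.

import sys

# Hypothetical stand-in for extract-pos: print FORM/UPOS pairs from a CoNLL-U file.
with open(sys.argv[1], encoding="utf-8") as f:
    for line in f:
        line = line.rstrip("\n")
        if not line:
            print()                          # sentence boundary
            continue
        if line.startswith("#"):
            continue                         # skip comment lines
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue                         # skip multiword-token and empty-node rows
        print(f"{cols[1]}\t{cols[3]}")       # FORM and UPOS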

Footnotes

[1] https://explosion.ai/blog/parsing-english-in-python

[2] https://github.com/UniversalDependencies/tools

About

A projective dependency parser for Italian.
