- ISDT: https://github.com/UniversalDependencies/UD_Italian-ISDT
- used commit 5bb0bf3
- used commit
- https://gist.github.com/syllog1sm/10343947
usage: parser.py [-h] [--test heldout golden] [--train training n_iterations]
[-q sentence]
A compact dependency parser. Uses .conll for data.
options:
-h, --help show this help message and exit
--test heldout golden
Test the parser, args are heldout pos and test data
.conll
--train training n_iterations
Train the parser, args are training data .conll and
number of iterations
-q sentence, --query sentence
Parse dependency for query
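For reference, a command-line interface matching the help text above could be declared with `argparse` roughly as follows. This is only a sketch: the function name `build_arg_parser` is hypothetical, and the actual parser.py may build its options differently.

```python
import argparse


def build_arg_parser():
    """Build an argument parser that mirrors the usage message shown above.

    Hypothetical helper for illustration; not taken from parser.py.
    """
    parser = argparse.ArgumentParser(
        prog="parser.py",
        description="A compact dependency parser. Uses .conll for data.",
    )
    # --test takes exactly two values: the heldout POS file and the gold .conll file.
    parser.add_argument(
        "--test",
        nargs=2,
        metavar=("heldout", "golden"),
        help="Test the parser, args are heldout pos and test data .conll",
    )
    # --train takes the training .conll file and the number of iterations.
    parser.add_argument(
        "--train",
        nargs=2,
        metavar=("training", "n_iterations"),
        help="Train the parser, args are training data .conll and number of iterations",
    )
    # -q / --query takes a single, already tokenized sentence.
    parser.add_argument(
        "-q",
        "--query",
        metavar="sentence",
        help="Parse dependency for query",
    )
    return parser
```

Note that values arrive as strings, so `n_iterations` would still need an `int(...)` conversion before use.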
In general use with:
python parser.py --train wsj_train.dep n_iterations
python parser.py --test wsj_train.pos wsj_test.dep
python parser.py --query 'test sentence'
To test you can run:
python parser.py -h
python parser.py --train data/it_isdt-ud-train.conll 15
python parser.py --test data/heldout.pos data/it_isdt-ud-test.conll
python parser.py -q 'essere guerra con il nemico " .'
Obtain wsj_train.dep [1]:
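# $1 is the directory with the .mrg treebank files;
# $scriptdir must point to the directory containing the Stanford parser jars.
for f in "$1"/*.mrg; do
  echo "$f"
  # Drop lines containing CODE before conversion.
  grep -v CODE "$f" > "$f.2"
  out="$f.dep"
  java -mx800m -cp "$scriptdir/*:" edu.stanford.nlp.trees.EnglishGrammaticalStructure \
    -treeFile "$f.2" -basic -makeCopulaHead -conllx > "$out"
done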
Convert to CoNLL-X format [2]:
perl conllu_to_conllx.pl < file.conllu > file.conll
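To make the conversion step less opaque, here is a minimal Python sketch of what a CoNLL-U to CoNLL-X conversion typically involves. It is not a port of conllu_to_conllx.pl; the function name `conllu_to_conllx` and the exact set of transformations (dropping comments, multiword-token ranges and empty nodes, blanking the last two columns) are assumptions based on the differences between the two formats.

```python
def conllu_to_conllx(lines):
    """Convert CoNLL-U lines to CoNLL-X style lines (illustrative sketch).

    Assumed transformations, not verified against the Perl script:
      * comment lines starting with '#' are dropped
      * multiword-token ranges (ID like '1-2') and empty nodes ('1.1') are dropped
      * spaces inside FORM and LEMMA become underscores
      * the last two columns (DEPS, MISC) are blanked to '_'
    """
    converted = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            continue
        if not line.strip():
            # A blank line separates sentences in both formats.
            converted.append("")
            continue
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue
        cols[1] = cols[1].replace(" ", "_")
        cols[2] = cols[2].replace(" ", "_")
        cols[8] = "_"
        cols[9] = "_"
        converted.append("\t".join(cols))
    return converted
```

A call such as `conllu_to_conllx(open("file.conllu", encoding="utf-8"))` would return the converted lines, which can then be joined with newlines and written to file.conll.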
Extract word/tag tuples for the test sentences:
../tools/extract-pos it_isdt-ud-test.conllu
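The extraction step can be pictured with the following Python sketch. It is not the extract-pos tool itself: the function names are hypothetical, and both the choice of the UPOS column as the tag and the output layout (one sentence per line, space-separated `word/tag` tokens) are assumptions about what the parser expects in its heldout file.

```python
def read_conllu_sentences(lines):
    """Yield each sentence as a list of (form, upos) tuples (illustrative sketch)."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # Blank line: the current sentence is complete.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges ('1-2') and empty nodes ('1.1').
        if "-" in cols[0] or "." in cols[0]:
            continue
        # Column 2 is FORM and column 4 is UPOS in CoNLL-U (assumed tag source).
        sentence.append((cols[1], cols[3]))
    if sentence:
        yield sentence


def format_pos_line(sentence):
    """Join (word, tag) tuples into an assumed 'word/tag word/tag ...' line."""
    return " ".join(f"{form}/{tag}" for form, tag in sentence)
```

Writing `format_pos_line(s)` for every sentence `s` of it_isdt-ud-test.conllu, one per line, would give a file shaped like data/heldout.pos under these assumptions.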