Supertagging

Supertagging in Alto

Alto has experimental support for supertagging in the alto_supertagging fork.

To parse sentences with a TAG grammar using the supertagger, you need the following files:

A TensorFlow model bundle representing the trained supertagger. This can be created from a trained model using the convert.py script in the nTagM package.
The output vocabulary file that was created when training the supertagger.
A file with word embeddings for the input words, specifically glove.6B.200d.txt. The file may be gzipped to save a few hundred MB of disk space.
The Chen grammar (d6.clean.f.str).
A supertagger configuration file (e.g. config_dropbox.yaml), which specifies (among other things) the filenames of the output vocabulary and the word embeddings.
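
Since the embeddings file may be either plain text or gzipped, it can be useful to sanity-check it before starting a parsing run. The following is a minimal Python sketch, not part of Alto or nTagM: the helper names open_maybe_gzipped and read_glove are hypothetical, and the code only assumes the standard GloVe text layout (one word per line, followed by its space-separated vector components, 200 of them for glove.6B.200d.txt).

```python
import gzip


def open_maybe_gzipped(path):
    """Open a text file, decompressing it on the fly if it is gzipped.

    Hypothetical helper: detection uses the two-byte gzip magic number
    rather than the file extension.
    """
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def read_glove(path, expected_dim=200):
    """Read GloVe text vectors into a dict mapping word -> list of floats.

    Raises ValueError if a line does not have exactly expected_dim values,
    which usually means the wrong embeddings file was supplied.
    """
    vectors = {}
    with open_maybe_gzipped(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            parts = line.rstrip("\n").split(" ")
            word, values = parts[0], parts[1:]
            if len(values) != expected_dim:
                raise ValueError(
                    f"line {line_number}: expected {expected_dim} values, "
                    f"found {len(values)}"
                )
            vectors[word] = [float(value) for value in values]
    return vectors
```

Calling read_glove("glove.6B.200d.txt") or read_glove("glove.6B.200d.txt.gz") should give the same dictionary. Loading the full file this way holds every vector in memory as Python lists, so it is meant as a one-off check rather than as the way the supertagger itself reads the embeddings.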