Miikka Silfverberg edited this page Jan 9, 2018 · 13 revisions

General

Welcome to the FinnPos wiki!

FinnPos is an open-source statistical toolkit for morphological tagging and lemmatization of Finnish and other morphologically rich languages. It is based on the conditional random field (CRF) framework.

The FinnPos toolkit includes ftb-label, a high-accuracy tagger for Finnish. It is trained on the FinnTreeBank 1 corpus and uses OMorFi, the open-source morphological analyzer for Finnish. For information on building and using ftb-label, see FinnTreeBank tagger.

For advice on installation and system requirements, see Build and Install.

Utilities

FinnPos includes the following utilities:

finnpos-train -- A utility for training your own models (see Training your own models).

finnpos-label -- A utility for morphological tagging of text using an existing model (see Tagging and lemmatization).

finnpos-eval -- A utility for comparing a tagged corpus against a gold standard corpus (see Tagging and lemmatization).

finnpos-ratna-feats.py -- A script that extracts a predefined set of features (see Feature extraction).

ftb-label -- A tagger and lemmatizer for Finnish.
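
To make concrete what comparing a tagged corpus against a gold standard involves, here is a minimal Python sketch that computes per-token label and lemma accuracy. It is not the finnpos-eval implementation and does not read the FinnPos file format: the `Token` tuple layout, the `accuracy` function, and the example labels are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical in-memory token layout: (word form, lemma, label).
# This is an assumption for illustration, not the FinnPos file format.
Token = Tuple[str, str, str]


def accuracy(gold: List[Token], tagged: List[Token]) -> Tuple[float, float]:
    """Return (label accuracy, lemma accuracy) over aligned token lists."""
    if len(gold) != len(tagged):
        raise ValueError("corpora must contain the same number of tokens")
    if not gold:
        raise ValueError("corpora must not be empty")
    # Count positions where the system output matches the gold annotation.
    label_hits = sum(g[2] == t[2] for g, t in zip(gold, tagged))
    lemma_hits = sum(g[1] == t[1] for g, t in zip(gold, tagged))
    n = len(gold)
    return label_hits / n, lemma_hits / n


# Illustrative data only; the labels are not actual FinnTreeBank labels.
gold = [
    ("koira", "koira", "NOUN"),
    ("juoksee", "juosta", "VERB"),
    ("nopeasti", "nopeasti", "ADV"),
]
tagged = [
    ("koira", "koira", "NOUN"),
    ("juoksee", "juoksea", "VERB"),
    ("nopeasti", "nopea", "ADJ"),
]

label_acc, lemma_acc = accuracy(gold, tagged)
print(f"label accuracy: {label_acc:.2%}, lemma accuracy: {lemma_acc:.2%}")
```

For real evaluation, use finnpos-eval as described in Tagging and lemmatization; the sketch only shows the kind of token-by-token comparison such a tool performs.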
