Home
Welcome to the FinnPos wiki!
FinnPos is an open-source statistical morphological tagging and lemmatization toolkit for Finnish and other morphologically rich languages. It is based on the conditional random field (CRF) framework.
The FinnPos toolkit includes a high-accuracy tagger for Finnish, ftb-label. It is trained on the FinnTreeBank 1 corpus and uses the Finnish open-source morphological analyzer OMorFi. For information on building and using ftb-label, see FinnTreeBank tagger.
For advice on installation and system requirements, see Build and Install.
FinnPos includes the following utilities:
finnpos-train -- A utility for training your own models (see Training your own models).
finnpos-label -- A utility for morphological tagging of text using an existing model (see Tagging and lemmatization).
finnpos-eval -- A utility for comparing a gold standard and tagged corpus (see Tagging and lemmatization).
finnpos-ratna-feats.py -- A feature extraction script which extracts a pre-defined set of features (see Feature extraction).
ftb-label -- A tagger and lemmatizer for Finnish.
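
To make the idea of comparing a gold standard with a tagged corpus concrete, here is a minimal Python sketch of per-token label accuracy. It is not the finnpos-eval implementation and does not read FinnPos file formats. The function name tag_accuracy and the example labels are assumptions invented for illustration; for the actual evaluation tool and its input format, see Tagging and lemmatization.

```python
from typing import Sequence


def tag_accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the fraction of positions where the predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of tokens")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty corpus")
    # Count the tokens whose predicted label matches the gold label exactly.
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)


# Hypothetical labels for a four-token sentence (invented for illustration).
gold_tags = ["NOUN", "VERB", "ADJ", "NOUN"]
predicted_tags = ["NOUN", "VERB", "NOUN", "NOUN"]
print(tag_accuracy(gold_tags, predicted_tags))
```

The same comparison could be applied to lemmas instead of tags by passing lemma sequences, assuming both corpora are tokenized identically.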