Morfessor is a tool for unsupervised and semi-supervised morphological segmentation
Morfessor FlatCat
Morfessor EM+Prune
Central repository with pretrained models for transfer learning, BPE subword-tokenization, mono/multilingual embeddings, and everything in between.
Cognate-aware morphological segmentation
Morfessor demonstration
ICEBERT: Interlingual-Clusters Enhanced BERT. A BERT-like model trained on clusters of monolingual subwords.
The concept of DAWGs is based on: Blumer, A. et al. (1985). The smallest automaton recognizing the subwords of a text. Theoretical Computer Science, 40, 31–55.
Repository for the experiments in my paper: "A Systematic Analysis of Vocabulary and BPE Settings for Optimal Fine-tuning of NMT: A Case Study of In-domain Translation"
Parsing and subword segmentation code for the VML-HD Dataset