
Daviderikmollberg/icelandic-NLP-resources

 
 


Icelandic NLP resources

This is a list of known tools and resources developed specifically for linguistic processing of Icelandic. It is intended to give readers a clear, at-a-glance overview of the ever-growing arsenal of tools for working with Icelandic natural language data.

This list is categorized by task for clarity, so some multi-functional tools and toolkits may appear more than once. If you notice that a category or resource is missing, or have suggestions on how to improve this list, please open a pull request.

Contents

Notable papers and reports

Other resource collections

  • CLARIN-IS
    • The Icelandic branch of the CLARIN-ERIC language resource initiative. Contains information on and downloads for many tools and datasets.
  • malfong.is
    • List of language technology resources, maintained by Árnastofnun.

Toolkits

  • Java toolkit which does tokenization, POS tagging, lemmatization, parsing and NER
  • Developed by Hrafn Loftsson
  • TTS frontend designed to work with the Merlin speech synthesis system developed by CSTR
  • It contains a pronunciation dictionary, a Sequitur g2p model, a stress analysis component and more. Unfortunately, it does not include any documentation.
    • Developed by Anna Björk Nikulásdóttir at LVL

Tokenization and text normalization

POS tagging

Syntactic parsing

Grapheme-to-phoneme

Stress analysis
