Zoëga's A Concise Dictionary of Old Icelandic parser
With the reader.py module, you can:
- search for words whose edit distance to a query word is below a given threshold,
- extract the POS tags in dictionary entries,
- search for an exact entry or an approximate entry.
However, the POS tag extractor is not very effective yet; more special cases need to be handled.
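As an illustration of the edit-distance search, here is a minimal sketch. It does not reproduce the code in reader.py: the function names `edit_distance` and `approximate_search` and the toy entries are assumptions made for this example, and the comparison keeps words at most `max_distance` away from the query.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def approximate_search(query: str, entries: Dict[str, str],
                       max_distance: int) -> List[str]:
    """Return the headwords at most `max_distance` edits away from `query`."""
    return sorted(word for word in entries
                  if edit_distance(query, word) <= max_distance)


# Hypothetical toy entries, not read from the real dictionary files.
toy_entries = {"armr": "arm", "arfr": "inheritance", "hestr": "horse"}
matches = approximate_search("armr", toy_entries, 1)
```

With a maximum distance of 0 the same function behaves as an exact lookup, so one code path can serve both kinds of search.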
TODO list:
- look up a word in the dictionary,
- search for all words in the dictionary that are at most a given edit distance from the query word,
- for a given dictionary entry, give all its inflected forms (partially done),
- handle more dictionary entries,
- process all entries so that we get virtually all the Old Norse words,
- for each form, associate lemmas with a proposed POS tag.
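The last item could be approached with an inverted index from inflected forms to lemmas. The sketch below is only one possible design, not something the module currently provides: `build_form_index`, the input layout and the toy paradigm are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical input layout: lemma -> (proposed POS tag, inflected forms).
toy_lemmas: Dict[str, Tuple[str, List[str]]] = {
    "hestr": ("m", ["hestr", "hest", "hesti", "hests"]),
}


def build_form_index(
    lemmas: Dict[str, Tuple[str, List[str]]]
) -> Dict[str, Set[Tuple[str, str]]]:
    """Map every inflected form to the set of (lemma, POS tag) pairs it may belong to."""
    index: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for lemma, (pos, forms) in lemmas.items():
        for form in forms:
            index[form].add((lemma, pos))
    return dict(index)


form_index = build_form_index(toy_lemmas)
candidates = form_index.get("hesti", set())
```

Storing a set of candidates per form keeps ambiguous forms, which may belong to several lemmas, instead of silently overwriting them.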
Data come from https://github.com/GreekFellows/lesser-dannatt and http://norroen.info/dct/zoega by Tim Ermolaev.
Then utils.first_step() is launched. The files are modified to ensure XML syntax consistency, and finally utils.second_step() is launched.
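To check whether a file still needs fixing before the second step, a well-formedness test with the standard library may help. This sketch does not show what utils.first_step() or utils.second_step() do; the helper `xml_error` and the element names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def xml_error(text: str) -> Optional[str]:
    """Return None if `text` is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as error:
        return str(error)
    return None


# Hypothetical element names, chosen only for the example.
good = "<entry><word>hestr</word></entry>"
bad = "<entry><word>hestr</entry>"  # <word> is never closed

good_result = xml_error(good)
bad_result = xml_error(bad)
```

The error message returned for a malformed string includes the line and column reported by the parser, which helps locate the tag to repair.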