Unsupervised morphological analysis of words in a lexicon given a raw corpus in which they appear

alexerdmann/ParadigmDiscovery

Unsupervised Paradigm Discovery

Overview

This repository accompanies our ACL 2020 paper, The Paradigm Discovery Problem. The code released here is the published system; better results have been achieved since the paper was accepted. If you're interested in the improved code, please reach out to me.

Prerequisites

This repository was tested with TensorFlow 1.14. I tried to make the code forward compatible with TensorFlow 2.0: in theory, you should only have to comment out line 140 in Scripts/ANA.py, though I have not tested this.
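If you are unsure which TensorFlow version your environment has, a small check like the one below can tell you whether you are on the tested 1.14 release or on the untested 2.x path. This is only a sketch and not part of the repository: the helper names `installed_tf_version` and `describe` are hypothetical, and it assumes TensorFlow was installed under the distribution name `tensorflow` (variants such as `tensorflow-gpu` would need that name passed in instead).

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_tf_version(dist_name="tensorflow"):
    """Return (major, minor) of the installed distribution, or None if absent.

    Hypothetical helper; dist_name is an assumption about how TensorFlow
    was installed in your environment.
    """
    try:
        raw = version(dist_name)
    except PackageNotFoundError:
        return None
    match = re.match(r"(\d+)\.(\d+)", raw)
    return (int(match.group(1)), int(match.group(2))) if match else None


def describe(tf_version):
    """Map a (major, minor) tuple onto the situations described above."""
    if tf_version is None:
        return "TensorFlow not found"
    if tf_version == (1, 14):
        return "tested version"
    if tf_version[0] >= 2:
        return "untested: TensorFlow 2.x"
    return "untested: other TensorFlow 1.x"


if __name__ == "__main__":
    print(describe(installed_tf_version()))
```

The check reads package metadata rather than importing TensorFlow, so it runs quickly and does not fail if the import itself is broken.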

Usage

After unzipping the Data directory, first select the language and part of speech you want to run.

lgPOS=ara_N  # other supported language-POS pairs: deu_N, eng_V, lat_N, rus_N

Then run the system with the following command:

python Scripts/ANA.py -C Data/Corpora/corp.$lgPOS -L Data/Lexica/lex.$lgPOS -m MyModel -l $lgPOS -U Data/LexiconUDintersects/inter_lex.$lgPOS -e Data/Analogies/an.$lgPOS
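To run every supported language-POS pair in turn, the command above can be generated programmatically. The following is a minimal sketch, not part of the repository: `build_command` is a hypothetical helper, the file paths simply follow the pattern of the command above, and the flags are copied from it verbatim. The README does not say whether two runs with the same model name overwrite each other, so the sketch uses a distinct name per pair as a precaution.

```python
import shlex
from pathlib import PurePosixPath

# Language-POS pairs listed in the Usage section.
SUPPORTED = ["ara_N", "deu_N", "eng_V", "lat_N", "rus_N"]


def build_command(lg_pos, model_name="MyModel", data_dir="Data"):
    """Return the argument list for one Scripts/ANA.py run (hypothetical helper)."""
    if lg_pos not in SUPPORTED:
        raise ValueError(f"unsupported language-POS pair: {lg_pos}")
    data = PurePosixPath(data_dir)
    return [
        "python", "Scripts/ANA.py",
        "-C", str(data / "Corpora" / f"corp.{lg_pos}"),
        "-L", str(data / "Lexica" / f"lex.{lg_pos}"),
        "-m", model_name,
        "-l", lg_pos,
        "-U", str(data / "LexiconUDintersects" / f"inter_lex.{lg_pos}"),
        "-e", str(data / "Analogies" / f"an.{lg_pos}"),
    ]


if __name__ == "__main__":
    # Print one command per pair; the per-pair model name is an assumption.
    for pair in SUPPORTED:
        print(shlex.join(build_command(pair, model_name=f"MyModel_{pair}")))
```

The script only prints the commands. To execute them from the repository root, pass each argument list to `subprocess.run(cmd, check=True)`.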
