Releases: CAMeL-Lab/arabic-gec
arabic-gec
Release of the morphological database used for morphological preprocessing:

calima-msa-s31_0.4.2.db.muddled: a muddled version of the extended Standard Arabic Morphological Analyzer (SAMA) database. This file must be unmuddled before it can be used to reproduce our results. To unmuddle it, first obtain the original SAMA 3.1 database from the LDC, along with muddler. Then run the following command:

muddler unmuddle -s LDC2010L01.tgz -m calima-msa-s31_0.4.2.db.muddled calima-msa-s31_0.4.2.db

This produces the extended database used in our experiments (i.e., calima-msa-s31_0.4.2.db).
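
The unmuddling step fails in unhelpful ways if one of its inputs is missing, so a small wrapper that checks the prerequisites first may be convenient. The sketch below is only an illustration: the function name unmuddle_db and its return codes are hypothetical and not part of this release or of muddler; the only command taken from the text above is the muddler unmuddle invocation itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of the release): checks inputs, then runs
# the muddler command exactly as given in the release notes.
unmuddle_db() {
    sama_archive="$1"   # original SAMA 3.1 archive from the LDC, e.g. LDC2010L01.tgz
    muddled_db="$2"     # muddled file from this release
    output_db="$3"      # path where the unmuddled database is written

    # Both input files must exist before muddler is invoked.
    for f in "$sama_archive" "$muddled_db"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
    done

    # muddler must be installed and reachable on PATH.
    if ! command -v muddler >/dev/null 2>&1; then
        echo "muddler not found on PATH" >&2
        return 2
    fi

    muddler unmuddle -s "$sama_archive" -m "$muddled_db" "$output_db"
}
```

With the function defined, a call such as unmuddle_db LDC2010L01.tgz calima-msa-s31_0.4.2.db.muddled calima-msa-s31_0.4.2.db would run the same command as above, but only after confirming that both files are present and muddler is available.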