schwa-deletion

Machine learning models for schwa deletion in Hindi and Punjabi.

Pretrained models are included in the models subfolder of each language's directory. They are built with scikit-learn's MLPClassifier and LogisticRegression and with XGBoost's XGBClassifier, and they achieve state-of-the-art performance.
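As a rough illustration of how schwa deletion can be framed as a supervised classification problem, the sketch below turns each orthographic schwa in a phoneme sequence into one row of context features, which a classifier could then label as "deleted" or "kept". The helper name `schwa_contexts`, the padding symbol, and the window-based feature layout are assumptions made for this example; they are not taken from this repository's main.py.

```python
from typing import Dict, List, Tuple

SCHWA = "ə"   # phoneme symbol used for the inherent vowel in this sketch
PAD = "#"     # hypothetical word-boundary marker


def schwa_contexts(phonemes: List[str], window: int = 2) -> List[Tuple[int, Dict[str, str]]]:
    """Return (position, features) for every schwa in the sequence.

    Features are the neighbouring phonemes within `window` positions,
    keyed by their signed offset (for example "p-1" or "p+2").
    """
    padded = [PAD] * window + phonemes + [PAD] * window
    rows = []
    for i, phoneme in enumerate(phonemes):
        if phoneme != SCHWA:
            continue
        centre = i + window  # index of this schwa inside the padded list
        feats = {}
        for offset in range(-window, window + 1):
            if offset == 0:
                continue  # the schwa itself carries no information
            feats[f"p{offset:+d}"] = padded[centre + offset]
        rows.append((i, feats))
    return rows


# Hindi "namak" (salt) is written with three inherent schwas, n-ə-m-ə-k-ə,
# but only the word-final one is dropped in pronunciation.
word = ["n", "ə", "m", "ə", "k", "ə"]
labels = [False, False, True]  # True = schwa is deleted

rows = schwa_contexts(word)
for (position, feats), deleted in zip(rows, labels):
    print(position, feats, "deleted" if deleted else "kept")
```

Rows like these would still need to be encoded numerically (for example, one-hot) before being passed to a classifier; the features actually used by the included models may differ.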

The results of this research are presented in the paper below:

"Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi", Aryaman Arora, Luke Gessler, and Nathan Schneider (2020). In Proceedings of ACL. Preprint: https://arxiv.org/abs/2004.10353

Usage

Ensure that you are using the most recent Python 3 version.

Clone the repo and install the requirements:

git clone https://github.com/aryamanarora/schwa-deletion.git
cd schwa-deletion
pip install -r requirements.txt

Test the pretrained Hindi XGBoost model:

cd hindi
python test.py

See test.py for an example of how to use main.py as a module.
