dialect-identification

Here are 13 public repositories matching this topic...

CAMeL-Lab / camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

nlp sentiment-analysis named-entity-recognition nlp-apis arabic nlp-library pos-tagging morphological-analysis stemming arabic-dialects dialect-identification morphological-generation morphological-disambiguation morphological-reinflection

Updated Mar 5, 2026
Python

TunBERT is the first release of a pre-trained BERT model for the Tunisian dialect, trained on a Tunisian Common-Crawl-based dataset. TunBERT was applied to three downstream NLP tasks: Sentiment Analysis (SA), Tunisian Dialect Identification (TDI), and Reading Comprehension Question-Answering (RCQA).

nlp sentiment-analysis question-answering dialect-identification bert-models

Updated Feb 13, 2023
Python

sinaahmadi / CORDI

Language and Speech Technology for Central Kurdish Varieties (LREC-COLING 2024)

machine-translation automatic-speech-recognition kurdish sorani language-identification dialect-identification kurdish-language-processing erbil sulaymaniyah mahabad sanandaj

Updated Nov 29, 2024
Python

AlexYangLi / DMT

VarDial19 shared task: Discriminating between Mainland and Taiwan Variation of Mandarin Chinese (DMT)

dialect mandarin dialect-identification mandarin-chinese

Updated Apr 10, 2019
Python

a-coles / SMS-Stylometry

A tool that predicts the English dialect of an SMS message using recurrent neural networks supplemented with data from Google Trends.

sms rnn stylometry google-trends authorship-identification dialect-identification location-detection

Updated Dec 19, 2017
Python

Cyr-Ch / german-dialect-aware-g2p

Dialect-aware grapheme-to-phoneme conversion for German using Transformer + XLM-R. Context-aware, multi-dialect support with CTC+CE training. Built with PyTorch Lightning & Hydra.

transformer grapheme-to-phoneme dialect-identification