
Tibetan BERT Model

Introduction

You can find our Tibetan BERT model on Hugging Face (UTibetNLP/tibetan_bert).
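For example, assuming the Hugging Face `transformers` library and PyTorch are installed, the pre-trained weights can be loaded directly from the Hub. This is a minimal sketch of loading the checkpoint with a masked-language-model head, not the paper's fine-tuning setup:

```python
# Minimal sketch: load the Tibetan BERT checkpoint from the Hugging Face Hub.
# Requires: pip install transformers torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("UTibetNLP/tibetan_bert")
model = AutoModelForMaskedLM.from_pretrained("UTibetNLP/tibetan_bert")
```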

Citation

Please cite our paper if you use this model:

@inproceedings{10.1145/3548608.3559255,
author = {Zhang, Jiangyan and Kazhuo, Deji and Gadeng, Luosang and Trashi, Nyima and Qun, Nuo},
title = {Research and Application of Tibetan Pre-Training Language Model Based on BERT},
year = {2022},
isbn = {9781450397179},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3548608.3559255},
doi = {10.1145/3548608.3559255},
abstract = {In recent years, pre-training language models have been widely used in the field of natural language processing, but the research on Tibetan pre-training language models is still in the exploratory stage. To promote the further development of Tibetan natural language processing and effectively solve the problem of the scarcity of Tibetan annotation data sets, the article studies the Tibetan pre-training language model based on BERT. First, given the characteristics of the Tibetan language, we constructed a data set for the BERT pre-training language model and downstream text classification tasks. Secondly, construct a small-scale Tibetan BERT pre-training language model to train it. Finally, the performance of the model was verified through the downstream task of Tibetan text classification, and an accuracy rate of 86\% was achieved on the task of text classification. Experiments show that the model we built has a significant effect on the task of Tibetan text classification.},
booktitle = {Proceedings of the 2022 2nd International Conference on Control and Intelligent Robotics},
pages = {519–524},
numpages = {6},
location = {Nanjing, China},
series = {ICCIR '22}
}
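As the abstract notes, the model was evaluated on a downstream Tibetan text-classification task. The snippet below is a hedged sketch of that kind of fine-tuning step with the `transformers` library; the label count, texts, and labels are placeholders, not the paper's dataset:

```python
# Hedged sketch of one fine-tuning step for text classification.
# The number of labels and the training examples below are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("UTibetNLP/tibetan_bert")
model = AutoModelForSequenceClassification.from_pretrained(
    "UTibetNLP/tibetan_bert", num_labels=2  # placeholder class count
)

# Replace with real Tibetan sentences and their class labels.
texts = ["placeholder Tibetan sentence 1", "placeholder Tibetan sentence 2"]
labels = torch.tensor([0, 1])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)  # returns loss when labels are given
outputs.loss.backward()
optimizer.step()
```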
