380000-Groups-Uighur-Chinese-Parallel-Corpus-Data

Description

Uighur text and its parallel Chinese text, 380,000 groups (pairs) in total. The data has been cleaned, desensitized, and quality-checked. It can be used as a base corpus for text data analysis in machine translation and related fields.

For more details, please refer to the link: https://www.nexdata.ai/datasets/nlu/194?source=Github

Specifications

Storage format

TXT

Data content

Uighur-Chinese Parallel Corpus Data

Data size

380,000 (0.38 million) Uighur-Chinese parallel pairs

Language

Uighur, Chinese

Application scenario

Machine translation
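
The specifications above give the storage format only as TXT and do not describe how each pair is laid out inside a file. As a rough illustration of preparing the corpus for machine translation work, the sketch below assumes one pair per line with the Uighur and Chinese text separated by a tab. That layout, the function name `read_pairs`, and the `sep` parameter are assumptions for illustration, not part of the dataset documentation; check the delivered files and adjust accordingly.

```python
from typing import Iterator, Tuple


def read_pairs(path: str, sep: str = "\t") -> Iterator[Tuple[str, str]]:
    """Yield (uighur, chinese) pairs from a UTF-8 text file.

    Assumption: one pair per line, source and target separated by `sep`.
    The actual layout of the delivered TXT files may differ.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Skip blank lines.
            if not line.strip():
                continue
            # Split on the first separator only, so the target text
            # may itself contain the separator character.
            parts = line.rstrip("\r\n").split(sep, 1)
            if len(parts) != 2:
                continue  # skip lines that do not contain the separator
            src, tgt = parts[0].strip(), parts[1].strip()
            # Keep only pairs where both sides are non-empty.
            if src and tgt:
                yield src, tgt
```

Because the function is a generator, it reads the file lazily, so a corpus of several hundred thousand pairs can be iterated without loading it all into memory; wrap the call in `list(...)` if random access is needed.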

Licensing Information

Commercial License