GitHub

This project contains parallel translation data for Russian-to-Tuvan and Tuvan-to-Russian translation.

About the data:

The data was collected through the www.tyvan.ru platform by linguists, scientists, journalists, volunteers, and others.

Folder Data:

The 50K file is split into training, validation, and test data.

The validation and test sentences are located at the end of the file.
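
If the validation and test sentences do sit at the end of the 50K file, the three parts can be recovered by slicing from the tail. The following is a minimal sketch under several assumptions: the rows are already loaded into a list, the order is training, then validation, then test, and the helper `split_tail` with its `n_valid` and `n_test` parameters is hypothetical. The README does not state the split sizes or the file format, so both must come from inspecting the file.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_tail(
    rows: Sequence[T], n_valid: int, n_test: int
) -> Tuple[List[T], List[T], List[T]]:
    """Split rows into (train, valid, test), taking valid and test from the end.

    Assumes the order train | valid | test. n_valid and n_test are
    hypothetical parameters, not sizes taken from the dataset.
    """
    if n_valid < 0 or n_test < 0:
        raise ValueError("split sizes must be non-negative")
    if n_valid + n_test > len(rows):
        raise ValueError("validation + test sizes exceed the number of rows")
    n_train = len(rows) - n_valid - n_test
    train = list(rows[:n_train])
    valid = list(rows[n_train:n_train + n_valid])
    test = list(rows[n_train + n_valid:])
    return train, valid, test


# Toy usage with integers standing in for sentence pairs.
train, valid, test = split_tail(list(range(10)), n_valid=2, n_test=3)
print(len(train), len(valid), len(test))
```

Slicing by position keeps the original row order intact, which matters here because the split is encoded only by where a row appears in the file.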

Folder For Yandex:

This folder contains the datasets, with 306615 translations.

Dataset Structure

The dataset contains Tyvan-Russian pairs.

Each data row has the following fields:

tyv: str: text in Tuvan
ru: str: text in Russian (translation)
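
The two-field row structure can be expressed as a typed record together with a small check. This is a sketch only: the on-disk file format is not stated in this README, so it assumes each row has already been parsed into a Python dict. The names `Pair` and `is_valid_pair` are hypothetical, and the example strings are placeholders rather than dataset content.

```python
from typing import TypedDict


class Pair(TypedDict):
    tyv: str  # text in Tuvan
    ru: str   # text in Russian (translation)


def is_valid_pair(row: object) -> bool:
    """Return True if row is a dict with non-empty string fields `tyv` and `ru`."""
    if not isinstance(row, dict):
        return False
    return all(
        isinstance(row.get(key), str) and bool(row[key].strip())
        for key in ("tyv", "ru")
    )


# Placeholder values, not real sentences from the dataset.
example: Pair = {"tyv": "<Tuvan text>", "ru": "<Russian text>"}
print(is_valid_pair(example))
```

A check like this is useful before training, since the README notes that no additional filtering or postprocessing has been applied to the data.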

Dataset Details

Dataset Description

Curated by: Ali Kuzhuget (tech and data), Ondar Choygan (data), and contributors
Language(s) (NLP): Tyvan (Tuvan), Russian
License: CC BY 4.0

Brief information about the languages:

Language	Language code on the website	ISO 639-3	Glottolog
Tyvan	`tyv`	`tyv`	`tuvi1240`
Russian	`rus`	`rus`	`russ1263`

Dataset Sources

The dataset has been downloaded from www.tyvan.ru.

Uses

The dataset is intended to help humans and machines learn the low-resource Tyvan (Tuvan) language and Russian.

Dataset Creation

The dataset was curated as a source for training machine translation systems and building other NLP tools. It consists of donated and professional translations from books and websites. They were downloaded from the www.tyvan.ru website and refined by Ali Kuzhuget. No additional filtering or postprocessing has been applied.

Agisight/TyvaData
