Bitextor generates translation memories from multilingual websites
This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.
The Business Scene Dialogue corpus
Curated list of publicly available parallel corpora for Indian languages
Yet another search platform for linguistic corpora.
OPUS (opus.nlpl.eu) Python3 API
A corpus that can be used to train English-to-Italian End-to-End Speech-to-Text Machine Translation models
AMI Meeting Parallel Corpus
A simple and efficient tool for mining and aligning sentences with pre-trained models.
Repository of supplementary materials and the RStudio project for a paper on a corpus-based approach to measuring constructional equivalence.
A program for calculating corpora alignments using a pivot language
Word-alignment models for Bible translations in 100+ historical and contemporary languages
Code for simplifying text alignment with hunalign, and documentation on how to use LFAligner
1990000-Groups-Chinese-Czech-Parallel-Corpus-Data
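Repository of R code and data for the analysis in the research project titled "A Translation Studies Model Based on an English-Indonesian Digital Translation Database and Its Pedagogical Implications" (original title: MODEL KAJIAN TERJEMAHAN BERBASIS BANK DATA TERJEMAHAN DIGITAL INGGRIS-INDONESIA DAN IMPLIKASI PEDAGOGISNYA)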