Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo
🤖 A PyTorch library of curated Transformer models and their composable components
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
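As a rough illustration of how such a toolkit is typically used, the sketch below runs Trankit's multilingual pipeline on a short sentence; the `Pipeline` class and the `'english'` language name follow Trankit's documented usage, but treat the exact arguments and output layout as assumptions and check the project's README.

```python
from trankit import Pipeline

# Build an English pipeline (downloads the underlying XLM-R based models on first use).
# Language name and defaults here are assumptions; see the Trankit docs for options.
p = Pipeline('english')

# Run the full pipeline: sentence segmentation, tokenization, tagging, parsing, etc.
doc = p('Trankit is a lightweight toolkit built on top of XLM-RoBERTa.')
print(doc.keys())  # the result is a plain Python dict describing the processed document
```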
CINO: Pre-trained Language Models for Chinese Minority Languages
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accepted in Findings of the Annual Conference of the North American Chap…
Unattended Lightweight Text Classifiers with LLM Embeddings
Deep-learning system proposed by HFL for SemEval-2022 Task 8: Multilingual News Similarity
Resources and tools for the Tutorial - "Hate speech detection, mitigation and beyond" presented at ICWSM 2021
An implementation of drophead regularization for pytorch transformers
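For readers unfamiliar with the idea, drophead randomly zeroes out entire attention heads during training, analogous to dropout on individual units. The snippet below is a minimal, generic sketch of that masking step, not the repository's actual implementation; the tensor layout and drop rate `p` are assumptions.

```python
import torch

def drophead(attn_output: torch.Tensor, p: float = 0.1, training: bool = True) -> torch.Tensor:
    """Randomly drop whole attention heads (with inverted-dropout rescaling).

    attn_output is assumed to have shape (batch, num_heads, seq_len, head_dim).
    """
    if not training or p == 0.0:
        return attn_output
    batch, num_heads = attn_output.shape[:2]
    # One keep/drop decision per (example, head), broadcast over positions and dims.
    keep = torch.rand(batch, num_heads, 1, 1, device=attn_output.device) >= p
    return attn_output * keep.to(attn_output.dtype) / (1.0 - p)
```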
Improving Low-Resource Neural Machine Translation of Related Languages by Transfer Learning
Improving Bilingual Lexicon Induction with Cross-Encoder Reranking (Findings of EMNLP 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
Language Model Decomposition: Quantifying the Dependency and Correlation of Language Models
Official repository of the ACL 2024 paper "Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In!".
Fine-tuning Question Answering models on German with the GermanQuAD dataset
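As a hedged sketch of what the fine-tuning data looks like, the snippet below loads GermanQuAD through Hugging Face `datasets`; the dataset id `deepset/germanquad` is an assumption, so substitute whichever copy of the data the repository actually uses.

```python
from datasets import load_dataset

# Assumed Hub id for GermanQuAD; adjust if the repository ships its own copy.
germanquad = load_dataset("deepset/germanquad")

example = germanquad["train"][0]
print(example["question"])
print(example["answers"]["text"][0])  # SQuAD-style answer span (assumed field layout)
```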
[KDD 2024] Improving the Consistency in Cross-Lingual Cross-Modal Retrieval with 1-to-K Contrastive Learning
Our source code for the EACL 2021 workshop task: Offensive Language Identification in Dravidian Languages. We ranked 4th, 4th, and 3rd in the Tamil, Malayalam, and Kannada tracks of this task!🥳
⚡ The system extracts answers from a given context
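A minimal example of extractive question answering with the `transformers` pipeline is sketched below; the checkpoint name is an assumption, and any extractive QA model (multilingual or not) can be substituted.

```python
from transformers import pipeline

# Assumed checkpoint; swap in the model the repository actually provides.
qa = pipeline("question-answering", model="deepset/xlm-roberta-large-squad2")

result = qa(
    question="Where is the Eiffel Tower?",
    context="The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
)
print(result["answer"], result["score"])
```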
This repository contains a number of experiments with multilingual Transformer models (multilingual BERT, DistilBERT, XLM-RoBERTa, mT5, and ByT5) focused on the Dutch language.
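Since this topic page centres on XLM-RoBERTa, here is a minimal sketch of applying the base checkpoint to a Dutch masked sentence; the example sentence and decoding step are illustrative only and are not taken from the repository above.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")

# Dutch: "The capital of the Netherlands is <mask>."
inputs = tokenizer("De hoofdstad van Nederland is <mask>.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Find the masked position and print the model's top prediction for it.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```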
Can Cross-domain Term Extraction Benefit from Cross-lingual Transfer?