AI Singapore
- Singapore
Dataset and Benchmark
Code and data for XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence
Large-scale multi-document summarization dataset and code
Materials related to our Sinn und Bedeutung 23 paper
Evaluating Cross-lingual Sentence Representations
Dataset Catalogue Homepage for Indonesian Languages
Measuring Massive Multitask Language Understanding | ICLR 2021
Facebook Low Resource (FLoRes) MT Benchmark
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
Official repository for "Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems"
Library Genesis (libgen) db dumps mirror on ipfs
Large datasets for conversational AI
Bhinneka Korpus: A Collection of Multilingual Parallel Datasets for 5 Indonesian Local Languages
Proxy server to bypass Cloudflare protection
A large-scale multilingual dataset for information retrieval, with thorough human annotations across 18 diverse languages.