Data augmentation for NLP, presented at EMNLP 2019
Updated Mar 19, 2023 - Python
🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection library that works out of the box.
📃 Language-model-based sentence scoring library
ICLR 2018 Quick-Thought vectors
Extract information from a web corpus using Open Information Extraction.
Code for KaLM-Embedding models
TensorFlow implementation of Variational Attention for Sequence-to-Sequence Models (COLING 2018)
A sentence segmentation library with wide language support optimized for speed and utility.
A web application that provides an interface to two GEC (grammatical error correction) systems. [web instance is down]
TensorFlow implementation of Stochastic Wasserstein Autoencoder for Probabilistic Sentence Generation (NAACL 2019).
A program that can generate a secure password of up to 100 characters, extract securely selected words from the diceware wordlist, generate a password from a sentence, and check for vulnerabilities in a given password.
63k Chinese sentences with simplified, traditional, pinyin, and English translations for offline use
word2vec with a context based on sentences.
A Quick-Thought implementation in PyTorch.
Yet Another Sequence Encoder - Encode sequences into a vector of vectors in Python!