
Embeddings in Natural Language Processing

Quantitative Comparison on Downstream Tasks

Ester Hlav, 2019

Leveraging the power of PyTorch and the open-source libraries Flair and SentEval, we perform a fair quantitative comparison of natural language processing (NLP) embeddings -- from classical embeddings such as Word2Vec and GloVe to recent state-of-the-art transformer-based embeddings such as BERT and RoBERTa. We benchmark them on downstream tasks by fine-tuning a top layer, while allowing a certain architectural flexibility for each specific embedding. Using Sequential Bayesian Optimization, we tune our models for optimal performance and implement an extension of the hyperparameter optimization library Hyperopt to correct for inconsistencies in sampling from log-base-2 and log-base-10 uniform distributions for the training batch size and learning rate. Our results show transformer-based architectures (BERT, RoBERTa) to be the leading models on all downstream NLP tasks used.
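As an illustration of the kind of search space involved (this is a minimal sketch with stock Hyperopt, not the repository's log_b extension), a log-base-10 uniform prior for the learning rate can be expressed via `hp.loguniform` (which takes natural-log bounds), and a discrete log-base-2 prior for the batch size via integer exponents of 2. The objective below is a placeholder so the sketch runs end to end.

```python
import math
from hyperopt import hp, fmin, tpe, Trials

def objective(params):
    lr = params["lr"]
    batch_size = int(2 ** params["log2_batch_size"])   # e.g. 8, 16, 32, 64
    # ... train the top-layer classifier here and return the validation error ...
    return 0.0  # placeholder

space = {
    # Learning rate: uniform on a log_10 scale over [1e-5, 1e-2].
    # hp.loguniform is parameterized with natural-log bounds, so the range is converted.
    "lr": hp.loguniform("lr", math.log(1e-5), math.log(1e-2)),
    # Batch size: uniform over integer powers of 2 (a discrete log_2 prior).
    "log2_batch_size": hp.quniform("log2_batch_size", 3, 6, 1),
}

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50, trials=Trials())
print(best)
```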

The project includes a report with a detailed description of the embeddings, empirical tests, and results, as well as a presentation split into two parts. Part 1 is a general overview of NLP embeddings: how to represent textual input for neural networks. Part 2 reports the empirical results of the quantitative embedding comparison conducted with PyTorch and the libraries SentEval, Flair, and Hyperopt on some of the most common NLP tasks (e.g. sentiment analysis, question answering).

Part1: How to Represent Text for Neural Networks

 

The following animations provide visual representations of the differences in sequence-to-sequence (seq2seq) architectures for specific embeddings:

Seq2seq Encoder-Decoder

Seq2seq bi-directional Encoder-Decoder with Attention -- CoVe, ELMo

Seq2seq Transformer -- BERT, XLM, RoBERTa

In the Transformer animations, each appearing element represents a key/value pair, and the element with the largest presence represents the query (see the attention sketch after this list).

Seq2seq Transformer-XL -- XLNet

 
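To make the key/value/query picture concrete, here is a minimal PyTorch sketch of scaled dot-product attention for a single query vector. It is an illustration only, not code from this repository.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, keys, values):
    """Attention of one query over a set of key/value pairs."""
    # query: (d,), keys/values: (n, d)
    scores = keys @ query / keys.shape[-1] ** 0.5   # one score per key
    weights = F.softmax(scores, dim=-1)             # the "presence" of each element
    return weights @ values                         # weighted sum of the values

# Toy example: 4 key/value elements attending to a single query vector.
d = 8
query = torch.randn(d)
keys = torch.randn(4, d)
values = torch.randn(4, d)
print(scaled_dot_product_attention(query, keys, values).shape)  # torch.Size([8])
```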

Part2: Quantitative Comparison on Downstream Tasks

Results -- Accuracy across Tasks and Architectures

 

Code

Dependencies

The code runs as an extension of the following libraries:

  • SentEval from FAIR (modified version)
  • Flair from Zalando Research

Running the code

Install the dependencies.

pip install flair segtok bpemb deprecated pytorch_transformers allennlp

Run the bash script to get the data for downstream tasks.

data/downstream/get_transfer_data.bash

Launch the Jupyter notebook Embeddings_benchmark.ipynb for training.
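The notebook relies on Flair's embedding classes. The following is a minimal sketch (not taken from the notebook) of how a static and a transformer-based embedding can be instantiated and applied to a sentence; `TransformerWordEmbeddings` is the wrapper available in recent Flair releases, while older versions expose per-model classes such as `BertEmbeddings` instead.

```python
from flair.data import Sentence
from flair.embeddings import WordEmbeddings, TransformerWordEmbeddings

# Classical static embedding (Flair's "glove" model, 100-dimensional vectors).
glove = WordEmbeddings("glove")
# Contextual transformer embedding (assumes a Flair release providing TransformerWordEmbeddings).
roberta = TransformerWordEmbeddings("roberta-base")

static_sentence = Sentence("Embeddings turn words into vectors.")
glove.embed(static_sentence)
print(static_sentence[0].text, static_sentence[0].embedding.shape)

contextual_sentence = Sentence("Embeddings turn words into vectors.")
roberta.embed(contextual_sentence)
print(contextual_sentence[0].text, contextual_sentence[0].embedding.shape)
```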
