It seems like many of the best performing models on the GLUE benchmark make some use of multitask learning (simultaneous training on multiple tasks).
The T5 paper highlights multiple ways of mixing the tasks together during fine-tuning:
- Examples-proportional mixing - sample from tasks proportionally to their dataset size
- Equal mixing - sample uniformly from each task
- Temperature-scaled mixing - the generalized approach used by multilingual BERT, which uses a temperature T: the mixing rate of each task is raised to the power 1/T and renormalized. When T=1 this is equivalent to examples-proportional mixing, and it approaches equal mixing as T increases (see the sketch below).
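For concreteness, here is a minimal sketch of how temperature-scaled mixing rates could be computed from dataset sizes. It is not part of any existing API; the function name and the optional size cap K (taken from the T5 paper) are assumptions.

```python
from typing import Dict, Optional


def mixing_rates(sizes: Dict[str, int], temperature: float = 1.0,
                 size_limit: Optional[int] = None) -> Dict[str, float]:
    """Probability of sampling each task, following the T5 temperature-scaling recipe."""
    # Optionally cap each dataset size at an artificial limit K (as in the T5 paper).
    capped = {name: min(n, size_limit) if size_limit else n for name, n in sizes.items()}
    # Raise each (capped) size to the power 1/T and renormalise.
    scaled = {name: n ** (1.0 / temperature) for name, n in capped.items()}
    total = sum(scaled.values())
    return {name: v / total for name, v in scaled.items()}


# T=1 reproduces examples-proportional mixing; large T approaches equal mixing.
print(mixing_rates({"squad": 88_000, "imdb": 25_000, "cnn_dm": 287_000}, temperature=2.0))
```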
Following the discussion in huggingface/transformers#4340, @enzoampil suggested that the nlp library might be a better place for this functionality.
Some method for combining datasets could be implemented, e.g.
`dataset = nlp.load_multitask(['squad', 'imdb', 'cnn_dm'], temperature=2.0, ...)`
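As a rough sketch of what such a helper could look like (load_multitask does not exist in the library; the signature, the infinite example generator, and the dataset identifiers are all assumptions, and some datasets such as CNN/DailyMail would also need a config name):

```python
import random
import nlp


def load_multitask(names, temperature=1.0, split="train", seed=0):
    """Hypothetical helper: sample examples across tasks with temperature-scaled rates."""
    datasets = {name: nlp.load_dataset(name, split=split) for name in names}
    # Temperature-scaled mixing rates, as in the sketch above.
    scaled = {name: len(ds) ** (1.0 / temperature) for name, ds in datasets.items()}
    total = sum(scaled.values())
    rates = {name: v / total for name, v in scaled.items()}

    rng = random.Random(seed)
    tasks, weights = zip(*rates.items())

    def examples():
        # Pick a task according to its mixing rate, then a random example from that task.
        while True:
            task = rng.choices(tasks, weights=weights, k=1)[0]
            ds = datasets[task]
            yield task, ds[rng.randrange(len(ds))]

    return examples()
```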
We would need a few additions:
- A method of identifying the tasks - how could we support adding an identifier string to each task's examples, e.g. 'summarisation: '?
- A method of combining the metrics - a standard approach is to compute each task's own metric and add the scores together for a combined score (a sketch of both is given after this list).
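To make these two additions more concrete, here is a hedged sketch using existing nlp primitives. The field name, metric names, and the `score_key` mapping are placeholders, since each dataset and metric exposes different columns and result keys.

```python
import nlp


# 1) Task identifiers: prepend a prefix string to the text field of each example.
#    The field name ('document' here) varies per dataset, so a real implementation
#    would need a per-task mapping of which column to prefix.
def add_task_prefix(example, prefix="summarisation: ", field="document"):
    example[field] = prefix + example[field]
    return example

# e.g. cnn_dm = cnn_dm.map(add_task_prefix)


# 2) Combined metric: evaluate each task with its own metric and sum the scores.
#    `metric_names` maps task -> metric name, `score_key` maps task -> which key of
#    the metric's result dict to use (e.g. ROUGE vs. exact match report differently).
def combined_score(task_predictions, task_references, metric_names, score_key):
    total = 0.0
    for task, metric_name in metric_names.items():
        metric = nlp.load_metric(metric_name)
        result = metric.compute(predictions=task_predictions[task],
                                references=task_references[task])
        total += result[score_key[task]]
    return total
```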
It would be great to support common use cases such as pretraining on the GLUE benchmark before fine-tuning on each GLUE task in turn.
I'm willing to write bits/most of this; I just need some guidance on the interface and other library details so I can integrate it properly.