Python implementations of several text-as-data methods, benchmarked on publicly available datasets. Below is a brief description of the data and the main methods demonstrated.
For very large corpora, an interesting alternative to the implementation in this repository is flashtext.
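
To give a rough idea of why a trie-based matcher scales well to large keyword lists, here is a minimal standard-library sketch. It is not flashtext's code or API, and it is not the implementation from this repository; the names `build_trie` and `extract_keywords` and the example keywords are hypothetical and chosen only for illustration. The text is scanned once, token by token, so the cost depends mainly on the length of the text rather than on the number of keywords.

```python
import re

# Unique sentinel marking the end of a keyword; it can never collide with a token.
_END = object()


def build_trie(keywords):
    """Build a token-level trie from a {phrase: canonical_name} mapping."""
    trie = {}
    for phrase, canonical in keywords.items():
        node = trie
        for token in phrase.lower().split():
            node = node.setdefault(token, {})
        node[_END] = canonical
    return trie


def extract_keywords(text, trie):
    """Return canonical names of the longest keyword matches, in text order."""
    tokens = re.findall(r"\w+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        node = trie
        match = None
        match_end = i
        j = i
        # Walk the trie as far as the tokens allow, remembering the longest match.
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if _END in node:
                match, match_end = node[_END], j
        if match is not None:
            found.append(match)
            i = match_end  # continue after the matched phrase
        else:
            i += 1
    return found


# Hypothetical keyword dictionary: surface phrase -> canonical label.
keywords = {
    "big apple": "New York",
    "new york": "New York",
    "bay area": "San Francisco",
}
trie = build_trie(keywords)
print(extract_keywords("I love the Big Apple and the Bay Area.", trie))
```

Matching on whole tokens means that "new yorker" does not trigger the "new york" keyword. A production library handles many more cases (punctuation inside keywords, replacement, case sensitivity), so treat this only as an illustration of the idea.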
Zero-shot and no fine-tuning.
Llama 3