SciTLDR

This repository contains the dataset, model weights, and generation code for our paper "TLDR: Extreme Summarization of Scientific Documents".

Demo

A running demo of our model can be found here.

Requirements

We use Fairseq to train and evaluate our models. To install all requirements, run pip install -r requirements.txt

For the evaluation, you will need files2rouge. Please install my fork of the repo.

Model Weights

bart.large.xsum.multitask-A

bart.large.xsum.multitask-AIC

Data Preprocessing

In order to format the data to work for the Fairseq library, run:

$ cd SciTLDR-Data
$ export TASK=SciTLDR-A # Choose from {A, AIC, FullText}
$ python to_stories.py $TASK # Convert to story format
$ chmod +x make_datafiles.sh
$ ./make_datafiles.sh # BPE preprocess
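
To preprocess several tasks in one go, the same steps can be scripted. The following is a minimal sketch and not part of the repository: the helper name `preprocessing_commands` is hypothetical, the task name `SciTLDR-FullText` is inferred from the comment above, and the sketch assumes `make_datafiles.sh` reads the task from the `TASK` environment variable, as the `export` line suggests. It only builds the commands; the line that would run them is left commented out.

```python
import os

# Task names inferred from the shell comment above (an assumption for FullText).
TASKS = ("SciTLDR-A", "SciTLDR-AIC", "SciTLDR-FullText")


def preprocessing_commands(task):
    """Return (env, commands) mirroring the shell steps for one task."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    # Equivalent of `export TASK=...` for the child processes.
    env = dict(os.environ, TASK=task)
    commands = [
        ["python", "to_stories.py", task],     # convert to story format
        ["chmod", "+x", "make_datafiles.sh"],  # make the script executable
        ["./make_datafiles.sh"],               # BPE preprocess
    ]
    return env, commands


env, commands = preprocessing_commands("SciTLDR-A")
for cmd in commands:
    print(" ".join(cmd))
    # To actually run the step from the repository root:
    # subprocess.run(cmd, cwd="SciTLDR-Data", env=env, check=True)
```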

Evaluation

The evaluation script reads a test.source file, in which each line is one input, and writes its predictions to a test.hypo file. It loads a test.jsonl file as the reference and stores the ROUGE score in test.hypo.score.

$ python evaluate.py SciTLDR-Data/SciTLDR-A /path/to/model/dir/ --checkpoint_file scitldr_ao_model.pt --beam 4 --lenpen 0.6

OR

$ python evaluate.py SciTLDR-Data/SciTLDR-AIC /path/to/model/dir/ --checkpoint_file scitldr_aic_model.pt --beam 2 --lenpen 0.2
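
The two commands above differ only in the data directory, checkpoint file, beam size, and length penalty. As a rough sketch, those settings can be kept in one table and the command line assembled from it. The names `EVAL_SETTINGS` and `build_eval_command` are hypothetical and not part of the repository; the values are copied from the commands above, and the model directory is the same placeholder path.

```python
# Decoding settings copied from the two example commands above.
EVAL_SETTINGS = {
    "SciTLDR-A": {"checkpoint_file": "scitldr_ao_model.pt", "beam": 4, "lenpen": 0.6},
    "SciTLDR-AIC": {"checkpoint_file": "scitldr_aic_model.pt", "beam": 2, "lenpen": 0.2},
}


def build_eval_command(task, model_dir, data_root="SciTLDR-Data"):
    """Assemble the evaluate.py argument list for one task (hypothetical helper)."""
    settings = EVAL_SETTINGS[task]
    return [
        "python", "evaluate.py",
        f"{data_root}/{task}",  # directory holding test.source and test.jsonl
        model_dir,              # directory holding the checkpoint file
        "--checkpoint_file", settings["checkpoint_file"],
        "--beam", str(settings["beam"]),
        "--lenpen", str(settings["lenpen"]),
    ]


print(" ".join(build_eval_command("SciTLDR-A", "/path/to/model/dir/")))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root once the placeholder path is replaced with a real model directory.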

Citing

If you use our code, dataset, or model weights in your research, please cite "TLDR: Extreme Summarization of Scientific Documents."

@article{cachola2020tldr,
  title={{TLDR}: Extreme Summarization of Scientific Documents},
  author={Isabel Cachola and Kyle Lo and Arman Cohan and Daniel S. Weld},
  journal={arXiv:2004.15011},
  year={2020},
}

SciTLDR is an open-source project developed by the Allen Institute for Artificial Intelligence (AI2). AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering.
