
Towards Reference-free Text Simplification Evaluation with a BERT Siamese Network Architecture

This is the GitHub repo for our ACL 2023 paper "Towards Reference-free Text Simplification Evaluation with a BERT Siamese Network Architecture" (https://aclanthology.org/2023.findings-acl.838).

Pre-trained Checkpoint

The original checkpoint is no longer available due to a cluster shutdown. The author quickly re-trained small models, available at this link: BERT-base/large Models. The link was last updated in August 2024.

These models replicate performance similar to that reported in the original paper on SemEval 2012 and Simplicity-DA.

Feel free to email me if you need a higher-performance model.

Usage

The ranker is in comparative_complexity.py. It takes two tokenized words/phrases as input and outputs a score from -1 to 1, which, as shown in Table 1 of the paper, ranges from complicating (-1) to simplifying (1).

An example is as follows (the imports and tokenizer choice are assumptions; use whatever matches the checkpoint you downloaded):

import torch
from transformers import BertTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Assumed tokenizer; use the one the checkpoint was trained with.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Load the re-trained ranker checkpoint (see the link above).
ranker = torch.load("./bert-base-all.ckpt", map_location=device)
ranker.eval()

pairs = [["currently", "now"], ["resolve", "solve"], ["phones", "telephones"]]

for pair in pairs:
    uids = tokenizer.encode(pair[1], add_special_tokens=True, return_tensors='pt').to(device)
    cids = tokenizer.encode(pair[0], add_special_tokens=True, return_tensors='pt').to(device)

    # Scores near 1 mean pair[1] simplifies pair[0]; near -1 means it complicates.
    prediction = ranker(cids, uids)
    print(prediction.cpu().detach().tolist())

The full pipeline for computing BETS is in metric.py, which combines the P_simp and R_meaning scores.

The inputs for the P_simp/R_meaning pipeline are ./dataset/wikilarge.json (a list of sentences to be simplified) and ./dataset/final_output.json (a dictionary of lists of simplification system outputs).
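
For reference, here is a minimal sketch of how these files might be consumed. The file layout follows the description above; p_simp, r_meaning, and the simple per-sentence sum and averaging are hypothetical placeholders for the actual logic in metric.py:

import json

# Load the inputs described above.
with open("./dataset/wikilarge.json") as f:
    sources = json.load(f)          # list of source sentences
with open("./dataset/final_output.json") as f:
    system_outputs = json.load(f)   # dict: system name -> list of outputs

# Hypothetical placeholders for the scorers implemented in metric.py.
def p_simp(source, output):
    return 0.0  # simplicity score from the Siamese ranker

def r_meaning(source, output):
    return 0.0  # meaning-preservation score

for system, outputs in system_outputs.items():
    # BETS combines the two scores per sentence; see metric.py for the
    # actual combination used in the paper.
    scores = [p_simp(s, o) + r_meaning(s, o) for s, o in zip(sources, outputs)]
    print(system, sum(scores) / len(scores))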

Citation

  @inproceedings{zhao-etal-2023-towards,
      title = "Towards Reference-free Text Simplification Evaluation with a {BERT} {S}iamese Network Architecture",
      author = "Zhao, Xinran  and
        Durmus, Esin  and
        Yeung, Dit-Yan",
      editor = "Rogers, Anna  and
        Boyd-Graber, Jordan  and
        Okazaki, Naoaki",
      booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
      month = jul,
      year = "2023",
      address = "Toronto, Canada",
      publisher = "Association for Computational Linguistics",
      url = "https://aclanthology.org/2023.findings-acl.838",
      doi = "10.18653/v1/2023.findings-acl.838",
      pages = "13250--13264",
  }

Others

If you have any other questions about this repo, you are welcome to open an issue or send me an email; I will respond as soon as possible.

Details about how to set up and run the code will be available soon.
