
Towards Reference-free Text Simplification Evaluation with a BERT Siamese Network Architecture

This is the GitHub repo for our ACL 2023 paper "Towards Reference-free Text Simplification Evaluation with a BERT Siamese Network Architecture" (https://aclanthology.org/2023.findings-acl.838).

Pre-trained Checkpoint

The original checkpoint is no longer available due to a cluster shutdown. The author quickly re-trained small models, available at this link: BERT-base/large Models. The link was last updated in August 2024.

These models replicate performance similar to that reported in the original paper on SemEval 2012 and Simplicity-DA.

Feel free to email me if you need a higher-performance model.

Usage

The ranker is in comparative_complexity.py. It takes two tokenized words/phrases as input and outputs a score from -1 to 1, which, as shown in Table 1 of the paper, ranges from complicating (-1) to simplifying (1).

An example is as follows (the imports and tokenizer choice are assumptions; use whatever matches the checkpoint you downloaded):

import torch
from transformers import BertTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Assumed tokenizer; use the one the checkpoint was trained with.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Load the re-trained ranker checkpoint (see the link above).
ranker = torch.load("./bert-base-all.ckpt", map_location=device)
ranker.eval()

pairs = [["currently", "now"], ["resolve", "solve"], ["phones", "telephones"]]

for pair in pairs:
    uids = tokenizer.encode(pair[1], add_special_tokens=True, return_tensors='pt').to(device)
    cids = tokenizer.encode(pair[0], add_special_tokens=True, return_tensors='pt').to(device)

    # Scores near 1 mean pair[1] simplifies pair[0]; near -1 means it complicates.
    prediction = ranker(cids, uids)
    print(prediction.cpu().detach().tolist())

The full pipeline for computing BETS is in metric.py, which combines the P_simp and R_meaning scores.

The inputs for the P_simp/R_meaning pipeline are ./dataset/wikilarge.json (a list of sentences to be simplified) and ./dataset/final_output.json (a dictionary of lists of simplification system outputs).
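
For reference, here is a minimal sketch of how these files might be consumed. The file layout follows the description above; p_simp, r_meaning, and the simple per-sentence sum and averaging are hypothetical placeholders for the actual logic in metric.py:

import json

# Load the inputs described above.
with open("./dataset/wikilarge.json") as f:
    sources = json.load(f)          # list of source sentences
with open("./dataset/final_output.json") as f:
    system_outputs = json.load(f)   # dict: system name -> list of outputs

# Hypothetical placeholders for the scorers implemented in metric.py.
def p_simp(source, output):
    return 0.0  # simplicity score from the Siamese ranker

def r_meaning(source, output):
    return 0.0  # meaning-preservation score

for system, outputs in system_outputs.items():
    # BETS combines the two scores per sentence; see metric.py for the
    # actual combination used in the paper.
    scores = [p_simp(s, o) + r_meaning(s, o) for s, o in zip(sources, outputs)]
    print(system, sum(scores) / len(scores))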

Citation

  @inproceedings{zhao-etal-2023-towards,
      title = "Towards Reference-free Text Simplification Evaluation with a {BERT} {S}iamese Network Architecture",
      author = "Zhao, Xinran  and
        Durmus, Esin  and
        Yeung, Dit-Yan",
      editor = "Rogers, Anna  and
        Boyd-Graber, Jordan  and
        Okazaki, Naoaki",
      booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
      month = jul,
      year = "2023",
      address = "Toronto, Canada",
      publisher = "Association for Computational Linguistics",
      url = "https://aclanthology.org/2023.findings-acl.838",
      doi = "10.18653/v1/2023.findings-acl.838",
      pages = "13250--13264",
  }

Others

If you have any other questions about this repo, you are welcome to open an issue or send me an email; I will respond as soon as possible.

Details about how to set up and run the code will be available soon.
