FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations (NAACL 2022)

This repository contains the code for the paper "FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations".

FactGraph is an adapter-based method for assessing factuality that decomposes the document and the summary into structured meaning representations (MR):

In FactGraph, summary and document graphs are encoded by a graph encoder with structure-aware adapters, along with text representations using an adapter-based text encoder. Text and graph encoders use the same pretrained model and only the adapters are trained:

Environment

The easiest way to proceed is to create a conda environment:

conda create -n factgraph python=3.7
conda activate factgraph

Further, install PyTorch and PyTorch Geometric:

pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install torch-scatter==2.0.9 -f https://data.pyg.org/whl/torch-1.9.0+cu111.html
pip install torch-sparse==0.6.12 -f https://data.pyg.org/whl/torch-1.9.0+cu111.html
pip install torch-geometric==2.0.3

Install the packages required:

pip install -r requirements.txt

Finally, create the environment for AMR preprocessing:

cd data/preprocess
./create_envs_preprocess.sh
cd ../../

FactCollect Dataset

FactCollect is created consolidating the following datasets:

Dataset	Datapoints
Wang et al. (2020)	953	Link
Kryscinski et al. (2020)	1434	Link
Maynez et al. (2020)	2500	Link
Pagnoni et al. (2021)	4942	Link

For generating FactCollect dataset, execute:

conda activate factgraph
cd data
./create_dataset.sh
cd ..

Training FactGraph

Preprocess

Convert the dataset into the format required for the model:

cd data/preprocess
./process_dataset_for_model.sh <gpu_id>
cd ../../

This step generated AMR graphs using the SPRING model. Check their repository for more details.

Training

For training FactGraph using the FactCollect dataset, execute:

conda activate factgraph
./train.sh <gpu_id>

Predicting

For predicting, run:

./predict.sh <checkpoint_folder> <gpu_id>

Training FactGraph - Edge-level

Preprocess

Download the files train.tsv and test.tsv from this link provided by Goyal and Durrett (2021). Copy those files to data\edge_level_data

Convert the dataset into the format required for the model:

cd data/preprocess
./process_dataset_for_edge_model.sh <gpu_id>
cd ../../

Training

For training FactGraph using the FactCollect dataset, execute:

conda activate factgraph
./train_edgelevel.sh <gpu_id>

Predicting

For predicting, run:

./predict_edgelevel.sh <checkpoint_folder> <gpu_id>

Security

See CONTRIBUTING for more information.

License Summary

The documentation is made available under the Creative Commons Attribution-ShareAlike 4.0 International License. See the LICENSE file.

The sample code within this documentation is made available under the MIT-0 license. See the LICENSE-SAMPLECODE file.

Citation

@inproceedings{ribeiro-etal-2022-factgraph,
    title = "FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations",
    author = "Ribeiro, Leonardo F. R.  and
      Liu, Mengwen  and
      Gurevych, Iryna and
      Dreyer Markus and
      Bansal, Mohit",
      booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
      year={2022}
}

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
.github		.github
data		data
images		images
src		src
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
LICENSE-SAMPLECODE		LICENSE-SAMPLECODE
LICENSE-SUMMARY		LICENSE-SUMMARY
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations (NAACL 2022)

Environment

FactCollect Dataset

Training FactGraph

Preprocess

Training

Predicting

Training FactGraph - Edge-level

Preprocess

Training

Predicting

Security

License Summary

Citation

About

Releases

Packages

Languages

License

osmalpkoras/fact-graph

Folders and files

Latest commit

History

Repository files navigation

FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations (NAACL 2022)

Environment

FactCollect Dataset

Training FactGraph

Preprocess

Training

Predicting

Training FactGraph - Edge-level

Preprocess

Training

Predicting

Security

License Summary

Citation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages