iFᴀᴄᴇᴛSᴜᴍ

iFᴀᴄᴇᴛSᴜᴍ is an interactive faceted summarization approach and system for navigating within a large document-set on a topic.

Paper 📄 https://arxiv.org/pdf/2109.11621.pdf (Proceedings of EMNLP 2021, System Demonstrations)
Demo 🤩 https://nlp.biu.ac.il/~hirsche5/ifacetsum/

Development

How to run

First, git clone the project.

Set up the server

Run pip install -r requirements.txt
Run python -m spacy download en_core_web_md
In a Python interpreter, run import nltk and then nltk.download('punkt')
Run python WebApp/server/app.py
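
If the server fails to start, it can help to confirm that the install steps above actually took effect. The following is a minimal sketch, not part of the project: the module list is an assumption made for illustration (requirements.txt remains the authoritative list), and the helper name missing_modules is hypothetical.

```python
import importlib.util

# Top-level modules the setup steps above are expected to provide.
# This list is an assumption for illustration; requirements.txt is authoritative.
EXPECTED_MODULES = ["spacy", "nltk", "en_core_web_md"]


def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

This only checks that the packages are importable. It does not verify that the NLTK punkt data was downloaded.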

Set up the client (node)

Run cd WebApp/client
Run npm install
Run npm start
Open the url http://localhost:3000

How to work with DUC2006 data

Request access to DUC2006Clean from https://duc.nist.gov/ and place it inside the data/ directory.

How to add your own data

Change Config.py to point to your data directory, including the text files and the cluster files (either json or conll format).
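
Before editing Config.py, a quick look at what the data directory contains can catch missing files early. The sketch below is only an illustration under stated assumptions: it assumes text files end in .txt and that all files sit directly under one directory, neither of which is confirmed by this README, and summarize_data_dir is a hypothetical helper rather than project code.

```python
from pathlib import Path

# Extensions follow the README: text files plus json or conll cluster files.
# The ".txt" extension and the flat directory layout are assumptions.
CLUSTER_EXTENSIONS = {".json", ".conll"}


def summarize_data_dir(data_dir):
    """Split the files directly under data_dir into text and cluster files."""
    root = Path(data_dir)
    files = sorted(p for p in root.iterdir() if p.is_file())
    return {
        "text": [p.name for p in files if p.suffix == ".txt"],
        "clusters": [p.name for p in files if p.suffix in CLUSTER_EXTENSIONS],
    }
```

Calling summarize_data_dir("data/my_topic") with a directory of your own (the path here is a placeholder) returns two lists of file names that can be compared against what Config.py is expected to point to.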

How to create your own clusters

To support reproducibility and the addition of custom document sets, all models used have been released and are available online.

CD Event Co-reference Alignment

Create event mentions using the models and scripts in https://github.com/ariecattan/event_extractor.
Create pairwise mention scores and clusters using CDLM https://github.com/aviclu/CDLM.
Use agglomerative clustering to combine mentions into clusters.
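
The clustering step above can be sketched in plain Python. This is a minimal illustration, assuming average linkage over similarity scores and a fixed stopping threshold. The README does not state which linkage or threshold the project uses, and in practice a library implementation (for example scikit-learn's AgglomerativeClustering) would be the more likely choice. The function name and the score format are hypothetical.

```python
from itertools import combinations


def agglomerative_clusters(mentions, scores, threshold):
    """Average-linkage agglomerative clustering over pairwise similarity scores.

    mentions:  list of hashable mention ids
    scores:    dict mapping frozenset({a, b}) to a similarity in [0, 1];
               missing pairs count as 0.0
    threshold: merging stops when no cluster pair reaches this average similarity
    """
    clusters = [[m] for m in mentions]

    def linkage(c1, c2):
        total = sum(scores.get(frozenset((a, b)), 0.0) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        # Merge cluster j into cluster i (j > i, so deleting j is safe).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy scores, invented for illustration only.
toy_scores = {
    frozenset(("attack", "bombing")): 0.9,
    frozenset(("election", "vote")): 0.8,
    frozenset(("attack", "election")): 0.1,
}
print(agglomerative_clusters(
    ["attack", "bombing", "election", "vote"], toy_scores, threshold=0.5
))
# → [['attack', 'bombing'], ['election', 'vote']]
```

Each pass recomputes all cluster-pair similarities, which is fine for a toy example but too slow for large mention sets.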

CD Entities Co-reference Alignment

For the end-to-end iFᴀᴄᴇᴛSᴜᴍ entities script, which implements the steps in this section, refer to https://github.com/AlonEirew/wd-plus-srl-extraction#wec-cd-coreference

Create entity mentions using SpanBERT, accessible from https://docs.allennlp.org/models/main/.
Use the WEC model to score each mention pair.
Use agglomerative clustering to combine within-document (WD) and cross-document (CD) mentions into clusters.

Proposition Alignment

Please refer to https://github.com/oriern/SuperPAL for instructions on extracting propositions using OIE and computing pairwise scores.
iFᴀᴄᴇᴛSᴜᴍ's code takes care of converting the pairwise CSV from SuperPAL into clusters.
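
To give a rough idea of what turning pairwise scores into clusters can look like, here is a sketch that links every pair scoring at or above a threshold and returns the connected components. It is an assumption-laden illustration, not the project's conversion code: the column names id1, id2 and score, the threshold of 0.5, and the connected-components strategy are all hypothetical, and the real SuperPAL CSV schema may differ.

```python
import csv
import io


def clusters_from_pairwise_csv(csv_file, threshold=0.5):
    """Group ids into connected components over pairs scoring >= threshold.

    The column names "id1", "id2" and "score" are hypothetical placeholders.
    """
    parent = {}

    def find(x):
        # Union-find lookup; registers unseen ids as their own root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for row in csv.DictReader(csv_file):
        a, b = find(row["id1"]), find(row["id2"])
        if float(row["score"]) >= threshold and a != b:
            parent[b] = a

    groups = {}
    for item in parent:
        groups.setdefault(find(item), []).append(item)
    return sorted(sorted(group) for group in groups.values())


# Invented sample rows, for illustration only.
sample = io.StringIO("id1,id2,score\np1,p2,0.9\np2,p3,0.7\np4,p5,0.2\n")
print(clusters_from_pairwise_csv(sample))
# → [['p1', 'p2', 'p3'], ['p4'], ['p5']]
```

Connected components merge transitively, so one spurious high-scoring pair can join two otherwise unrelated clusters. A stricter linkage criterion avoids that at the cost of more computation.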

Citation:

If you find our work useful, please cite the paper as:

@article{hirsch2021ifacetsum,
  title={iFacetSum: Coreference-based Interactive Faceted Summarization for Multi-Document Exploration},
  author={Hirsch, Eran and Eirew, Alon and Shapira, Ori and Caciularu, Avi and Cattan, Arie and Ernst, Ori and Pasunuru, Ramakanth and Ronen, Hadar and Bansal, Mohit and Dagan, Ido},
  journal={Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
  year={2021}
}
