PASRL

Proposition Acquisition with SRL

Why do this?

Co-occurrence-based methods have yielded very useful results for analyzing social science corpora. For a negotiation corpus, however, it matters not only which concepts or actors co-occur within a given text window, but also where those actors stand with respect to each other and to those concepts. For instance:

  • Who's opposing whom?
  • Who's siding with whom?
  • About what issues?

A Natural Language Processing (NLP) pipeline was applied to climate negotiation reports from the Earth Negotiations Bulletin, which covers climate summits where treaties like the Kyoto Protocol or the Paris Agreement were negotiated.

Actors, their concerns, and their relations to other actors were identified based on the outputs of the NLP pipeline.

The results can be explored through a user interface.

System Architecture

System Workflow Diagram

The app consists of two projects. A short description follows; see each project's directory for details:

  • proposition_extraction
    • Workflow to extract propositions, i.e. triples consisting of a speaker, their message, and the predicate (reporting expression) relating the two. An NLP pipeline, based on IXA Pipes, provides Semantic Role Labeling, dependency parsing and coreference chains. Based on the NLP output, propositions are identified with rules.
    • The propositions' messages are enriched with NLP-based metadata: keyphrases, generic entities from DBpedia, and domain-specific entities from a climate thesaurus.
    • Basic inference is performed to find actors who agree or disagree with other actors, and over which issues, based on the propositions and their NLP-based metadata.
    • The propositions and metadata are made navigable in the corpus_navigation project.
  • corpus_navigation
    • This project is a Django app to navigate the Earth Negotiations Bulletin corpus, enriched with propositions (i.e. triples of the shape actor, predicate, message) and with metadata extracted from the messages. All triples and metadata are made navigable.
    • Besides a structured search based on the proposition extraction workflow, full-text search with Solr is provided.
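
To make the proposition and inference steps above more concrete, here is a minimal Python sketch of how an (actor, predicate, message) triple could be represented and how agreement or disagreement between actors might be derived from it. Everything in it is an assumption made for illustration: the `Proposition` class, the `STANCE` table, the `infer_relations` function, and the rule that two actors agree on an issue when their reporting predicates carry the same polarity. The project's actual rules and data model live in the proposition_extraction directory and may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations

# Hypothetical mapping from reporting predicates to a stance polarity.
# The real predicate inventory used by the project is not listed in this README.
STANCE = {"supported": 1, "proposed": 1, "opposed": -1, "rejected": -1}


@dataclass(frozen=True)
class Proposition:
    """An (actor, predicate, message) triple plus illustrative metadata."""
    actor: str
    predicate: str
    message: str
    keyphrases: tuple = ()  # stands in for the NLP-based metadata on the message


def infer_relations(propositions):
    """Return {(actor_a, actor_b, issue): "agree" | "disagree"}.

    Assumed rule: actors agree on an issue (here, a keyphrase) when their
    predicates have the same polarity, and disagree otherwise. If an actor
    has several propositions on one issue, the last one wins.
    """
    stances = defaultdict(dict)  # issue -> {actor: polarity}
    for prop in propositions:
        polarity = STANCE.get(prop.predicate)
        if polarity is None:
            continue  # neutral or unknown predicate: no stance inferred
        for issue in prop.keyphrases:
            stances[issue][prop.actor] = polarity

    relations = {}
    for issue, by_actor in stances.items():
        for a, b in combinations(sorted(by_actor), 2):
            same = by_actor[a] == by_actor[b]
            relations[(a, b, issue)] = "agree" if same else "disagree"
    return relations


if __name__ == "__main__":
    # Made-up propositions with placeholder actors, not taken from the corpus.
    example = [
        Proposition("Party A", "supported", "a market mechanism", ("emissions trading",)),
        Proposition("Party B", "opposed", "any market mechanism", ("emissions trading",)),
    ]
    for key, relation in infer_relations(example).items():
        print(key, relation)
```

Keying the stances by issue first keeps the pairwise comparison limited to actors who actually spoke about the same topic, which matches the "about what issues?" question raised at the top of this README.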

Publications