The Hub of Computational Argumentation in the Era of LLMs: a collection of surveys, papers, datasets, benchmarks, and evaluations of commonly used LLMs on computational argumentation tasks.



Benchmarks & Datasets

Date Paper Publication
2024-08 DebateQA: Evaluating Question Answering on Debatable Knowledge Arxiv
2024-06 Assessing Good, Bad and Ugly Arguments Generated by ChatGPT: a New Dataset, its Methodology and Associated Tasks EPIA 2023
2024-06 Which Side Are You On? A Multi-task Dataset for End-to-End Argument Summarisation and Evaluation ACL 2024
2024-06 OpenDebateEvidence: A Massive-Scale Argument Mining and Summarization Dataset ACL 2024
2024-02 Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements ACL 2024
2023-12 FREDSum: A Dialogue Summarization Corpus for French Political Debates EMNLP 2023
2023-11 Automatic Analysis of Substantiation in Scientific Peer Reviews EMNLP 2023
2022-06 QT30: A Corpus of Argument and Conflict in Broadcast Debate ACL 2022
2022-04 Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions ACL 2022
2022-03 IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks ACL
2021-02 SummEval: Re-evaluating Summarization Evaluation TACL
2021-01 Learning From Revisions: Quality Assessment of Claims in Argumentation at Scale EACL 2021
2020-12 Transformer-Based Argument Mining for Healthcare Applications ECAI 2020
2020-10 Detecting Attackable Sentences in Arguments EMNLP 2020
2020-10 Unsupervised Expressive Rules Provide Explainability and Assist Human Experts Grasping New Domains EMNLP
2020-06 Rhetoric, Logic, and Dialectic: Advancing Theory-based Argument Quality Assessment in Natural Language Processing COLING
2020-05 USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation ACL 2020
2019-09 A Dataset of General-Purpose Rebuttal EMNLP 2019
2019-06 Exploring the Role of Prior Beliefs for Argument Persuasion NAACL 2018
2019-06 A Corpus for Modeling User and Language Effects in Argumentation on Online Debating ACL
2018-02 Cross-topic Argument Mining from Heterogeneous Sources Using Attention-based Neural Networks EMNLP
2017-04 Recognizing Insufficiently Supported Arguments in Argumentative Essays EACL
2016-04 Parsing Argumentation Structures in Persuasive Essays Computational Linguistics


Date Paper Publication
2023-11 Exploring the Potential of Large Language Models in Computational Argumentation ACL


Argument Mining

Date Paper Publication
2024-07 Argument Mining in Data Scarce Settings: Cross-lingual Transfer and Few-shot Techniques ACL 2024
2024-06 In-Context Learning and Fine-Tuning GPT for Argument Mining Arxiv
2024-05 WIBA: What Is Being Argued? A Comprehensive Approach to Argument Mining ASONAM
2024-05 DMON: A Simple yet Effective Approach for Argument Structure Learning COLING 2024
2024-04 Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning NAACL 2024
2024-04 A School Student Essay Corpus for Analyzing Interactions of Argumentative Structure and Quality NAACL 2024
2024-04 TACO -- Twitter Arguments from COnversations Arxiv
2024-02 Can Large Language Models perform Relation-based Argument Mining? ACL 2024
2024-01 End-to-End Argument Mining over Varying Rhetorical Structures Arxiv
2023-12 Hi-ArG: Exploring the Integration of Hierarchical Argumentation Graphs in Language Pretraining EMNLP 2023
2023-10 Overview of ImageArg-2023: The First Shared Task in Multimodal Argument Mining EMNLP
2023-10 TILFA: A Unified Framework for Text, Image, and Layout Fusion in Argument Mining EMNLP 2023
2023-06 Detecting Check-Worthy Claims in Political Debates, Speeches, and Interviews Using Audio Data ICASSP
2023-05 AQE: Argument Quadruplet Extraction via a Quad-Tagging Augmented Generative Approach ACL 2023
2023-02 VivesDebate-Speech: A Corpus of Spoken Argumentation to Leverage Audio Features for Argument Mining EMNLP 2023
2022-09 Perturbations and Subpopulations for Testing Robustness in Token-Based Argument Unit Recognition COLING 2022
2022-09 ImageArg: A Multi-modal Tweet Dataset for Image Persuasiveness Mining COLING
2022-05 A Holistic Framework for Analyzing the COVID-19 Vaccine Debate NAACL 2022
2022-04 Echoes through Time: Evolution of the Italian COVID-19 Vaccination Debate AAAI
2022-03 Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining? ACL 2022

Argument Generation

Date Paper Publication
2024-06 Persuasiveness of Generated Free-Text Rationales in Subjective Decisions: A Case Study on Pairwise Argument Ranking Arxiv
2023-12 Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation EMNLP 2023
2023-10 From Values to Opinions: Predicting Human Behaviors and Stances Using Value-Injected Large Language Models EMNLP 2023
2023-09 Claim Optimization in Computational Argumentation INLG 2023
2023-07 DebateKG: Automatic Policy Debate Case Creation with Semantic Knowledge Graphs EMNLP 2023
2023-01 Conclusion-based Counter-Argument Generation EACL 2023
2022-10 MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective EMNLP 2022
2022-05 RSTGen: Imbuing Fine-Grained Interpretable Control into Long-FormText Generators NAACL 2022
2022-03 The Moral Debater: A Study on the Computational Generation of Morally Framed Arguments ACL 2022

Quality Assessment

Date Paper Publication
2024-06 Assessing Good, Bad and Ugly Arguments Generated by ChatGPT: a New Dataset, its Methodology and Associated Tasks EPIA 2023
2024-04 Can Language Models Recognize Convincing Arguments? Arxiv
2024-03 Argument Quality Assessment in the Age of Instruction-Following Large Language Models COLING 2024
2023-11 Automatic Analysis of Substantiation in Scientific Peer Reviews EMNLP 2023
2023-05 Contextualizing Argument Quality Assessment with Relevant Knowledge NAACL 2024
2023-01 Conclusion-based Counter-Argument Generation EACL 2023
2022-12 Claim Optimization in Computational Argumentation INLG 2023
2022-03 Automatic Debate Evaluation with Argumentation Semantics and Natural Language Argument Graph Networks EMNLP 2023
2021-10 Assessing the Sufficiency of Arguments through Conclusion Generation EMNLP 2021
2020-12 Rhetoric, Logic, and Dialectic: Advancing Theory-based Argument Quality Assessment in Natural Language Processing COLING 2020
2020-10 Exploring the Role of Argument Structure in Online Debate Persuasion EMNLP 2020
2019-09 A Large-scale Dataset for Argument Quality Ranking: Construction and Analysis AAAI 2020
2019-09 Automatic Argument Quality Assessment -- New Datasets and Methods EMNLP 2019

Debate for LLMs

Date Paper Publication
2024-08 Can LLMs Beat Humans in Debating? A Dynamic Multi-agent Framework for Competitive Debate Arxiv
2024-06 An Empirical Analysis on Large Language Models in Debate Evaluation ACL 2024
2024-05 DEBATE: Devil's Advocate-Based Assessment and Text Evaluation Arxiv
2024-03 Debatrix: Multi-dimensional Debate Judge with Iterative Chronological Analysis Based on LLM ACL 2024
2024-03 A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning ACM Multimedia
2024-02 Debating with More Persuasive LLMs Leads to More Truthful Answers ICML 2024
2024-02 Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements ACL 2024
2024-01 Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate Arxiv
2024-01 Combating Adversarial Attacks with Multi-Agent Debate Arxiv
2024-01 Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models ACM Web
2023-12 Learning to Break: Knowledge-Enhanced Reasoning in Multi-Agent Debate System Arxiv
2023-12 Recourse under Model Multiplicity via Argumentative Ensembling (Technical Report) AAMAS 2024
2023-11 Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs Arxiv
2023-11 Debate Helps Supervise Unreliable Experts Arxiv
2023-11 Scalable AI Safety via Doubly-Efficient Debate Arxiv
2023-10 From Values to Opinions: Predicting Human Behaviors and Stances Using Value-Injected Large Language Models EMNLP 2023
2023-10 Let Models Speak Ciphers: Multiagent Debate through Embeddings ICLR 2024
2023-08 ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate ICLR 2024
2023-05 Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate EMNLP 2023
2023-05 Improving Factuality and Reasoning in Language Models through Multiagent Debate ICML 2024
2023-05 Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate Arxiv
2023-05 Examining Inter-Consistency of Large Language Models Collaboration: An In-depth Analysis via Debate EMNLP 2023
2022-10 The Debate Over Understanding in AI's Large Language Models Arxiv
2022-03 The Moral Debater: A Study on the Computational Generation of Morally Framed Arguments ACL 2022
2021-10 Project Debater APIs: Decomposing the AI Grand Challenge EMNLP 2021


