The following is a collection of work related to free-text natural language explanations, also called abstractive rationales.
Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing
Local Interpretations for Explainable Natural Language Processing: A Survey
Interpreting Deep Learning Models in Natural Language Processing: A Review
WT5?! Training Text-to-Text Models to Explain their Predictions https://arxiv.org/pdf/2004.14546.pdf
Generating Counterfactual Explanations with Natural Language
Explaining Question Answering Models through Text Generation
Unsupervised Commonsense Question Answering with Self-Talk
Rationale-Inspired Natural Language Explanations with Commonsense
Exploiting Rationale Data for Explainable NLP Models
Few-Shot Self-Rationalization with Natural Language Prompts
Cross-Domain Transfer of Generative Explanations Using Text-to-Text Models
Prompting Contrastive Explanations for Commonsense Reasoning Tasks
Story Generation with Commonsense Knowledge Graphs and Axioms
NILE: Natural Language Inference with Faithful Natural Language Explanations
LIREx: Augmenting Language Inference with Relevant Explanation
Generate Natural Language Explanations for Recommendation
e-SNLI: Natural Language Inference with Natural Language Explanations
Explain Yourself! Leveraging Language Models for Commonsense Reasoning
QED: A Framework and Dataset for Explanations in Question Answering
ERASER: A Benchmark to Evaluate Rationalized NLP Models
A Study of Automatic Metrics for the Evaluation of Natural Language Explanations
Explainable Natural Language Processing, Anders Søgaard
Explainability for Natural Language Processing, lecture-style tutorial at SIGKDD 2021