[ACL'21 Findings] Why Machine Reading Comprehension Models Learn Shortcuts?

luciusssss/why-learn-shortcut

Why Machine Reading Comprehension Models Learn Shortcuts?

Repo for 'Why Machine Reading Comprehension Models Learn Shortcuts?', Findings of ACL 2021

Arxiv Preprint: https://arxiv.org/abs/2106.01024

Data

The synthetic datasets proposed in our paper are in the dataset.zip file.

Size

Dataset                  Train (Pairs)  Dev (Pairs)
Question Word Matching   6306           766
Simple Matching          7562           952

Format

Each dataset follows the SQuAD format, shown below. The qtype key of each question records its version: challenging or shortcut.

{
  "version": "simple_matching_dev_challenging",  // dataset version
  "data": [
    {
      "paragraphs": [
        {
          "cid": "squad1.1-dev-1_h",  // context id
          "context": "As this was the 50th Super Bowl...",
          "qas": [
            {
              "id": "56be8e613aeaaa14008c90d2_h",  // question id
              "question": "What day was the game played on?",
              "answers": [
                {
                  "text": "February 7, 2016",
                  "answer_start": 505  // the start position of the answer in the context
                }
              ],
              "qtype": "challenging"  // question type (challenging/shortcut)
            }
          ]
        }
      ]
    }
  ]
}
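
The nested layout above can be flattened into one record per question and grouped by qtype. The following is a minimal sketch, not code from this repo: the helper names (iter_questions, group_by_qtype, answer_span_matches) and the toy sample are made up for illustration, and it assumes the files inside dataset.zip are plain JSON (the // comments above being annotations only).

```python
import json
from collections import defaultdict


def iter_questions(dataset):
    """Yield one flat record per question from a SQuAD-style dict."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                yield {
                    "cid": paragraph.get("cid"),
                    "id": qa["id"],
                    "question": qa["question"],
                    "answers": qa["answers"],
                    "qtype": qa.get("qtype"),
                    "context": paragraph["context"],
                }


def group_by_qtype(dataset):
    """Group flat question records by their qtype value."""
    groups = defaultdict(list)
    for record in iter_questions(dataset):
        groups[record["qtype"]].append(record)
    return dict(groups)


def answer_span_matches(record):
    """Check that every answer_start points at the answer text in the context."""
    context = record["context"]
    return all(
        context[a["answer_start"]:a["answer_start"] + len(a["text"])] == a["text"]
        for a in record["answers"]
    )


# Toy data invented for this sketch; it is NOT taken from dataset.zip.
_CONTEXT = "The game was played on February 7, 2016."
_ANSWER = "February 7, 2016"
sample = {
    "version": "toy_example",
    "data": [{"paragraphs": [{
        "cid": "toy-1",
        "context": _CONTEXT,
        "qas": [
            {"id": "q1", "question": "What day was the game played on?",
             "answers": [{"text": _ANSWER, "answer_start": _CONTEXT.index(_ANSWER)}],
             "qtype": "challenging"},
            {"id": "q2", "question": "When was the game played?",
             "answers": [{"text": _ANSWER, "answer_start": _CONTEXT.index(_ANSWER)}],
             "qtype": "shortcut"},
        ],
    }]}],
}

groups = group_by_qtype(sample)
counts = {qtype: len(records) for qtype, records in groups.items()}
```

To run this on a real file, replace sample with json.load(f) on a file extracted from dataset.zip; the file names inside the archive are not listed here, so the path is left to the reader.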
